---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r setup, include = FALSE}
knitr::opts_chunk$set(
message = FALSE,
warning = FALSE,
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
set.seed(27)
```
# vsp
<!-- badges: start -->
[![Codecov test coverage](https://codecov.io/gh/RoheLab/vsp/branch/main/graph/badge.svg)](https://app.codecov.io/gh/RoheLab/vsp?branch=main)
[![CRAN status](https://www.r-pkg.org/badges/version/vsp)](https://CRAN.R-project.org/package=vsp)
[![R-CMD-check](https://github.com/RoheLab/vsp/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/RoheLab/vsp/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->
The goal of `vsp` is to enable fast, spectral estimation of latent factors in random dot product graphs. Under mild assumptions, the `vsp` estimator is consistent for (degree-corrected) stochastic blockmodels, (degree-corrected) mixed-membership stochastic blockmodels, and degree-corrected overlapping stochastic blockmodels.
More generally, the `vsp` estimator is consistent for random dot product graphs that can be written in the form
```
E(A) = Z B Y^T
```
where `Z` and `Y` satisfy the varimax assumptions of [1]. `vsp` works on directed and undirected graphs, and on weighted and unweighted graphs. Note that `vsp` is a semi-parametric estimator.
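To make the model concrete, here is a minimal base R sketch that simulates one graph with `E(A) = Z B Y^T`. It uses the plain stochastic blockmodel special case, where every row of `Z` and `Y` is a one-hot block indicator. The sizes `n`, `d`, `k`, the entries of `B`, and all variable names are illustrative assumptions, not values taken from `vsp` or from [1].

```r
set.seed(27)

n <- 200  # number of rows (e.g. sending nodes)
d <- 150  # number of columns (e.g. receiving nodes)
k <- 3    # number of latent factors (blocks)

# Hard block memberships: each row of Z and Y contains a single 1.
z_block <- sample(k, n, replace = TRUE)
y_block <- sample(k, d, replace = TRUE)
Z <- diag(k)[z_block, ]
Y <- diag(k)[y_block, ]

# Mixing matrix: dense connections within a block, sparse across blocks.
B <- matrix(0.02, k, k)
diag(B) <- 0.3

# Expected adjacency matrix, E(A) = Z B Y^T.
EA <- Z %*% B %*% t(Y)

# One directed, unweighted graph with independent Bernoulli edges.
# rbinom() and matrix() both work in column-major order, so entry (i, j)
# of A is drawn with probability EA[i, j].
A <- matrix(rbinom(n * d, size = 1, prob = EA), nrow = n, ncol = d)

dim(A)
mean(A)  # overall edge density
```

Replacing the one-hot rows of `Z` and `Y` with non-negative mixed memberships, or scaling rows by degree parameters, gives the other model families listed above.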
## Installation
You can install the released version of `vsp` from CRAN with:
``` r
install.packages("vsp")
```
You can install the development version of `vsp` with:
``` r
install.packages("devtools")
devtools::install_github("RoheLab/vsp")
```
## Example
Obtaining estimates from `vsp` is straightforward. We recommend representing networks as [`igraph`](https://igraph.org/r/) objects or sparse adjacency matrices using the [`Matrix`](https://cran.r-project.org/package=Matrix) package. Once you have your network in one of these formats, you can get estimates by calling the `vsp()` function. The result is a `vsp_fa` S3 object.
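If your network starts out as an edge list, one way to get a sparse adjacency matrix is `Matrix::sparseMatrix()`. The sketch below uses a small, made-up directed edge list; the data frame `edges` and its column names `from` and `to` are hypothetical and only serve the illustration.

```r
library(Matrix)

# Hypothetical directed edge list on 5 nodes (illustrative data only).
edges <- data.frame(
  from = c(1, 1, 2, 3, 4, 5),
  to   = c(2, 3, 3, 4, 5, 1)
)

n_nodes <- 5

# Entry (i, j) is 1 when there is an edge from node i to node j.
A <- sparseMatrix(
  i = edges$from,
  j = edges$to,
  x = rep(1, nrow(edges)),
  dims = c(n_nodes, n_nodes)
)

A
```

A matrix built this way can then be passed to `vsp()` in place of an `igraph` object. Real networks should be much larger than this toy example for the estimates to be meaningful.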
Here we demonstrate `vsp` on an `igraph` object, using the `enron` network from the `igraphdata` package. First we peek at the graph:
```{r}
library(igraph)
data(enron, package = "igraphdata")
image(sign(as_adjacency_matrix(enron, sparse = FALSE)))
```
Now we estimate:
```{r}
library(vsp)
fa <- vsp(enron, rank = 30)
fa
```
```{r}
get_varimax_z(fa)
```
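For intuition about where factor estimates like these come from, the core idea in [1] is a truncated SVD followed by varimax rotations of the singular vectors. The following is a rough base R sketch of that idea on a simulated blockmodel graph. It is not the package's implementation: it assumes no centering, degree normalization, or sign adjustment, all of which `vsp()` may apply, and names such as `Z_hat` and `R_U` are chosen here for illustration.

```r
set.seed(27)

# Simulated directed blockmodel graph with k blocks (illustrative only).
n <- 200
k <- 3
block <- sample(k, n, replace = TRUE)
B <- matrix(0.02, k, k)
diag(B) <- 0.3
P <- B[block, block]                       # P[i, j] = B[block[i], block[j]]
A <- matrix(rbinom(n * n, size = 1, prob = P), n, n)
d <- ncol(A)

# Step 1: rank-k truncated SVD, A ~ U D V^T.
s <- svd(A, nu = k, nv = k)
U <- s$u
V <- s$v
D <- diag(s$d[1:k])

# Step 2: varimax rotations of the left and right singular vectors.
R_U <- varimax(U, normalize = FALSE)$rotmat
R_V <- varimax(V, normalize = FALSE)$rotmat

# Step 3: rotate and rescale to obtain the factor estimates.
Z_hat <- sqrt(n) * U %*% R_U
Y_hat <- sqrt(d) * V %*% R_V
B_hat <- t(R_U) %*% D %*% R_V / sqrt(n * d)

# The rotations are orthogonal, so Z B Y^T reproduces the rank-k SVD.
recon_error <- max(abs(Z_hat %*% B_hat %*% t(Y_hat) - U %*% D %*% t(V)))
recon_error
```

The rotation does not change the rank-`k` approximation of `A`; it only picks the basis in which the columns of `Z_hat` and `Y_hat` are as sparse as possible in the varimax sense.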
To visualize a screeplot of the singular values, use:
```{r}
screeplot(fa)
```
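When reading a screeplot, a common heuristic is to look for a gap after which the singular values level off. The base R sketch below illustrates that heuristic on a toy matrix with three strong directions plus noise. The data and the ratio-based gap measure are assumptions made for illustration and are not part of `vsp`.

```r
set.seed(27)

# Toy matrix: rank-3 signal plus small Gaussian noise (illustrative only).
n <- 100
k <- 3
signal <- matrix(rnorm(n * k), n, k) %*% matrix(rnorm(k * n), k, n)
A <- signal + matrix(rnorm(n * n, sd = 0.1), n, n)

# svd() returns the singular values in decreasing order.
sv <- svd(A, nu = 0, nv = 0)$d

# Ratio of each singular value to the next one; a large ratio marks a gap.
gaps <- sv[-length(sv)] / sv[-1]

# Position of the largest gap among the leading singular values.
elbow <- which.max(gaps[1:10])
elbow

# Draw the screeplot itself when running interactively.
if (interactive()) {
  plot(sv[1:10], type = "b", xlab = "Index", ylab = "Singular value")
}
```

On real networks the gap is rarely this clean, so the screeplot is best treated as one diagnostic among several rather than a rule for choosing `rank`.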
At the moment, we also enjoy using pairs plots of the factors as a diagnostic measure:
```{r}
plot_varimax_z_pairs(fa, 1:5)
```
```{r}
plot_varimax_y_pairs(fa, 1:5)
```
Similarly, a pairs plot of inverse participation ratios (IPR) can be a good way to check for singular vector localization (and thus overfitting!).
```{r}
plot_ipr_pairs(fa)
```
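As background, one common definition of the inverse participation ratio of a unit-length vector is the sum of its fourth powers. The sketch below uses that definition to show the two extremes; the helper `ipr()` is written here for illustration, and the exact normalization used inside `plot_ipr_pairs()` may differ.

```r
# Inverse participation ratio of a vector, after scaling to unit length.
ipr <- function(x) {
  x <- x / sqrt(sum(x^2))
  sum(x^4)
}

n <- 100

# Fully delocalized: mass spread evenly over all n entries.
delocalized <- rep(1, n)

# Fully localized: all mass on a single entry.
localized <- c(1, rep(0, n - 1))

ipr(delocalized)  # equals 1 / n up to floating point error
ipr(localized)    # equals 1
```

Under this definition, singular vectors with an IPR far above `1 / n` put most of their mass on a handful of nodes, which is the localization the plot is meant to reveal.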
Finally, the estimated mixing matrix `B` can be visualized with:
```{r}
plot_mixing_matrix(fa)
```
## References
[1] Rohe, Karl, and Muzhe Zeng. “Vintage Factor Analysis with Varimax Performs Statistical Inference.” Journal of the Royal Statistical Society Series B: Statistical Methodology 85, no. 4 (September 29, 2023): 1037–60. https://doi.org/10.1093/jrsssb/qkad029.
Code to reproduce the results from the paper is [available here](https://github.com/RoheLab/vsp-paper).