fixed small bug and removed complex examples from vignettes #160

Merged 1 commit on Oct 6, 2023
19 changes: 10 additions & 9 deletions R/galamm.R
```diff
@@ -385,13 +385,17 @@ galamm <- function(formula, weights = NULL, data, family = gaussian,
   if (length(lambda_inds) > 1) {
     pars <- c(1, opt$par[lambda_inds])
 
-    gobj$lmod$reTrms$Zt@x <- if (is.null(factor_interactions)) {
-      pars[lambda_mappings$lambda_mapping_Zt + 2L]
+    if (is.null(factor_interactions)) {
+      # Must take into account the possible existence of zero
+      inds <- lambda_mappings$lambda_mapping_Zt + 2L
+      gobj$lmod$reTrms$Zt@x[inds == 0] <- 0
+      gobj$lmod$reTrms$Zt@x[inds != 0] <- pars[inds]
     } else {
-      as.numeric(Map(function(l, x) sum(pars[l + 2L] * x),
-        l = lambda_mappings$lambda_mapping_Zt,
-        x = lambda_mappings$lambda_mapping_Zt_covs
-      ))
+      gobj$lmod$reTrms$Zt@x <-
+        as.numeric(Map(function(l, x) sum(pars[l + 2L] * x),
+          l = lambda_mappings$lambda_mapping_Zt,
+          x = lambda_mappings$lambda_mapping_Zt_covs
+        ))
     }
   }
```
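The bug being fixed here is an R indexing pitfall: subscripting with an index of 0 silently drops elements, so the old one-liner produced a vector shorter than `Zt@x` whenever the mapping contained zeros. A minimal standalone sketch with made-up values (not from the package):

```r
# Made-up values: an index of 0 is silently dropped in R, so pars[inds]
# is too short whenever inds contains zeros.
pars <- c(1, 0.8, 1.2)
inds <- c(2L, 0L, 3L)  # 0 marks an entry that should be set to zero
pars[inds]             # c(0.8, 1.2): length 2, not 3 -- the old bug

# The fix handles the two cases separately; pars[inds], with its zero
# positions dropped, then lines up exactly with the slots where inds != 0.
x <- numeric(3)
x[inds == 0] <- 0
x[inds != 0] <- pars[inds]
x                      # c(0.8, 0.0, 1.2)
```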

```diff
@@ -484,8 +488,6 @@ galamm <- function(formula, weights = NULL, data, family = gaussian,
     lambda_interaction_inds <- lambda_standard_inds <- lambda_inds
   }
 
-
-
   ret$parameters <- list(
     beta_inds = beta_inds,
     dispersion_parameter = final_model$phi,
@@ -511,6 +513,5 @@ galamm <- function(formula, weights = NULL, data, family = gaussian,
 
   ret$gam <- gamm4.wrapup(gobj, ret, final_model)
 
-
   ret
 }
```
2 changes: 0 additions & 2 deletions R/gamm4-functions.R
```diff
@@ -289,9 +289,7 @@ gamm4.wrapup <- function(gobj, ret, final_model) {
     Xfp[, ind] <- gobj$G$Xf[, ind, drop = FALSE] %*% B[ind, ind, drop = FALSE]
   }
 
-
   object$coefficients <- p
-
   vr <- VarCorr(ret)
 
   scale <- as.numeric(attr(vr, "sc"))^2
```
7 changes: 1 addition & 6 deletions cran-comments.md
```diff
@@ -8,7 +8,7 @@ latent variable models, often used in psychometrics.
 
 ## Test environments
 
-* Windows, r-devel, r-release, and r-oldrelease
+* Windows, r-devel and r-release
 * Local Apple Silicon M1, on R4.2.1
 * R-CMD-check via GitHub Actions on windows-latest, macOS-latest, ubuntu-20.04 (release), and ubuntu-20.04 (devel).
 
@@ -32,8 +32,3 @@ Possibly mis-spelled words in DESCRIPTION:
   et (11:42)
   nonlinearly (14:5)
 
-Found the following (possibly) invalid URLs:
-  URL: https://doi.org/10.3102/1076998611417628
-    From: inst/doc/galamm.html
-    Status: 503
-    Message: Service Unavailable
```
203 changes: 0 additions & 203 deletions vignettes-raw/semiparametric.Rmd
@@ -268,207 +268,4 @@ confint(mod, parm = "lambda")

### Multivariate Gaussian Model

We now do a joint analysis of domain 1 and domain 3, for which all item responses are conditionally normally distributed.

Letting $\eta_{1}$ denote latent ability in domain 1 and $\eta_{3}$ denote latent ability in domain 3, and $\lambda_{i1}$ and $\lambda_{i3}$ be the corresponding factor loadings for the $i$th item measuring each domain, the measurement model is now

$$
y_{ij} = \beta_{ij} + \lambda_{ij} \eta_{j} + \epsilon_{ij}, \quad j = 1, 3.
$$

To avoid unnecessary complexity, we assume the residual standard deviation is the same for all responses. The [vignette on linear mixed models with heteroscedastic residuals](https://lcbc-uio.github.io/galamm/articles/lmm_heteroscedastic.html) shows how this assumption can be relaxed. The data were also simulated with all residual standard deviations equal, so in this case the homoscedasticity assumption is satisfied.

In the structural model, we have a smooth term for the relationship between the latent trait and $x$, and we have random intercepts for a given timepoint within subjects, $\zeta^{(2)}$, and for a given subject across timepoints, $\zeta^{(3)}$:

$$
\eta_{j} = h_{j}(x) + \zeta_{j}^{(2)} + \zeta_{j}^{(3)}, \quad j = 1, 3.
$$

We first subset the cognition dataset to get the measurements of domains 1 and 3, which are conditionally normally distributed. We also create two dummy variables, `domain1` and `domain3`, which we need when defining the formulas below.

```{r}
dat <- subset(cognition, domain %in% c(1, 3))
dat <- cbind(
  dat,
  model.matrix(~ 0 + domain, data = dat)[, c("domain1", "domain3")]
)
```

Below is a plot of the data we are analyzing.

```{r, semiparametric-multivariate-spaghetti, fig.cap="Spaghetti plot of measurements of domain 1 and 3."}
ggplot(dat, aes(x = x, y = y, group = id)) +
  geom_point(size = .1) +
  geom_line(linewidth = .1, alpha = .3) +
  facet_wrap(
    vars(item),
    scales = "free_y",
    labeller = as_labeller(function(x) {
      paste("Domain", substr(x, 1, 1), "item", substr(x, 2, 2))
    })
  ) +
  theme(
    strip.background = element_blank(),
    panel.grid = element_blank()
  )
```

Mathematically, the factor loading matrix we want is

$$
\Lambda =
\begin{pmatrix}
1 & 0 \\
\lambda_{12} & 0 \\
\lambda_{13} & 0 \\
0 & 1 \\
0 & \lambda_{22} \\
0 & \lambda_{23} \\
0 & \lambda_{24}
\end{pmatrix}
$$

In code, that becomes:

```{r}
(lmat <- matrix(
  c(
    1, NA, NA, 0, 0, 0, 0,
    0, 0, 0, 1, NA, NA, NA
  ),
  ncol = 2
))
```

Below is the call to fit the model.

```{r}
mod <- galamm(
  formula = y ~ domain +
    sl(x, k = 6, by = domain, load.var = c("ability1", "ability3")) +
    (0 + domain1:ability1 + domain3:ability3 | id) +
    (0 + domain1:ability1 | id:timepoint) +
    (0 + domain3:ability3 | id:timepoint),
  data = dat,
  load.var = "item",
  lambda = list(lmat),
  factor = list(c("ability1", "ability3"))
)
```

Before showing the results, some comments are probably useful. In the `formula`, the first term `domain` represents an intercept per domain. Ideally we should have had an intercept per item, representing the item bias, but in this case all item biases were set to zero when simulating the data, so estimating an intercept for each of them is unnecessary. Next, the term `sl(x, k = 6, by = domain, load.var = c("ability1", "ability3"))` specifies that we want one smooth term estimated independently for each level of the factor variable `domain`. For the two levels of `domain`, the loadings `ability1` and `ability3` are applied, respectively; these represent the two columns in the loading matrix above. Setting `k = 6` specifies that we want six basis functions for each of the smooth terms. The first random effects term `(0 + domain1:ability1 + domain3:ability3 | id)` corresponds to $\lambda \zeta_{j}^{(3)}$, and by specifying both intercepts in a single term, we allow the level-3 random intercepts to be correlated. The next two random effects terms `(0 + domain1:ability1 | id:timepoint)` and `(0 + domain3:ability3 | id:timepoint)` correspond to $\lambda \zeta_{j}^{(2)}$, and by specifying them separately we state that the level-2 random intercepts should be uncorrelated. The assumption of uncorrelated random intercepts at level 2 is reasonable here because we simulated the data this way. If correlations should be allowed at both levels, we could simply replace all three random effects terms with the single term `(0 + domain1:ability1 + domain3:ability3 | id / timepoint)`, as sketched below.
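As a sketch of that alternative, not run here, the call could look as follows; everything except the random effects terms is unchanged from the model above, and the object name `mod_corr` is ours:

```{r, eval=FALSE}
mod_corr <- galamm(
  formula = y ~ domain +
    sl(x, k = 6, by = domain, load.var = c("ability1", "ability3")) +
    (0 + domain1:ability1 + domain3:ability3 | id / timepoint),
  data = dat,
  load.var = "item",
  lambda = list(lmat),
  factor = list(c("ability1", "ability3"))
)
```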

We can now look at the model output, which shows that the factor loadings are estimated very close to their true values, and that the level-3 random intercepts are estimated to have a correlation of 0.61, which is a reasonable approximation of the true value 0.4, since random effect correlations are notoriously hard to estimate.

```{r}
summary(mod)
```

We now study the smooth terms. Since in this case we know the true values, we include them for comparison. First we look at the smooth for domain 1, and we see that the estimate captures the true value very well.

```{r, semiparametric-multivariate-gaussian-smooth1, fig.cap="Comparison of estimate to true value."}
# x-values to plot the true function
x <- seq(from = 0, to = 1, by = .01)
# True function
f0 <- function(x) 2 * sin(pi * x)
# Scale to mean zero across x
y <- f0(x) - mean(f0(x))

# Plot estimate
plot_smooth(mod, select = 1, scale = 0)
# Overlay true curve in red
lines(x, y, col = "red")

```

Next we look at domain 3. In this case we see that our estimate oversmooths the true function. This is probably because we set the number of basis functions to only six, to avoid convergence issues. In any case, the overall shape of the true trajectory is clearly captured.

```{r, semiparametric-multivariate-gaussian-smooth2, fig.cap="Comparison of estimate to true value."}
# True function
f2 <- function(x) {
  0.2 * x^11 * (10 * (1 - x))^6 +
    10 * (10 * x)^3 * (1 - x)^10
}
# Scale to mean zero across x
y <- f2(x) - mean(f2(x))

# Plot estimate
plot_smooth(mod, select = 2, scale = 0)
# Overlay true curve in red
lines(x, y, col = "red")

```

### Multivariate Binomial and Gaussian Model

We now extend the model above to also include the items measuring domain 2, which are all binomially distributed with a single trial.

We start by adding dummy variables for each domain, as before.

```{r}
mm <- model.matrix(~ 0 + domain, data = cognition)
dat <- cbind(cognition, mm)
```

Next we define the factor loading matrix, which now has three columns.

```{r}
(lmat <- matrix(
  c(
    1, NA, NA, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 1, NA, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 1, NA, NA, NA
  ),
  ncol = 3
))
```
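As a small sanity check of our own (not in the original vignette), each item should load on exactly one factor, i.e., each row of `lmat` should have exactly one entry that is either fixed to 1 or freely estimated (`NA`):

```{r}
# Count entries per row that are NA or non-zero; should be 1 for every item
rowSums(is.na(lmat) | lmat != 0)
```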

The responses are now either conditionally Gaussian or conditionally binomial, so we define the family object and family mapping as follows:

```{r}
family <- c(gaussian, binomial)
family_mapping <- ifelse(dat$domain %in% c(1, 3), 1L, 2L)
```
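As a quick sanity check (our addition, not in the original vignette), we can cross-tabulate the mapping against the domain variable; domains 1 and 3 should map to family 1 (Gaussian) and domain 2 to family 2 (binomial):

```{r}
table(family_mapping, dat$domain)
```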

Since this model will be computationally challenging, we need to think about the optimization. We define the control object as follows: `trace = 3` means that we want the `optim()` function to be relatively verbose, `REPORT = 5` means that we want it to report every fifth iteration, and `factr = 1e9` means that convergence is declared when the reduction in marginal log-likelihood is within $10^{9}$ times machine precision. The default for `factr` is `1e7`, so we are less strict than the default. Finally, `reduced_hessian = TRUE` means that at the maximum marginal likelihood solution, the Hessian matrix should only contain partial derivatives with respect to the fixed regression coefficients and the factor loadings, and not with respect to the variance components, since including the latter typically yields non-positive definite Hessian matrices for models of this complexity. This also means that Wald-type confidence intervals based on this Hessian matrix are more approximate than usual, since they ignore the uncertainty in the variance components. However, the simulations in @sorensenLongitudinalModelingAgeDependent2023 suggest that the resulting confidence intervals are still quite good for most parameters.


```{r}
control <- galamm_control(
  optim_control = list(trace = 3, REPORT = 5, factr = 1e9),
  reduced_hessian = TRUE
)
```
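To see how strict this actually is, note that `factr` multiplies the machine precision, so the implied tolerance on the log-likelihood reduction is roughly as follows (our back-of-the-envelope check, not in the original vignette):

```{r}
# Implied convergence tolerance: factr times machine epsilon
1e9 * .Machine$double.eps
```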


For this example, we also remove the level-2 disturbances, as they lead to further computational challenges that are not directly relevant here. For the real data analysis in the first example of @sorensenLongitudinalModelingAgeDependent2023, level-2 disturbances were included and the model did converge, so it is not necessary in general to drop this level.

```{r}
mod <- galamm(
  formula = y ~ domain +
    sl(x, k = 6, by = domain,
       load.var = c("ability1", "ability2", "ability3")) +
    (0 + domain1:ability1 + domain2:ability2 + domain3:ability3 | id),
  data = dat,
  family = family,
  family_mapping = family_mapping,
  load.var = "item",
  lambda = list(lmat),
  factor = list(c("ability1", "ability2", "ability3")),
  control = control
)
```

The summary output is shown below:

```{r}
summary(mod)
```

The estimated smooth terms are shown in the plots below.

```{r, semiparametric-mixedresponse-smooths1, fig.cap="First smooth term for semiparametric mixed response model."}
plot_smooth(mod, select = 1, scale = 0)
```

```{r, semiparametric-mixedresponse-smooths2, fig.cap="Second smooth term for semiparametric mixed response model."}
plot_smooth(mod, select = 2, scale = 0)
```

```{r, semiparametric-mixedresponse-smooths3, fig.cap="Third smooth term for semiparametric mixed response model."}
plot_smooth(mod, select = 3, scale = 0)
```

In complex models like this, where convergence is hard, the Nelder-Mead algorithm [@nelderSimplexMethodFunction1965] might be a good alternative to try, as it is usually more robust, although slower; a sketch is given below. See the [optimization vignette](https://lcbc-uio.github.io/galamm/articles/optimization.html) for details.
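As a sketch, and assuming the `method` argument of `galamm_control()` accepts `"Nelder-Mead"` as described in the optimization vignette, the switch would only require a new control object (not run):

```{r, eval=FALSE}
# Assumption: galamm_control() exposes a `method` argument, per the
# optimization vignette. Pass this as `control` in the galamm() call above.
control_nm <- galamm_control(method = "Nelder-Mead")
```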



# References
Binary file modified vignettes/semiparametric-gaussian-gamm4-diagnostic-1.png
Binary file modified vignettes/semiparametric-gaussian-gamm4-smooth-1.png
Binary file modified vignettes/semiparametric-mixedresponse-smooths1-1.png
Binary file modified vignettes/semiparametric-multivariate-gaussian-smooth1-1.png