Latent curve model with structured residuals (LCM-SR)

Meagan Docherty

01 Apr 2020, 08:33

Hello,

I am trying to fit a latent curve model with structured residuals (LCM-SR) using the sem command in Stata SE 16.0. Details of the model can be found in this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4067471/. It is essentially a latent growth curve model with regression paths among the residuals, rather than among the observed repeated measures themselves. Example input and a dataset are available on the following website, but the code is written for Mplus: http://curran.web.unc.edu/lcm-sr-data-code/
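
For readers who prefer symbols, a bivariate version of this structure can be sketched as below. The notation is an illustrative shorthand and not necessarily the paper's. It assumes linear growth with fixed time scores and lag-1 regressions among the residuals only.

```latex
% Illustrative notation (an assumption, not copied from the paper):
% y_{it}, w_{it}  : the two repeated measures for person i at time t
% \alpha, \beta   : person-specific intercept and slope (growth factors)
% \lambda_t       : fixed time scores, e.g. 0, 1, 2, ...
% \rho_{yy}, \rho_{ww} : autoregressive paths among residuals
% \rho_{yw}, \rho_{wy} : cross-lagged paths among residuals
\begin{align*}
y_{it} &= \alpha^{y}_{i} + \lambda_t \beta^{y}_{i} + \varepsilon^{y}_{it}, &
w_{it} &= \alpha^{w}_{i} + \lambda_t \beta^{w}_{i} + \varepsilon^{w}_{it}, \\
\varepsilon^{y}_{it} &= \rho_{yy}\,\varepsilon^{y}_{i,t-1} + \rho_{yw}\,\varepsilon^{w}_{i,t-1} + \nu^{y}_{it}, &
\varepsilon^{w}_{it} &= \rho_{ww}\,\varepsilon^{w}_{i,t-1} + \rho_{wy}\,\varepsilon^{y}_{i,t-1} + \nu^{w}_{it}.
\end{align*}
```

The second line is what distinguishes the model from an ordinary cross-lagged panel model: the lagged effects operate on the time-specific deviations from each person's own trajectory, not on the observed scores.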

In Mplus, the model is specified with "phantom factors" (Curran's terminology): each observed variable gets its own latent variable with a factor loading fixed at 1, so that regression paths can be added among the residuals. I assumed the same approach would be needed in Stata. This means that any model will have at least as many latent variables as repeated measures, plus the latent growth factors. When I try to run this model, I get error r(503), "too many latent variables":

Code:

Fitting saturated model:

Iteration 0:   log likelihood = -47025.969  
Iteration 1:   log likelihood = -40787.078  
Iteration 2:   log likelihood = -35921.064  
Iteration 3:   log likelihood = -35052.114  
Iteration 4:   log likelihood =  -34454.15  
Iteration 5:   log likelihood = -34344.854  
Iteration 6:   log likelihood = -34288.373  
Iteration 7:   log likelihood =  -34286.47  
Iteration 8:   log likelihood = -34286.458  
Iteration 9:   log likelihood = -34286.458  

Fitting baseline model:

Iteration 0:   log likelihood = -45450.075  
Iteration 1:   log likelihood = -45400.399  
Iteration 2:   log likelihood = -44914.559  
Iteration 3:   log likelihood = -44872.965  
Iteration 4:   log likelihood = -44870.909  
Iteration 5:   log likelihood = -44870.904  
Iteration 6:   log likelihood = -44870.904  
model not identified;
too many latent variables
r(503);

Here is an example using two variables measured at five time points each, with an intercept and a linear slope for each variable. I have constrained the autoregressive and cross-lagged paths among the residuals to be equal over time, along with the residual variances and covariances at times 2-5. It, St, Iw, and Sw are the growth factors (intercepts and slopes), and Rt* and Rw* are the phantom factors that are supposed to represent the residuals.
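
As a rough check on the counting rule for this example, the sample moments can be compared with the parameters named in the command below. This is only a sketch: it assumes that software defaults free no parameters beyond those listed, and the counting rule is necessary but not sufficient for identification.

```latex
% Counting-rule sketch for 10 observed variables (5 waves x 2 measures).
% Named free parameters, grouped as they appear in the command:
%   4  growth-factor means (It, St, Iw, Sw)
%   10 growth-factor variances (4) and covariances (6)
%   4  residual paths (ar1, cl1, ar2, cl2)
%   6  residual (co)variances: Rt68, Rw68, v1, v2, Rt68*Rw68, c
\begin{align*}
\text{sample moments} &= \underbrace{\tfrac{10 \cdot 11}{2}}_{\text{variances and covariances}}
                        + \underbrace{10}_{\text{means}} = 65, \\
\text{named free parameters} &= 4 + 10 + 4 + 6 = 24, \\
\text{implied degrees of freedom} &= 65 - 24 = 41.
\end{align*}
```

Under these assumptions, the model as written asks for fewer parameters than there are moments, so the error does not appear to come from a simple excess of parameters.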

Code:

use "http://www.stata-press.com/data/r13/nlswork", clear

keep idcode year wks_work ttl_exp

reshape wide wks_work ttl_exp, i(idcode) j(year)

sem (ttl_exp68 <- It@1 St@0 Rt68@1 _cons@0) ///
    (ttl_exp69 <- It@1 St@1 Rt69@1 _cons@0) ///
    (ttl_exp70 <- It@1 St@2 Rt70@1 _cons@0) ///
    (ttl_exp71 <- It@1 St@3 Rt71@1 _cons@0) ///
    (ttl_exp72 <- It@1 St@4 Rt72@1 _cons@0) ///
    (wks_work68 <- Iw@1 Sw@0 Rw68@1 _cons@0) ///
    (wks_work69 <- Iw@1 Sw@1 Rw69@1 _cons@0) ///
    (wks_work70 <- Iw@1 Sw@2 Rw70@1 _cons@0) ///
    (wks_work71 <- Iw@1 Sw@3 Rw71@1 _cons@0) ///
    (wks_work72 <- Iw@1 Sw@4 Rw72@1 _cons@0) ///
    (Rt69 <- Rt68@ar1 Rw68@cl1 _cons@0) ///
    (Rt70 <- Rt69@ar1 Rw69@cl1 _cons@0) ///
    (Rt71 <- Rt70@ar1 Rw70@cl1 _cons@0) ///
    (Rt72 <- Rt71@ar1 Rw71@cl1 _cons@0) ///
    (Rw69 <- Rw68@ar2 Rt68@cl2 _cons@0) ///
    (Rw70 <- Rw69@ar2 Rt69@cl2 _cons@0) ///
    (Rw71 <- Rw70@ar2 Rt70@cl2 _cons@0) ///
    (Rw72 <- Rw71@ar2 Rt71@cl2 _cons@0), ///
    means(It St Iw Sw) var(It St It*St Iw Sw Iw*Sw It*Iw St*Sw It*Sw Iw*St ///
    e.ttl_exp68@0 e.ttl_exp69@0 e.ttl_exp70@0 e.ttl_exp71@0 e.ttl_exp72@0 ///
    e.wks_work68@0 e.wks_work69@0 e.wks_work70@0 e.wks_work71@0 e.wks_work72@0 ///
    Rt68 e.Rt69@v1 e.Rt70@v1 e.Rt71@v1 e.Rt72@v1 Rw68 e.Rw69@v2 e.Rw70@v2 e.Rw71@v2 e.Rw72@v2 ///
    Rt68*Rw68 e.Rt69*e.Rw69@c e.Rt70*e.Rw70@c e.Rt71*e.Rw71@c e.Rt72*e.Rw72@c) method(mlmv)

This example produces the "too many latent variables" error shown above. Am I doing something wrong, or is this model simply not possible to estimate in Stata? I've searched the documentation for a maximum number of latent variables in sem and tried to think through model identification issues, but I'm coming up short. Any help would be greatly appreciated.
