lrtest to compare SEM models

emiro molina

Join Date: Apr 2020
Posts: 7


21 Feb 2023, 12:28

Hi. I am trying to compare the fit of two SEM models. One has two latent variables; the other has only one of the two. The full model is
sem (CompDig -> cDa CDb cDc cDd cDe cDf) (OriDig -> oDa oDb oDd oDe)
and the restricted model is
sem (CompDig -> cDa CDb cDc cDd cDe cDf)
I would like to perform a likelihood-ratio test. However, you cannot directly apply the lrtest command because, as written, the models are not nested. In fact, the second model has fewer degrees of freedom than the first, since it is fitted to fewer observed variables and its covariance structure is smaller. What is the correct way of restricting the first model to make the second a model nested in the first? Restricting the oD’s to zero does not work, the matrix is not full rank. After many attempts I came to this solution (I have 301 observations and am using Stata 17):

Code:

qui sem (CompDig -> cDa - cDf) (OriDig -> oDa oDb oDd oDe)
estat gof
----------------------------------------------------------------------------
Fit statistic        |      Value   Description
---------------------+------------------------------------------------------
Likelihood ratio     |
         chi2_ms(34) |     77.691   model vs. saturated
            p > chi2 |      0.000
         chi2_bs(45) |   2334.832   baseline vs. saturated
            p > chi2 |      0.000
----------------------------------------------------------------------------
estimates store full
qui sem (CompDig -> cDa - cDf) (-> oDa oDb oDd oDe, nocon)
estat gof
----------------------------------------------------------------------------
Fit statistic        |      Value   Description
---------------------+------------------------------------------------------
Likelihood ratio     |
         chi2_ms(43) |   4028.621   model vs. saturated
            p > chi2 |      0.000
         chi2_bs(45) |   2334.832   baseline vs. saturated
            p > chi2 |      0.000
----------------------------------------------------------------------------
lrtest full

Likelihood-ratio test
Assumption: . nested within full

 LR chi2(9) = 3950.93
Prob > chi2 =  0.0000

Is this the correct procedure? At least the degrees of freedom and the size of the matrices are consistent. Thanks!!!

Tags: lrtest, SEM

Joseph Coveney

Join Date: Apr 2014

Posts: 4420
#2

22 Feb 2023, 00:48

Originally posted by emiro molina

What is the correct way of restricting the first model to make the second a model nested in the first?

Something like the following should do it.

Code:

sem (CompDig -> cDa CDb cDc cDd cDe cDf) (OriDig -> oDa oDb oDd oDe), ///
    latent(CompDig OriDig) ///
    nocnsreport nodescribe nofootnote nolog
estimates store Full

sem (CompDig -> cDa CDb cDc cDd cDe cDf) (OriDig -> oDa@0 oDb@0 oDd@0 oDe@0), ///
    latent(CompDig OriDig) ///
    variance(OriDig@0) covariance(CompDig*OriDig@0) ///
    nocnsreport nodescribe nofootnote nolog
lrtest Full

Restricting the oD’s to zero does not work, the matrix is not full rank.

If you've left the latent factor OriDig in, then yes, you're trying to define a latent factor without any manifest variables to identify it. You need to constrain the latent factor's variance, and its covariance with CompDig, to zero as well (see code above).

After many attempts I came to this solution

qui sem (CompDig -> cDa - cDf) (-> oDa oDb oDd oDe, nocon)

Is this the correct procedure?

You need to have the manifest variables' intercepts (means) in the model; otherwise, you're unintentionally testing whether they're all jointly zero in addition to anything that you're really interested in. Review the output of the nested model (second model) when you run the code that I show above. You'll see that the means of oDa, oDb, oDd and oDe are all still in the model; only their factor loadings on OriDig (along with OriDig's variance and the covariance of CompDig and OriDig) are all set to zero.
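One way to check what the nested comparison is actually testing is to pull the model-versus-saturated chi-squared and its degrees of freedom from each fit and difference them by hand. The following is a minimal sketch, assuming the variable names used in this thread and that estat gof leaves its results in r(chi2_ms) and r(df_ms); the scalar names (chi2_full, df_full, and so on) are made up for illustration.

```stata
* Full two-factor model
quietly sem (CompDig -> cDa CDb cDc cDd cDe cDf) (OriDig -> oDa oDb oDd oDe), ///
    latent(CompDig OriDig)
estimates store Full
quietly estat gof
scalar chi2_full = r(chi2_ms)
scalar df_full   = r(df_ms)

* Nested model: OriDig's loadings, variance and covariance set to zero,
* intercepts of oDa-oDe left in the model
quietly sem (CompDig -> cDa CDb cDc cDd cDe cDf) (OriDig -> oDa@0 oDb@0 oDd@0 oDe@0), ///
    latent(CompDig OriDig) ///
    variance(OriDig@0) covariance(CompDig*OriDig@0)
quietly estat gof
scalar chi2_nested = r(chi2_ms)
scalar df_nested   = r(df_ms)

* Both fits share the same saturated model, so the differences below
* should match the statistic and df that lrtest reports
display "df difference = " scalar(df_nested) - scalar(df_full)
display "LR chi2       = " %9.3f scalar(chi2_nested) - scalar(chi2_full)

lrtest Full
return list
```

If the df difference is larger than the number of parameters you meant to constrain, something else (for example, the intercepts) has been restricted as well.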
Attached Files

OriDig.smcl (14.6 KB, 1 view)

OriDig.do (789 Bytes, 1 view)
Joseph Coveney

Join Date: Apr 2014

Posts: 4420
#3

22 Feb 2023, 01:17

By the way, you can write that second model (the nested model) like this

Code:

sem (CompDig -> cDa CDb cDc cDd cDe cDf) ( -> oDa oDb oDd oDe), ///
    latent(CompDig) ///
    nocnsreport nodescribe nofootnote nolog
lrtest Full

which resembles what you show, but without the noconstant option for reasons that I give above. It gives identical results to the syntax above in #2, but I think that the syntax above in #2 for describing the nested model is more explicit as far as the constraints are concerned, that is, it's easier to see what's going on when reviewing the code.
emiro molina

Join Date: Apr 2020

Posts: 7
#4

22 Feb 2023, 10:25

Hi, thanks very much for your prompt answer. In fact, as you say, restricting the variances and covariances makes the augmented model estimable. My doubt is about how to nest the model correctly in order to test the contribution with lrtest. I understand your point: "You need to have the manifest variables' intercepts (means) in the model; otherwise, you're unintentionally testing whether they're all jointly zero in addition to anything that you're really interested in."
For the full model, with 10 items, we have 10 x 11/2 = 55 variances and covariances to fit. The (unstandardized) model (CompDig -> cDa - cDf) (OriDig -> oDa oDb oDd oDe) requires 21 parameters to be fitted, so that there are 34 df. In fact, working with your simulated data, one obtains:

LR test of model vs. saturated: chi2(34) = 32.01

Of these 21 parameters, the CompDig section accounts for 12. When you intend to restrict the model to test whether (OriDig -> oDa oDb oDd oDe) is redundant, I was under the impression that the restricted model should have 55 - 12 = 43 df. Is that right? When we fit the model with the constants for the oD’s:

sem (CompDig -> cDa CDb cDc cDd cDe cDf) (OriDig -> oDa@0 oDb@0 oDd@0 oDe@0), ///
latent(CompDig OriDig) ///
variance(OriDig@0) covariance(CompDig*OriDig@0) ///
nocnsreport nodescribe nolog

we get 39 df. However, when we use your excellent suggestion to fit the restricted model and add the “nocons” restriction, we get a restricted model with 43 df:

sem (CompDig -> cDa CDb cDc cDd cDe cDf) (OriDig -> oDa@0 oDb@0 oDd@0 oDe@0, nocons), ///
latent(CompDig OriDig) ///
variance(OriDig@0) covariance(CompDig*OriDig@0) ///
nocnsreport nodescribe nolog

LR test of model vs. saturated: chi2(43) = 371.16

Please enlighten me. Which is the correct way to restrict a model with two latent variables in order to contrast the full model with a restricted version that has only one latent variable?
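The degrees of freedom quoted in this post can be tallied as follows. This is a sketch that assumes sem's defaults, which the thread does not spell out: means and intercepts are part of the model, and the first loading of each factor is anchored at 1. The symbols m (number of fitted moments) and q (number of free parameters) are introduced here only for illustration.

```latex
\begin{align*}
m &= \underbrace{10}_{\text{means}} + \underbrace{\tfrac{10 \cdot 11}{2}}_{\text{variances and covariances}} = 65 \\[4pt]
% Full model: 8 free loadings, 10 error variances, 2 factor variances,
% 1 factor covariance, 10 intercepts
q_{\text{full}} &= 8 + 10 + 2 + 1 + 10 = 31, &
\mathit{df}_{\text{full}} &= 65 - 31 = 34 \\[4pt]
% OriDig constrained away, intercepts kept: 5 free loadings,
% 10 error variances, 1 factor variance, 10 intercepts
q_{\text{nested}} &= 5 + 10 + 1 + 10 = 26, &
\mathit{df}_{\text{nested}} &= 65 - 26 = 39 \\[4pt]
% Same model with the four oD intercepts also dropped (nocons)
q_{\text{nocons}} &= 26 - 4 = 22, &
\mathit{df}_{\text{nocons}} &= 65 - 22 = 43
\end{align*}
```

On this count, 39 - 34 = 5 corresponds to the OriDig parameters set to zero (three free loadings, one variance, one covariance), while the further 43 - 39 = 4 come from dropping the four intercepts of oDa, oDb, oDd and oDe. Counting only the 55 variances and covariances against 21 parameters gives the same 34 for the full model because the 10 means and 10 intercepts cancel.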
Joseph Coveney

Join Date: Apr 2014

Posts: 4420
#5

22 Feb 2023, 20:07

Originally posted by emiro molina

Which is the correct way to restrict a model with two latent variables in order to contrast the full model with a restricted version that has only one latent variable?

Well, if I wanted to test whether a second latent factor contributes to a better fit, then I would omit the second and load all of its manifest variables onto the first latent factor.

Code:

sem (cDa CDb cDc cDd cDe cDf <- CompDig) (oDa oDb oDd oDe <- OriDig), latent(CompDig OriDig)
estimates store Full

sem (cDa CDb cDc cDd cDe cDf oDa oDb oDd oDe <- CompDig), latent(CompDig)
lrtest Full

But if I wanted to test whether the parameters defining the second are jointly zero, then I would constrain them to zero, leaving the manifest variables' intercepts in, as in #2 (and #3) above. This is analogous to

Code:

regress y c.x
estimates store Full

regress y
lrtest Full

Yours is analogous to the following, which you can do, but I suspect that it's not giving you what you're looking for.

Code:

regress y, noconstant
lrtest Full

Last edited by Joseph Coveney; 22 Feb 2023, 20:20.
emiro molina

Join Date: Apr 2020

Posts: 7
#6

24 Feb 2023, 09:22

I see your point. Your #5 clarifies my doubts. You have been very helpful. Thank you very much.