Dear Stata users.
I have a question, maybe more related to the theory so please tell me if I am off-topic.
I was just wondering: what changes if I remove the intercept that I have in each class-membership function in a latent profile analysis? And, most importantly, when would you suggest removing it (or constraining the intercepts to be equal across classes)?
Finally, what is the Stata command that allows me to do this? Thank you
Intercept in latent profile analysis
-
Originally posted by Andrea Baldin:
...(or constraining the intercepts to be equal across classes?).
...
Code:
constraint 1 [2.C]_cons = [1.C]_cons
quietly gsem (glucose insulin sspg <- _cons), lclass(C 2) lcinvariant(none) constraint(1) nolog
estat lcprob

--------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
           C |
          1  |         .5          .            .           .
          2  |         .5          .            .           .
--------------------------------------------------------------

estimates table

---------------------------
    Variable |   active
-------------+-------------
1b.C         |
       _cons |  (omitted)
-------------+-------------
2.C          |
       _cons |          0
-------------+-------------
glucose      |
           C |
          1  |  35.918484
          2  |   76.94573
-------------+-------------
insulin      |
           C |
          1  |  16.486854
          2  |  21.213746
-------------+-------------
sspg         |
           C |
          1  |  11.037997
          2  |  27.461208
-------------+-------------
var(e.gluc~e)|
           C |
          1  |  21.977333
          2  |  1265.8426
var(e.insu~n)|
           C |
          1  |  26.146252
          2  |  278.78742
  var(e.sspg)|
           C |
          1  |  23.972947
          2  |  70.536606
---------------------------
-
Originally posted by Andrea Baldin:
Thank you Weiwen for your reply.
I mean the intercept in the membership function, when I add covariates. But I think your reasoning holds also in this case. I just remember that the software Latent Gold allows for this option, and in several (but not all) papers that apply latent profile analysis, the intercept is not included among the results.
However, I'm still not sure what you mean by omitting the intercepts from the multinomial model. I don't think the multinomial model works at all if there are no intercepts. The intercepts control the proportion of the sample that's in each latent class. If a paper omitted presenting the intercepts, the latent class/profile model would still have estimated them behind the scenes. If you had constrained the intercepts to be equal across all classes, you'd be telling Stata to operate under the constraint that the proportions of each latent class are equal, which is not something I have ever seen anybody do.
Per the manual, the probability of being in latent class 1 is:
P(C = 1) = exp(gamma1) / [exp(gamma1) + exp(gamma2)]
where gamma-c is the intercept for the c-th latent class, and gamma1 = 0 because it's the base class.
So, you can verify for yourself from the estimates table in my earlier post that, by the formula, P(C = 1) = 1 / [1 + exp(-.236545)] = 0.5588620. Or you can use the appropriate postestimation command:
Code:
estat lcprob

Latent class marginal probabilities             Number of obs     =        145

--------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
           C |
          1  |    .558862   .0445136      .4706988    .6434637
          2  |    .441138   .0445136      .3565363    .5293012
--------------------------------------------------------------
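For anyone who wants to check the arithmetic, here is a minimal sketch that reproduces the hand calculation in Stata. The scalar names gamma2, p1, and p2 are made up for illustration, and the intercept is typed in from the estimates table rather than pulled from stored results.

```stata
* Intercept for class 2, copied by hand from the estimates table.
* Class 1 is the base class, so its intercept (gamma1) is 0.
scalar gamma2 = -.236545

* Multinomial logit with two classes:
*   P(C = 1) = exp(0) / [exp(0) + exp(gamma2)]
*   P(C = 2) = exp(gamma2) / [exp(0) + exp(gamma2)]
scalar p1 = 1/(1 + exp(scalar(gamma2)))
scalar p2 = exp(scalar(gamma2))/(1 + exp(scalar(gamma2)))

display "P(C = 1) = " %8.6f scalar(p1)
display "P(C = 2) = " %8.6f scalar(p2)

* With only two classes this is just the inverse logit of -gamma2
display "invlogit check = " %8.6f invlogit(-scalar(gamma2))
```

The two probabilities should agree with the Margin column from estat lcprob to the digits shown there.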
If all you wanted was to export your results to Excel without the multinomial intercepts, you could use estout (available on SSC) and its drop() option:
Code:
estout ., drop(1b.C:* 2.C:*)

-------------------------
                        .
                        b
-------------------------
glucose
1.C              35.98797
2.C                77.638
-------------------------
insulin
1.C               16.5196
2.C              21.26216
-------------------------
sspg
1.C              11.17919
2.C              27.59469
-------------------------
/
var(e.gluc~C     22.62693
var(e.gluc~C     1263.401
var(e.insu~C     26.36603
var(e.insu~C     283.2775
var(e.sspg~C     25.26045
var(e.sspg~C     70.49358
-------------------------
Last edited by Weiwen Ng; 06 Feb 2019, 14:50.
-
Thank you Weiwen for your reply.
I mean the intercept in the membership function, when I add covariates. But I think your reasoning holds also in this case. I just remember that the software Latent Gold allows for this option, and in several (but not all) papers that apply latent profile analysis, the intercept is not included among the results.
-
Andrea,
As far as I'm concerned, theory questions are on topic. If the theory is too obscure, then nobody may be able to respond. However, I'm not clear what you're asking. To recap from SEM example 52, we're taking 3 indicators and fitting a model like below:
glucose = a1k + e.glucose
insulin = a2k + e.insulin
sspg = a3k + e.sspg
where a is the intercept, the first subscript indexes the 3 indicators, and the second subscript (k) indexes the latent classes.
I don't think you can remove those intercepts. They denote the mean level of each indicator in each class.
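One way to see that these intercepts are the class-specific means is to ask for the means directly after fitting the model. A minimal sketch, assuming the same example dataset and two-class specification used elsewhere in this thread; estat lcmean is the gsem postestimation command that reports latent class marginal means:

```stata
* Load the example data used in this thread
use http://www.stata-press.com/data/r15/gsem_lca2, clear

* Two-class latent profile model with class-specific error variances
quietly gsem (glucose insulin sspg <- _cons), lclass(C 2) lcinvariant(none)

* Marginal mean of each indicator within each latent class
estat lcmean
```

Because each indicator here is Gaussian with only a constant on the right-hand side, the reported means should line up with the glucose, insulin, and sspg coefficients in the estimates table.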
If you just meant to omit the _cons from the gsem command, then yes, it looks like you can, and it makes no difference:
Code:
use http://www.stata-press.com/data/r15/gsem_lca2
quietly gsem (glucose insulin sspg <- _cons), lclass(C 2) lcinvariant(none)
est store c2variant
quietly gsem (glucose insulin sspg <- ), lclass(C 2) lcinvariant(none)
est store c2variant_nointercept
est table c2variant*

----------------------------------------
    Variable | c2variant    c2varian~t
-------------+--------------------------
1b.C         |
       _cons |  (omitted)    (omitted)
-------------+--------------------------
2.C          |
       _cons |   -.236545     -.236545
-------------+--------------------------
glucose      |
           C |
          1  |  35.987969    35.987969
          2  |     77.638       77.638
-------------+--------------------------
insulin      |
           C |
          1  |  16.519601    16.519601
          2  |  21.262161    21.262161
-------------+--------------------------
sspg         |
           C |
          1  |  11.179191    11.179191
          2  |  27.594687    27.594687
-------------+--------------------------
var(e.gluc~e)|
           C |
          1  |  22.626931    22.626931
          2  |   1263.401     1263.401
var(e.insu~n)|
           C |
          1  |  26.366033    26.366033
          2  |  283.27753    283.27753
  var(e.sspg)|
           C |
          1  |  25.260446    25.260446
          2  |  70.493577    70.493577
----------------------------------------
Or did you mean to constrain the error variances to be equal across classes? (Note, they are constrained by default unless you invoke the lcinvariant(...) option.) I like to think about latent profile analysis as taking a magic elliptical cookie cutter, and you are taking k stamps out of a (multidimensional) sheet of cookie dough. If you constrain the error variances to be equal across classes, it's like you're taking equal-sized stamps each time. If you don't constrain the error variances to be equal, your cookie cutter will re-size itself between stamps.
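To make the cookie-cutter comparison concrete, here is a rough sketch that fits the two-class model both ways and puts the results side by side. It assumes the same example dataset as above; the stored-estimate names equalvar and freevar are placeholders I picked for illustration.

```stata
use http://www.stata-press.com/data/r15/gsem_lca2, clear

* Default: error variances constrained to be equal across classes
quietly gsem (glucose insulin sspg <- _cons), lclass(C 2)
estimates store equalvar

* lcinvariant(none): each class gets its own error variances
quietly gsem (glucose insulin sspg <- _cons), lclass(C 2) lcinvariant(none)
estimates store freevar

* Compare parameter estimates, then log likelihood, AIC, and BIC
estimates table equalvar freevar
estimates stats equalvar freevar
```

In the equalvar column each variance should appear with the same value in both classes, while the freevar column should show the class-specific variances from the table earlier in this post.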
Honestly, I'm not sure why the identity covariance structure (across classes, all errors have equal variance, all error terms have zero covariance) is default. It seems very restrictive. In the R package flexmix, which looks like it offers a close parallel to Stata's capabilities in gsem, the only options for the covariance structure appear to be diagonal (across classes, all error variances unrestricted, all error covariances zero) and full or unstructured (all error variances and covariances distinctly estimated).
As a side note, figure 6 in this document about flexmix shows a nice illustration of what happens when you fit a model with a diagonal versus unstructured covariance. It's harder to illustrate this in Stata because there isn't a convenient way to draw circles corresponding to the class-specific means and variances on a scatterplot.