unreasonably high standard errors in latent class analysis using gsem?

Ben Spycher

Join Date: Sep 2023

Posts: 2
#1

06 Sep 2023, 07:38

We are running a latent class analysis using gsem. The manifest variables are respiratory symptoms. The model output shows very high standard errors for the class-specific probabilities of some symptoms. This happens whenever the estimated probability of a symptom is almost 0 (I assume because no one with a high posterior probability for that class has the symptom) or almost 1 (because everyone with some posterior probability for that class has the symptom). The resulting confidence intervals are always (0,1). I do not understand why the standard errors should be affected in this way; they are much higher than those of the other symptom variables. Here is the command:

gsem (frequent_cold pneumonia_1year_bin otitis_1year_bin night_cough rhinoconjunctivitis snoring_1year_bin cough_type_dry cough_2months <-, logit) (allergic_trig_num <-, poisson), lclass(C 4) startvalues(randompr, draws(20) seed(15)) emopts(iterate(30)) nonrtolerance

Here is some output (logit scale) for the class-specific symptom probabilities (first class, first lines only). Note the high SE for rhinoconjunctivitis:

-------------------------------------------------------------------------------------
                    |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------------+----------------------------------------------------------------
frequent_cold       |
              _cons |  -1.703044   .2933649    -5.81   0.000    -2.278028   -1.128059
--------------------+----------------------------------------------------------------
pneumonia_1year_bin |
              _cons |   -2.68173   .4399326    -6.10   0.000    -3.543982   -1.819478
--------------------+----------------------------------------------------------------
otitis_1year_bin    |
              _cons |  -3.943182   1.177382    -3.35   0.001    -6.250808   -1.635555
--------------------+----------------------------------------------------------------
night_cough         |
              _cons |  -.7104435   .2693421    -2.64   0.008    -1.238344   -.1825426
--------------------+----------------------------------------------------------------
rhinoconjunctivitis |
              _cons |   -13.9867   172.9968    -0.08   0.936    -353.0543    325.0809

On the probability scale this translates to a probability of rhinoconjunctivitis of virtually 0 with a 95% CI of [0,1]. The class probability for this class is 0.255, that is, about a quarter of the 531 subjects end up in that class. So the estimates are based on a sizeable sample, and the probabilities that are not close to 0 or 1 are estimated with reasonable standard errors. Obviously the above estimates are based on weighting by each subject's posterior class probabilities, but surely that should not have such an impact?
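
For reference, the back-transformation described above can be checked directly in Stata. This is only a sketch that plugs in the coefficient and CI limits printed in the output; invlogit() is Stata's built-in inverse-logit function, and nothing here refits the model.

```stata
* Values copied from the rhinoconjunctivitis _cons row above (logit scale)
* Point estimate on the probability scale
display invlogit(-13.9867)
* Lower 95% limit on the probability scale
display invlogit(-353.0543)
* Upper 95% limit on the probability scale
display invlogit(325.0809)
```

The point estimate comes out at roughly 8e-07, and the two limits are numerically indistinguishable from 0 and 1, which is the [0,1] interval mentioned above.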
Erik Ruzek

Join Date: Oct 2017

Posts: 442
#2

06 Sep 2023, 08:39

Some quick thoughts and questions...

In circumstances like this, I always try to start from a simpler model. For example, what happens to your latent class model when you only have the logit equation in it? Or when you remove rhinoconjunctivitis? Do you get similarly large SEs?

I can say that when a binary (0/1) outcome variable has a very low incidence rate (e.g., in LCA or factor analysis), you need a very large sample size to recover trustworthy parameters for it from a model.
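
A sketch of the two simpler models, reusing the variable names and options from the command in #1. It has not been run against the actual data, and the /// line continuations assume the commands are placed in a do-file.

```stata
* (a) Logit equations only: drop the Poisson equation for allergic_trig_num
gsem (frequent_cold pneumonia_1year_bin otitis_1year_bin night_cough ///
      rhinoconjunctivitis snoring_1year_bin cough_type_dry cough_2months <-, logit), ///
     lclass(C 4) startvalues(randompr, draws(20) seed(15)) ///
     emopts(iterate(30)) nonrtolerance

* (b) Full model without rhinoconjunctivitis
gsem (frequent_cold pneumonia_1year_bin otitis_1year_bin night_cough ///
      snoring_1year_bin cough_type_dry cough_2months <-, logit) ///
     (allergic_trig_num <-, poisson), ///
     lclass(C 4) startvalues(randompr, draws(20) seed(15)) ///
     emopts(iterate(30)) nonrtolerance
```

Comparing the _cons standard errors across these fits would show whether the problem follows rhinoconjunctivitis specifically or also appears for other items whose class-specific probability is near 0 or 1.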
Ben Spycher

Join Date: Sep 2023

Posts: 2
#3

08 Sep 2023, 03:54

We did in fact build up from simpler models. The model shown is the final model selected for a publication, and we are now responding to reviewer comments about these wide CIs. But thanks for the advice; I will check how the CIs behaved in those models.

However, it still seems puzzling. A low frequency (or even the absence) of a symptom within one latent class (not overall) should simply result in a probability estimate of zero with a CI whose upper limit reaches some positive probability, but not 1. I am wondering whether I should add another option for the variance estimation, but I could not find out from the manuals how the variance of these estimates is estimated when the EM algorithm is used.
