  • LCA using gsem (not concave) and convergence not achieved

    Hello Friends,

    I am conducting LCA using gsem. I have 6 binary indicator variables and 3725 participants.

    Code:
     gsem (variable_1 variable_2 variable_3 variable_4 variable_5 variable_6 <- , logit), lclass(C 2)
    Despite my best efforts, I continue to get feedback such as "not concave" and "convergence not achieved".
    This occurs for models with 2, 3, 4 and 5 classes.

    I have tried the startvalues options such as randomid, randompr and jitter. I have changed the iterate value to a higher number.
    I have used the difficult option.
    I am a bit unclear regarding the constraint options that work with matrix b; however, I have followed along with examples on other posts and this does not change the problem.

    I am feeling a bit frustrated because when I use the doLCA options in RStudio, these problems either do not occur or are at least much easier to fix, and the output is very similar. However, I am concerned about how my findings would be received if I reported results from a model that did not achieve convergence.

    If I use ologit, the errors do not occur; however, as my variables are binary, I know this is not correct.

    I appreciate your guidance and support.

  • #2
    If you're familiar with ordinary logistic regression, what's the maximum likelihood estimate of the logit when a probability is zero or 1? It's +/- infinity. In logistic regression, Stata and other software will remove cases where there's complete determination, i.e. if some combination of predictors has only 0s or only 1s on the outcome, the offending cases get dropped entirely and you get a warning message.

    What does that have to do with LCA? Well, it's likely that you have multiple response probabilities going towards zero or one. Those correspond to logit intercepts of +/- 15 (or larger in absolute value). Take the inverse logit of +/- 15; you'll see that it corresponds to a probability very, very near 1 or 0. If this doesn't make sense to you, try reading this chapter by Kathryn Masyn. Using the difficult option doesn't do anything here. Using different start value options doesn't fix this either.
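
    To see this for yourself, here is a minimal sketch using Stata's built-in invlogit() and logit() functions. Nothing in it depends on your data. The first two lines should return values of roughly 0.9999997 and 0.0000003, and the last line shows the reverse direction (0.9999997 is just an illustrative probability I picked).

    ```stata
    * Inverse logit of +/- 15: probabilities practically indistinguishable from 1 and 0
    display invlogit(15)
    display invlogit(-15)

    * The reverse: a probability this close to 1 maps to a logit of about 15
    display logit(0.9999997)
    ```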

    What you should do is: start with the nonrtolerance option to enumerate the model. This disables one of the two convergence criteria that Stata imposes. It will allow the optimizer to declare convergence even if one or more logit intercepts are around +/- 15. You then constrain the affected intercepts at +/- 15 and re-run the model without the nonrtolerance option.
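
    As a rough sketch of that two-step workflow, using your variable names: the choice of variable_3 in class 2 as the runaway intercept is purely hypothetical, and you would substitute whichever intercepts actually sit near +/- 15 in your own output. I have also moved logit to a global option in the second call so that it applies to both path specifications; treat the exact layout as an assumption to check against the gsem documentation.

    ```stata
    * Step 1: fit with nonrtolerance so the optimizer can declare convergence
    * even if some logit intercepts drift toward +/- 15
    gsem (variable_1 variable_2 variable_3 variable_4 variable_5 variable_6 <- , logit), ///
        lclass(C 2) nonrtolerance

    * Look at the class-specific response probabilities and the intercepts
    * in the gsem output to spot values near +/- 15
    estat lcmean

    * Step 2 (hypothetical case): the intercept of variable_3 in class 2 ran
    * off toward +15, so fix it there and refit WITHOUT nonrtolerance
    gsem (variable_1 variable_2 variable_3 variable_4 variable_5 variable_6 <- ) ///
         (2: variable_3 <- _cons@15), ///
        logit lclass(C 2)
    ```

    If the second model converges normally, that is the one to report, along with a note on which intercepts were constrained.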

    Other software packages probably don't use the same convergence criteria that Stata does. Given that you're already familiar with R, you could just use the poLCA package (I assume you meant this rather than doLCA, which I don't think is a package). Stata is admittedly a bit clunky in this regard. As far as I know (and I've tested this a bit), poLCA automatically constrains indicators if they approach +/- 15. That said, if too many indicators need to be constrained to +/- 15, that's a sign that you are trying to over-extract latent classes (i.e. drop that solution and go to one fewer latent class). How many is "too many"? There's no defined rule. Use your best judgment as a statistician. If you trawl through my posts, you'll find that I ran into one paper where one latent class had half of the indicators constrained at + or - 15, that class was the smallest one identified, and it also seemed to be internally inconsistent. I'd have rejected that solution. This is probably a rare scenario, but it depends on your data.

    Speaking of start values, you should be using them anyway, with multiple draws. jitter should not be relevant for binary indicators. Aside from that, I don't know that there's really a difference which option you choose, and I just default to randomid.
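
    For example, something along these lines would run the model from several random starting assignments and keep the best one. The number of draws (20) and the seed (12345) are arbitrary values I made up for illustration; pick your own, and set a seed so the run is reproducible.

    ```stata
    * Random class assignments as start values, repeated over 20 draws;
    * draws(20) and seed(12345) are illustrative choices, not recommendations
    gsem (variable_1 variable_2 variable_3 variable_4 variable_5 variable_6 <- , logit), ///
        lclass(C 2) startvalues(randomid, draws(20) seed(12345))
    ```
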
    Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

    When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
