  • Not achieving convergence in Latent Class Analysis

    Hi all,

    I'm exploring an LCA model with a household survey we've run. It looks at the activities that citizens do in their day-to-day lives (e.g. sports they play, technology they have access to). Each question has ordinal answers, as respondents answered on a Likert-type scale (e.g. 'I run every day', 'I run a few times a week', 'I run every month', 'I don't run').

    I think there's value in exploring this data through an LCA approach to see if there is some framing we can provide beyond analysing the 2,000 responses in the aggregate. I've just started on this, using the code below:

    Code:
    . gsem (Internet Phone Skilllevel Ownbusiness Gender <-), ologit lclass(C 3)
    And received this error:

    Code:
     convergence not achieved
    Any thoughts or feedback would be welcome. I've included a data excerpt below. Thank you!

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int Respondent byte(COVID19impact Internetaccess Phonepackage Reading Cycling Running Kayaking Skating Surfing Basketball Football Soccer Baseball Hiking Skilllevel Ownbusiness) int Age byte(Gender Education Urban)
      89 2 0 0 0 0 0 0 0 1 0 0 0 0 0 2 0 70 3 6 0
     164 3 0 0 1 1 1 0 0 1 1 1 0 0 1 3 1 61 2 9 2
     754 2 0 1 1 0 1 0 0 1 1 0 0 0 0 0 0 59 3 2 2
     763 3 0 1 1 0 1 1 1 1 1 1 0 1 0 1 0 37 3 2 3
     181 3 0 1 0 1 1 1 1 1 1 1 0 0 0 1 1 62 3 6 3
     657 3 0 1 0 1 1 0 0 1 1 1 0 1 0 1 1 64 3 4 2
     221 2 0 1 0 0 0 0 0 1 0 0 0 1 0 1 1 50 3 2 2
    1025 3 0 1 0 1 1 0 0 1 1 1 0 0 0 1 2 33 3 4 3
    1215 3 0 1 0 0 1 0 1 1 0 1 0 0 1 2 0 21 3 4 0
     769 3 0 1 0 1 1 1 1 1 1 0 0 0 0 2 0 31 3 5 1
     190 1 0 1 0 1 0 0 0 0 0 0 0 0 0 2 0 61 3 8 0
     267 3 0 1 0 1 1 0 0 1 1 1 0 1 0 2 0 40 3 2 2
     276 3 0 1 0 1 0 0 0 1 1 0 0 0 1 2 1 36 0 8 1
     384 2 0 1 0 1 1 1 1 1 1 1 1 0 0 2 1 56 2 4 3
     397 2 0 1 0 0 1 0 0 0 0 0 0 0 1 2 1 50 2 4 0
     544 3 0 1 0 0 1 1 1 0 1 0 0 1 0 2 1 57 3 2 2
     641 1 0 1 0 0 0 0 0 1 0 0 0 0 0 2 1 70 3 3 0
     377 3 0 1 0 1 1 0 0 1 1 1 0 1 0 2 1 26 3 2 2
     309 3 0 1 0 1 1 1 1 1 0 1 0 1 1 2 1 33 3 4 1
     102 2 0 1 0 0 0 0 0 0 0 0 0 0 1 2 1 20 3 9 1
    1136 3 0 1 1 1 1 1 1 1 1 1 0 0 1 2 1 23 3 9 0
     903 1 0 1 0 0 0 0 0 0 1 1 0 1 0 2 1 50 3 9 0
     235 0 0 1 0 0 0 0 0 1 0 0 0 0 0 2 1 46 3 8 1
     183 2 0 1 0 1 0 0 0 0 0 0 0 0 0 2 2 59 2 3 1
    1278 3 0 1 0 1 1 0 0 1 1 0 0 0 0 2 2 67 2 3 1
     307 3 0 1 0 0 1 1 1 1 1 1 0 1 1 2 2 49 3 8 2
     897 3 0 1 1 1 1 1 1 1 1 1 0 1 1 3 0 20 0 4 0
     418 2 0 1 0 0 1 0 0 0 0 0 0 0 0 3 0 23 0 9 0
    1076 3 0 1 0 1 1 0 0 1 0 0 0 0 0 3 0 67 2 5 3
     365 2 0 1 0 0 1 1 0 0 0 1 0 0 0 3 0 23 2 4 2
     391 2 0 1 1 1 1 1 1 0 1 0 0 0 1 3 0 52 2 4 2
    1113 2 0 1 1 1 1 1 1 1 1 1 1 1 1 3 0 21 2 2 0
    1184 2 0 1 0 1 1 0 0 1 1 1 0 1 0 3 0 26 3 9 2
    1269 3 0 1 0 1 1 0 0 1 1 1 0 0 0 3 0 23 3 4 2
      65 3 0 1 0 1 1 1 1 1 1 1 0 0 1 3 0 39 3 5 2
     131 3 0 1 1 1 1 1 1 1 1 1 1 0 1 3 0 52 3 6 3
    1045 3 0 1 1 0 1 0 0 1 0 1 0 1 0 3 0 23 3 4 0
     717 3 0 1 1 0 1 1 1 1 0 1 0 1 1 3 0 18 3 2 2
     726 3 0 1 0 1 1 1 1 1 0 1 0 0 0 3 0 44 3 3 1
     996 3 0 1 1 1 1 0 1 1 1 0 0 0 0 3 1 45 2 8 3
    1262 3 0 1 0 0 0 0 0 0 0 1 0 0 0 3 1 22 2 4 1
     500 3 0 1 1 1 1 0 0 1 1 1 0 0 1 3 1 43 3 2 0
    1255 3 0 1 1 1 1 0 0 1 1 1 0 0 1 3 1 45 3 2 2
     472 3 0 1 1 1 1 1 1 1 1 1 1 0 0 3 1 59 3 2 2
     312 3 0 1 0 0 1 0 1 1 0 1 0 1 0 3 1 48 3 3 3
     925 2 0 1 1 0 1 1 1 0 0 0 0 1 1 3 1 30 3 8 2
     841 3 0 1 1 1 1 0 1 1 0 1 1 0 0 3 1 26 3 3 2
    1284 2 0 1 1 0 1 1 1 1 1 1 0 0 1 3 1 23 3 4 0
     330 3 0 1 0 0 0 0 0 0 0 0 0 0 1 3 1 34 3 2 3
     367 3 0 1 0 0 1 1 1 1 1 1 0 1 0 3 1 29 3 2 3
     980 3 0 1 1 1 1 1 1 1 0 1 0 0 1 3 2 52 3 5 2
    1250 3 0 1 1 1 1 1 1 1 0 0 0 0 0 3 2 34 3 4 2
     341 1 0 1 0 1 1 0 0 1 0 0 0 0 0 3 2 63 3 6 3
     253 2 0 1 0 1 1 0 1 1 1 1 1 0 0 3 3 48 3 5 1
     479 3 0 2 0 0 1 0 0 0 0 0 0 0 0 0 1 54 3 2 3
     757 3 0 2 0 0 1 1 1 0 1 1 0 0 0 1 0 55 0 9 2
    1311 3 0 2 1 1 1 0 0 1 1 1 0 0 0 1 0 54 3 4 2
     439 3 0 2 0 0 0 0 0 1 0 1 0 0 0 1 0 69 3 6 1
    1109 1 0 2 0 0 0 0 0 0 1 0 0 0 0 1 1 65 0 9 0
     378 2 0 2 0 0 1 0 0 0 0 0 0 0 0 1 1 76 2 2 0
     557 1 0 2 0 0 1 1 0 0 0 1 0 0 0 1 1 48 2 2 2
     578 2 0 2 1 1 0 1 1 1 1 1 0 0 1 2 0 48 2 3 3
     394 2 0 2 0 0 0 0 0 1 1 1 0 0 0 2 0 70 2 5 1
     579 3 0 2 0 0 1 1 0 1 1 1 0 1 0 2 0 45 3 4 2
    1168 3 0 2 0 1 1 0 0 1 1 1 0 0 1 2 0 61 3 5 3
     547 3 0 2 0 0 0 0 0 0 0 0 0 0 1 2 0 64 3 8 2
    1103 3 0 2 0 0 1 0 0 1 1 0 0 0 1 2 0 43 3 6 3
     698 2 0 2 1 1 1 0 0 1 1 1 0 1 0 2 0 55 3 4 1
     430 3 0 2 0 1 1 1 0 1 1 1 1 0 0 2 0 57 3 2 0
     619 3 0 2 1 1 1 1 1 1 0 1 0 0 0 2 0 69 3 4 2
    1087 3 0 2 0 0 0 0 0 0 0 0 0 0 1 2 0 39 3 6 3
    1241 3 0 2 1 1 1 1 0 1 1 1 0 0 0 2 0 59 3 4 1
     502 3 0 2 0 1 1 0 0 1 1 0 0 0 0 2 0 36 3 2 2
     401 3 0 2 0 0 1 1 0 1 1 1 1 0 1 2 0 48 3 3 1
    1181 3 0 2 1 1 1 0 0 1 0 1 0 0 0 2 0 52 3 2 0
     420 3 0 2 0 0 0 1 0 1 1 1 0 0 0 2 1 57 2 4 0
     353 3 0 2 0 0 1 0 0 0 0 0 0 0 0 2 1 59 2 9 0
     871 3 0 2 1 1 1 1 1 1 1 0 0 0 0 2 1 44 2 6 1
     118 3 0 2 1 1 1 1 1 1 1 1 1 1 0 2 1 62 2 2 2
     513 2 0 2 1 1 1 1 1 1 1 1 0 0 0 2 1 50 3 4 3
     703 2 0 2 1 1 1 1 1 1 1 1 1 1 1 2 1 43 3 6 0
    1052 3 0 2 0 0 0 0 0 0 0 0 0 0 1 2 1 22 3 2 1
     902 2 0 2 1 1 1 1 1 1 1 1 1 1 1 2 1 60 3 6 2
     753 3 0 2 0 0 1 0 0 1 1 0 0 0 0 2 1 56 3 2 2
     847 2 0 2 0 0 1 0 0 1 1 1 0 0 0 2 1 49 3 1 2
     151 3 0 2 0 1 0 0 0 0 1 1 0 1 0 2 1 50 3 8 1
     179 3 0 2 0 0 0 0 1 1 0 1 0 1 0 2 1 60 3 3 2
    1308 3 0 2 1 0 0 1 1 1 1 1 0 0 1 2 1 40 3 2 3
    1194 3 0 2 0 0 1 1 0 0 1 1 1 1 0 2 2 41 3 4 2
    1135 3 0 2 1 1 1 0 1 1 0 0 0 0 0 3 0 39 2 3 1
     419 3 0 2 0 1 1 1 1 1 1 1 1 0 0 3 0 63 2 4 3
    1231 2 0 2 1 1 1 1 1 1 1 1 1 1 1 3 0 22 2 8 2
     994 3 0 2 0 0 1 1 1 1 0 1 0 1 0 3 0 28 2 2 2
     822 3 0 2 0 0 1 1 1 1 0 1 0 1 0 3 0 23 2 5 2
     740 3 0 2 0 1 1 0 1 1 1 1 1 0 1 3 0 42 2 4 3
    1046 2 0 2 0 0 1 0 0 0 1 1 0 0 0 3 0 71 2 1 0
    1301 3 0 2 0 0 1 1 1 1 1 1 0 1 0 3 0 20 2 8 2
     316 3 0 2 1 1 0 1 0 1 0 1 0 1 0 3 0 69 3 8 1
     632 3 0 2 1 1 0 0 0 0 0 0 0 0 0 3 0 50 3 5 3
     264 3 0 2 0 1 0 0 0 1 1 1 0 0 0 3 0 72 3 2 2
    end

  • #2

    Hi, convergence problems are not uncommon with LCA. The first thing you could try is to begin the iterations for the 3-class solution from the coefficients of the 2-class solution (if it converges):

    Code:
    gsem (Internet Phone Skilllevel Ownbusiness Gender <-), ologit lclass(C 2)
    matrix B = e(b)
    gsem (Internet Phone Skilllevel Ownbusiness Gender <-), ologit lclass(C 3) from(B, skip)
    Other options for the selection of starting values can be found in the gsem entry of the Stata manual: https://www.stata.com/manuals/semgse...imationoptions The startvalues() option lets you do an initial search for good starting values for the ML. You can also add the difficult option if convergence problems persist. I hope this helps.
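
    For instance, the startvalues() search mentioned above can be requested along these lines (a sketch; I have the randompr method with the draws() and seed() suboptions in mind, as described in the gsem estimation options entry):

    Code:
    * try 20 sets of random class-probability start values and keep the best
    gsem (Internet Phone Skilllevel Ownbusiness Gender <-), ologit lclass(C 3) ///
        startvalues(randompr, draws(20) seed(12345))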



    • #3
      I've got some advice more specific to this situation. It will make more sense if you are familiar with the issue of complete separation in regular logistic regression, also known as complete determination. This is the issue where one value of the predictor is associated with only one of the outcomes. In logistic regression, Stata will drop the predictor variable entirely. The issue can also occur in ordinal or multinomial logistic regressions. Aside from dropping the predictor or collapsing values of the predictor, or maybe (less preferred) collapsing some values of the outcome, people have used penalized (i.e. Firth) regression and Bayesian regression (the prior will prevent the estimate of the offending parameter from wandering off to infinity).

      LCA, Binary outcomes, and convergence issues
      In latent class analysis with binary outcomes (I realize that yours are ordinal, and I'll get there), you also get issues with separation that prevent convergence. Say one of the latent classes has a prevalence of 0 or 1 on one indicator. The logit intercept is trying to wander off to - or + infinity. You'll see the log likelihood hit a ceiling with the message "not concave", and then it will just keep iterating until the maximum number of iterations (which is now set at 300), but Stata won't declare convergence.

      In this case, it is acceptable to constrain the offending intercepts at - or + 15 respectively, and to note (and report, if submitting a poster or a paper) how many parameters were constrained. Too many parameters constrained that way should be taken as a sign that you're trying to extract too many latent classes (i.e. discard this model, go back to the previous one). MPlus appears to do this automatically. I'll detail how to do this at the end of the response. Now, how many constraints is acceptable is a judgment call, there's no hard guideline. I haven't seen any LCA papers that report constraints to date, although I don't exactly read an enormous number of them. Anyway, I assume this situation is rare.

      Ordinal outcomes
      You have an additional complication: you're treating the outcomes as ordered logistic. Now, you should tabulate your dataset first. I don't know if all ordinal variables have the same number of response categories. However, you have sparse responses in some categories. In your sample data, for phone access, I see 2 zeroes, 52 ones, 46 twos, and no threes. Under skill level, I believe the distribution is 2, 11, 47, and 40.
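
      To check for sparse categories, one-way tabulations of each candidate indicator are a quick start (a sketch using variable names from the data excerpt):

      Code:
      * frequency of each response category, including missing values
      tab1 Internetaccess Phonepackage Skilllevel Ownbusiness Gender, missing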

      Basically, you have the same problem as with binary indicators in the paragraph above, only you have more categories. While we normally prefer not to dichotomize categorical variables, I would consider it here. In your full data, if a variable has a really skewed distribution (e.g. most people say none, some say some, a handful say a lot), I think I would dichotomize it. This is one paper where we dichotomized a 4-category variable at the midpoint (disclosure, I'm 3rd author; that distribution was skewed the other way with most people saying high or very high for all the questions, and extremely few people in the lowest category). No reviewers protested this choice.
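
      If you do decide to dichotomize, it can be as simple as the following (the cutpoint here is hypothetical; pick one that makes substantive sense for your distribution):

      Code:
      * collapse a 0-3 item into a binary indicator, preserving missing values
      generate byte Phone_bin = Phonepackage >= 2 if !missing(Phonepackage)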

      I assume you started your model fitting with two latent classes, and the model converged then - it is an option to report only two latent classes identified if you must keep the variables as ordinal.

      I don't have experience issuing constraints for ordinal logistic indicators. I can say this: you could consider constraining the cut points appropriately. For an ordinal logistic regression with 4 categories that are coded 0 to 3, Stata will estimate 3 cutpoints. The first one is for the odds of responding 1 or higher vs. lower than 1. The last one is for the odds of responding 3 or higher vs lower - basically, that means responding a 3.

      Examine the output from your model - which cutpoints have missing standard errors? I have a feeling that for the right-skewed variables, you might be able to constrain just the top cutpoint. Or maybe the top two cutpoints. If you need to constrain several cutpoints per variable in one or more latent classes, that's further impetus to dichotomize.

      A specific note on gender: I am not sure that I would use gender as an indicator of the latent class. It's probably more like a predictor of class membership. That is, the latent class causes what I assume is a set of responses to various interests and hobbies. Gender is likely associated with the latent class, but I would argue for leaving it out of the LCA modeling.
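
      If you do move gender from indicator to predictor of class membership, the gsem syntax looks roughly like this (a sketch; the variable names are assumed from the data excerpt):

      Code:
      * Gender predicts membership in latent class C instead of loading on it
      gsem (Internetaccess Phonepackage Skilllevel Ownbusiness <-, ologit) ///
          (C <- i.Gender), lclass(C 3)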

      If you are dead set on including gender as an indicator of the latent class (i.e. the latent class causes responses to gender), which I do not recommend, note that almost all the respondents are 2s and 3s, with a handful of 0s and no 1s. I would definitely not treat it as ordered logistic; it would be better treated as unordered (multinomial) logit. You probably only have a handful of nonbinary gender responses, which again will cause convergence trouble.

      How to constrain
      As outlined in this post, you can turn off Stata's secondary convergence criterion. This will allow Stata to declare convergence if a logit intercept exceeds + or -15. However, after you declare whatever constraints are necessary, you should then save the parameter estimates as start values, and estimate the model with the secondary convergence criterion back on. Some pseudocode:

      Code:
      gsem ( ... <-, ologit), nonrtolerance
      * examine your output, figure out what constraints are needed, issue them
      matrix b = e(b)
      gsem ( ... <-, ologit), from(b) constraint(1 2 ...)
      Here is one example where someone ran an LCA model on ordered data. It did converge, but the first cutpoint in class #2 was -24. I think that corresponds to essentially nobody endorsing the lowest response option in that class. If you go the route of constraining cutpoints, you should examine your output. I don't have experience actually doing LCA on ordered data, so I am guessing as to the syntax. You can type

      Code:
      gsem, coeflegend
      after a model to replay your estimates but to report the symbolic names for each coefficient. You would then have to use the constraint define syntax to issue each constraint. I don't believe you can issue a symbolic constraint in the gsem command for ordered logit cutpoints, whereas you can for logistic intercepts.
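
      For example, if coeflegend reports a cutpoint with a symbolic name like _b[/Phonepackage:2.C#cut3] (a hypothetical name; use whatever coeflegend actually shows for your model), the constraint would be:

      Code:
      * pin class 2's top cutpoint for Phonepackage at 15, i.e. a near-zero
      * probability of the top category in that class
      constraint define 1 _b[/Phonepackage:2.C#cut3] = 15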

      Miscellany
      I believe that the difficult option that Gio discussed does not actually work on models with categorical latent variables (i.e. on LCA or FMM). I think one of the StataCorp people said this on the forum, but I can't recall when.

      Starting values are another issue. With higher numbers of latent classes, it's generally recommended to run many sets of randomly selected starting values, which does not happen under Stata's default settings. There is more explanation here. In my experience, it doesn't seem to be necessary for 2- and 3-class models; I usually do it anyway because otherwise I would have to explain to reviewers why I did not.
      Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

      When presenting code or results, please use code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



      • #4
        Hi both,

        This is incredibly helpful guidance! Apologies for the brevity of the response, but I'd like to spend some more time reading-up here. Thank you so much for the very considered, and thoughtful, responses!

        Calum



        • #5
          I was running a few simulations on the issue. Here's a simulation of 3 latent classes (coded as 0, 1, and 2) and 5 ordinal indicators, also coded 0, 1, and 2. The heuristic you can use is that the indicators are questions that vary in difficulty from least to most difficult, with 0 representing wrong, 1 partially correct, and 2 fully correct.

          The 3 latent classes vary in ability. Q5 is so hard that group 1 has a zero probability of getting it completely right.

          The way to read the probabilities may be a bit counterintuitive. irecode(x, c1, c2) recodes x using the cutpoints given after the first argument: it returns 0 if x <= c1, 1 if c1 < x <= c2, and 2 if x > c2. Here the first argument is just a uniform random draw, so the first cutpoint is the probability of the minimum score, the difference between the two cutpoints is the probability of an intermediate score, and 1 minus the last cutpoint is the probability of the maximum score. Needless to say, you can specify any number of cutpoints.
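
          A quick empirical check of that reading (a sketch; the shares are approximate, not exact):

          Code:
          * with cutpoints .2 and .5, expect roughly 20% zeros, 30% ones, 50% twos
          clear
          set obs 100000
          set seed 1
          generate byte y = irecode(runiform(), .2, .5)
          tabulate y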

          (NB: If you have this situation, you should ask yourself why you are not fitting an IRT model. A reasonable answer is that you think the items might not necessarily identify a unidimensional trait and/or that some people have atypical response sets. I chose this setup to ease understanding.)

          Code:
          clear
          set obs 3000
          set seed 4142
          gen trueclass = irecode(runiform(), 1/3, 2/3)
          * i.e. there's a 1/3 probability of belonging to each class; read the line above as: code trueclass as 0 if the draw is below 1/3, 1 if it is between 1/3 and 2/3, and 2 if it is above 2/3
          * with this seed, I got class proportions that are pretty close to equal.
          * (the seed must come first so the whole simulation is reproducible)
          forvalues k = 1/5 {
          generate q`k' = 0
          }
          
          *Class 1, low ability
          replace q1 = irecode(runiform(), .2, .4) if trueclass == 0
          replace q2 = irecode(runiform(), .3, .6) if trueclass == 0
          replace q3 = irecode(runiform(), .4, .7) if trueclass == 0
          replace q4 = irecode(runiform(), .6, .8) if trueclass == 0
          replace q5 = irecode(runiform(), .8) if trueclass == 0
          *That is, 80% of class 1 get Q1 wrong, 20% get it partially correct, and 0% get it correct.
          
          *Class 2, medium ability
          replace q1 = irecode(runiform(), .1, .2) if trueclass == 1
          replace q2 = irecode(runiform(), .2, .4) if trueclass == 1
          replace q3 = irecode(runiform(), .2, .4) if trueclass == 1
          replace q4 = irecode(runiform(), .3, .6) if trueclass == 1
          replace q5 = irecode(runiform(), .2, .7) if trueclass == 1
          
          *Class 3, high ability
          replace q1 = irecode(runiform(), .05, .1) if trueclass == 2
          replace q2 = irecode(runiform(), .1, .3) if trueclass == 2
          replace q3 = irecode(runiform(), .2, .3) if trueclass == 2
          replace q4 = irecode(runiform(), .2, .4) if trueclass == 2
          replace q5 = irecode(runiform(), .3, .4) if trueclass == 2
          
          gsem (q* <-), ologit lclass(C 3) byparm iterate(100) startvalues(randomid)
          est store lca3_v1
          estat lcmean, nose
          Note that the startvalues(randomid) option is necessary with this seed (or it was for me); otherwise the default start values are implausible. I can't post output because my Stata copy is on a remote, secure server. This model doesn't converge. I think that models simulated with one class having a 0 probability of something may still converge in some (many?) instances; I observed at least one converged model while fiddling with the probabilities. That model identified the probability of class 1 (the low-ability class) getting Q5 fully correct as something like 1%, which is not true but is pretty close to the simulated truth. The model above, however, didn't converge. Note that if you try this, there is no guarantee that the latent classes match the order of the true classes. With this seed on my computer, latent class 3 is the low-ability one, and class 1 is the high-ability one.

          Remember that you can fit a model, then replay the estimates with the coeflegend option to get Stata to tell you the symbolic names of the coefficients.

          If you examine your output, you'll see that cut 2 for latent class 3 was around 4.4 when the model hit 100 iterations. The iteration log was at a ceiling, but the maximizer reported that it was backed up and refused to declare convergence with the normal criteria. Let's go ahead and issue the constraint I alluded to earlier: the top cutpoint for class 3 gets constrained at 15. This sets the probability of latent class 3 scoring a 2 on Q5 (the hardest question) to about 3.06 * 10^-7, zero for practical purposes.

          Code:
          constraint 1 _b[/q5:3.C#cut2] = 15
          estimates restore lca3_v1
          matrix b = e(b)
          gsem (q* <-), ologit lclass(C 3) byparm iterate(100) from(b) constraint(1) nonrtolerance
          estat lcmean, nose
          If you run that code, you'll see the model converges (without the secondary convergence criterion). I haven't inspected every class in detail, but the recovered predicted probabilities for class 3 look to be roughly the parameters we simulated. However, the estimated class membership probabilities are way off.

          If you eliminate the nonrtolerance option on this simulated dataset, though, the maximizer will refuse to converge, reporting that the likelihood is backed up. I am not sure exactly what that means. To be honest, this simulated dataset intentionally steered the model towards something it's not designed to handle, so it is maybe not surprising that it doesn't work well.

          Note that I am simulating a dataset where I know the probability of a positive response to one of the indicators is zero in one latent class. That doesn't guarantee that the maximum likelihood estimate of the offending parameter will be infinite and thus need some sort of constraint. Also, it does not exactly correspond to the situation in Calum's original post. However, from his data sample, it seems pretty likely that the top or bottom categories in the full dataset are very sparsely populated. And those are the marginal probabilities, before we have even started splitting things up into latent classes.

          Essentially, this is how you would constrain parameters in an LCA with ordinal indicators. I can't guarantee that this is a good idea.

          I realize this is not an R forum, but I will say that the R package poLCA (Drew Linzer) can handle both binary and categorical indicators (I think it may only model these as multinomial), and it converges fairly fast. When I simulated data using identical parameters, it showed similar problems to Stata: it returned an error message saying that the maximum likelihood estimate wasn't found (even though I increased the maximum iterations to 5,000), and its estimated class proportions were far off (they should have been asymptotically equal). I didn't need to turn tolerance criteria off, and this package doesn't accept user-defined constraints at all. It may be worth exploring whether alternative packages converge where Stata doesn't, and I have a feeling some may be faster than Stata. However, when you have many items with sparse data at the ends of the distribution to begin with, I don't think anything will save you.



          • #6
            This is a general post on the issue of deciding when to constrain parameters. I am not linking the article because I take issue with some of the methods, and I'm not out to name and shame people. In any case, it's a 2020 publication. I count 10 total indicators, which I believe are all treated as binary (meaning some were recoded to binary from 3 or more categories, so yes, that's more evidence that people do this). N = 1,836. After going through this a bit, I think a relatively small sample like that may increase the impetus to recode some categories as binary, but use your discretion.

            The latent classes shown in a reproduction of their figure 1 represent risk profiles of social determinants of health among Black youth. In latent class #1, limited access to protective factors (3% of the sample, so maybe around 55 effective observations), the authors report 0 or near-0 probabilities for 6 of the 10 indicators. The article did not mention whether any logit intercepts were constrained. I would argue for disclosing this in the publication, even though a sharp-eyed reviewer can see it from the profile plot or the probability table. Additionally, one logit intercept in a different latent class may also have been constrained.




            [Image: reproduction of the article's figure 1 (10.1177_0095798420930932-fig1.jpeg)]


            Masyn's textbook chapter cautions that too many constrained intercepts may be a sign that you're trying to extract too many latent classes. In addition, while I haven't mentioned this before, she asks analysts to review the model-estimated proportion of each latent class (controlled by the multinomial intercepts near the top of the results table in Stata). If one class has too low a proportion (especially if the SE is missing for that class's intercept), that may also be a sign you're trying to extract too many latent classes. In plain language, disregard the current model and treat it as if it didn't converge.
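
            Incidentally, after a gsem fit with lclass(), Stata will display those model-estimated class proportions directly:

            Code:
            * marginal predicted probabilities of latent class membership
            estat lcprob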

            With respect to the authors, if they really did constrain 6 parameters, I would have dropped the 4-class solution. Furthermore, that latent class is very small, so now you're uncertain whether the class-specific response probabilities are sound or subject to sampling error. For example, they characterized this class as having limited access to protective factors. It makes sense that they would report poor food sufficiency, school safety, etc. Except, how come that latent class reports high neighborhood safety and high access to after-school activities? That doesn't really make intuitive sense to me. It would make more sense if they were low in everything. (NB: I am pretty sure the solid black line is right at 0, but I suppose it's possible that the model converged without issue around 1% or less; 1 out of 55 respondents would be a probability of 1.8%. They don't list exact probabilities anywhere in tables or appendices, unlike one article I referenced previously.)

            Basically, if you have a few parameters constrained, I would strongly prefer you say that in the article and warn readers about the too many constraints issue, but I don't see a huge fundamental problem just yet. Whatever the threshold of how many constraints is too many, I think this article crossed it. Say their sample size doubled - that reduces the sampling error in the smallest class, so the model might converge with a lot fewer 0 probabilities (and thus no constraints).

            In my simulation, I deliberately made the latent classes equal in size just for demonstration purposes. In reality, there's obviously no guarantee that the classes will be equal. In fact, I'd expect that many datasets will have one latent class that's smaller than the others and that is markedly higher or lower on at least some indicators. That class is probably interesting because of that; it's probably not just low or high on everything. (In contrast, the classes that are homogeneously high or low on everything aren't that interesting - I would often expect that sort of thing.) I argued that you need to be aware of sparseness among your indicators (the variables you're feeding into the LCA model). You may also want to think about how large your smallest latent class is, and you might want to prepare for it to be a limiting factor in convergence. Calum reported about 2,000 respondents. If his smallest class is 10% of the sample, that may not be terrible depending on what the variable distribution looks like in that class. Smaller than that is higher risk.
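
            One way to gauge the smallest class after fitting a 3-class model (a sketch; cp1-cp3 and modal are new variable names I'm introducing for illustration):

            Code:
            * posterior probability of membership in each latent class
            predict cp1 cp2 cp3, classposteriorpr
            * assign each respondent to the modal class and tabulate class sizes
            generate byte modal = 1
            replace modal = 2 if cp2 > cp1 & cp2 > cp3
            replace modal = 3 if cp3 > cp1 & cp3 > cp2
            tabulate modal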

            If I were writing the paper I am responding to, I would probably present the results of the 3-class model. If the 4th class's characteristics looked sensible and fewer than 6 intercepts needed constraining, I'd add that a 4-class model did emerge, but that the 4th class was small and we had to constrain a number of parameters. I'd mention the characteristics of that class, and say that a bigger sample would provide better estimates of that class's characteristics.

            If you are reviewing a latent class paper, do generally be aware of these sparseness issues. If you see 0 probabilities in the graphs or tables, ask yourself how many there are. If there are just a couple, I would probably send Masyn's chapter to the authors and ask them nicely to disclose. More constraints = ask yourself and the authors harder questions, including do the characteristics of the latent classes make sense. If you can't tell for sure if there are constraints, ask the authors nicely.

