Rank-ordered latent class model (x-post from Reddit)

Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#1


04 May 2024, 12:07

I'm not the person who asked the question. But this post from Reddit is interesting.

Hi everyone,

I'm an economics PhD student looking for help estimating an ordered latent-class model.

My dependent variable is a ranking carried out by survey respondents; they were required to rank 4 items in descending order of preference (Chapman and Staelin, 1982). But I think there's some heterogeneity in ranking capabilities. I found 2 papers that discuss this issue, and they advise estimating a latent-class rank-ordered model.

In latent class analysis, the latent class is an un-ordered categorical latent variable. If you looked through the LCA manual and saw a bunch of stuff about multinomial models and you wondered what that had to do with anything, that's why.

I believe that the poster was asking about estimating an ordered categorical latent variable model. Possibly an interesting question, I think I've heard these mentioned before. But there is no way to fit these models in Stata. I have no idea what software can fit them, but there is literature referencing them.

As an alternative, I would propose that the OP fit a latent class model. That's right, just treat the classes as un-ordered.

If you take a healthcare symptom scale and you fit an LCA model, it is virtually certain that most of your classes will look something like: one class is low in everything, one class is moderate in everything, one class is high in everything. There could be 2 or 3 of those classes, or there might be more. But those symptom scales were developed with the implication that the underlying latent variable is continuous. That is, you'd expect that some people are low, some are moderate, some are high. In the usual application, people are looking for the unusual classes, which you might interpret as atypical presentations of the thing being measured. As a made up example, maybe one group of patients assessed for depression were not likely to report dysphoria, but they were likely to report anhedonia, and they otherwise had similar response patterns to everyone else.

If you fit an ordered LCA model, I suspect you might miss the atypical classes if this is what you were looking for. Otherwise, imagine that some deity changed the rules of math such that there's no such thing as ordinal logistic regression, but we still have multinomial. Would it be so bad to fit a multinomial model to an observed variable you know is ordered categorical? Probably not. In LCA, you can subjectively assess if each class is low, medium, or high.
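A rough sketch of that post hoc assessment, in Python rather than Stata: given class-specific item endorsement probabilities from an un-ordered LCA, sort the classes by their average endorsement and then check which items break that ordering. The class labels, item names, numbers, and both function names are made up for illustration (they echo the invented depression example above); they are not output from any fitted model.

```python
def order_classes_by_mean(item_probs):
    """Sort class labels from lowest to highest mean endorsement probability.

    item_probs: dict mapping class label -> dict of item -> probability.
    """
    means = {c: sum(p.values()) / len(p) for c, p in item_probs.items()}
    return sorted(means, key=means.get)


def non_monotone_items(item_probs, ordered):
    """List items whose probability drops somewhere along the class ordering.

    An item that breaks the ordering points at an atypical class, one that
    is not simply "low", "medium", or "high" on everything.
    """
    items = list(item_probs[ordered[0]])
    flagged = []
    for item in items:
        probs = [item_probs[c][item] for c in ordered]
        if any(later < earlier for earlier, later in zip(probs, probs[1:])):
            flagged.append(item)
    return flagged


# Hypothetical 4-class solution: three "severity" classes plus one
# atypical class (low dysphoria, high anhedonia).
example = {
    "class1": {"dysphoria": 0.10, "anhedonia": 0.10, "insomnia": 0.15},
    "class2": {"dysphoria": 0.50, "anhedonia": 0.45, "insomnia": 0.50},
    "class3": {"dysphoria": 0.90, "anhedonia": 0.85, "insomnia": 0.90},
    "class4": {"dysphoria": 0.15, "anhedonia": 0.80, "insomnia": 0.55},
}
ordering = order_classes_by_mean(example)
print(ordering)
print(non_monotone_items(example, ordering))
```

This is only a descriptive check. Sorting by the mean is one arbitrary choice of summary, and a class that breaks the ordering on several items is probably better read as atypical than as sitting somewhere on a low-to-high scale.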

Say your set of indicators is not a set of symptoms from a scale. In this case, you are still likely to have at least one class that's low in most indicators or high in most indicators. Also, if you think there is not an underlying trait that's rank-orderable, then I think you should not be looking for an ordinal LCA.

Alternatively, you can fit an IRT model, which, again, assumes there's some sort of continuous underlying latent construct. Remember, models are approximations and abstractions of reality, and chances are that you can understand whatever phenomenon it is with a continuous latent variable, even if it's an imperfect fit.

Now, the continuous latent variable models assume a normally distributed latent variable. Perhaps you object that you have substantive grounds to think the distribution is not normal. In that case, yes, you're right that IRT is an imperfect fit. There are some people researching unipolar IRT models - disease symptoms would be a good candidate for this type of model! - but they aren't implemented in most software. You could probably fit these models in a Bayesian framework with existing software, though. Of course, you have to go write the likelihood of your assumed distribution. Also, it may not be clear which asymmetric distribution is optimal. Last, for better or worse, there are hundreds of papers applying IRT models to constructs that are probably unipolar, so at worst you will not be the only wrong person.

Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
Hong Il Yoo

Join Date: Jan 2015

Posts: 292
#2

06 May 2024, 07:00

Weiwen Ng: You can estimate the rank-ordered logit model first by exploding the data as summarised in my background paper for -lclogit2- ([link]; see pp. 420-421 and Train's textbook chapter cited therein), and applying the -clogit- command to the exploded data. If you'd like to include a ranking capability parameter, you can apply Arne Risa Hole's -clogithet- to the data, and allow for heteroskedasticity with respect to whether the "pseudo-choice" response that you're looking at refers to first-best, second-best, third-best and so on. The idea is that if you're less certain about your second-best option than your first-best option, the logit model for your second-best pseudo-choice must exhibit greater variance than your first-best pseudo-choice, and this can be modelled as a type of heteroskedasticity.
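For readers who want to see the mechanics, here is a minimal Python sketch of the "exploding" step and the resulting log-likelihood for one respondent. It is not the -clogit- or -clogithet- implementation. The function names, the utility values, and the stage-specific multiplicative scale are assumptions made for illustration, and -clogithet- may parameterize the heteroskedasticity differently.

```python
import math


def explode_ranking(ranking):
    """Turn one ranking (best to worst) into a list of pseudo-choices.

    Each pseudo-choice is (stage, chosen, choice_set): at stage 1 the top
    item is chosen from all items, at stage 2 the second item is chosen
    from the remaining ones, and so on. The last-ranked item yields no
    row, since a choice from a single alternative carries no information.
    """
    rows = []
    for stage in range(len(ranking) - 1):
        rows.append((stage + 1, ranking[stage], list(ranking[stage:])))
    return rows


def ranking_log_likelihood(ranking, utility, scale_by_stage=None):
    """Exploded-logit log-likelihood of one observed ranking.

    utility: dict mapping item -> systematic utility.
    scale_by_stage: optional dict mapping stage -> positive scale factor
        (assumed parameterization). A scale below 1 means a noisier
        pseudo-choice, i.e. greater error variance at that stage.
        Stages not listed use a scale of 1.
    """
    scale_by_stage = scale_by_stage or {}
    ll = 0.0
    for stage, chosen, choice_set in explode_ranking(ranking):
        s = scale_by_stage.get(stage, 1.0)
        denom = sum(math.exp(s * utility[j]) for j in choice_set)
        ll += s * utility[chosen] - math.log(denom)
    return ll


# Hypothetical utilities for 4 items, with noisier later-stage choices.
u = {"A": 1.5, "B": 1.0, "C": 0.5, "D": 0.0}
for row in explode_ranking(["A", "B", "C", "D"]):
    print(row)
print(ranking_log_likelihood(["A", "B", "C", "D"], u, {2: 0.5, 3: 0.25}))
```

With 4 ranked items, each respondent contributes 3 pseudo-choices, which is why a conditional logit on the exploded data reproduces the rank-ordered logit when every stage shares the same scale.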

To account for heterogeneity in ranking capabilities, you can estimate a latent class model where each class has its own -clogithet- parameters. In my 2013 paper with Denise Doiron in the Journal of Health Economics ([link]), we estimate this model, which we call "latent class heteroskedastic rank-ordered logit" (LHROL). The paper also clarifies the sense in which the heteroskedasticity parameter can be seen as a measure of ranking capabilities.
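To make the latent class structure concrete, the following Python sketch computes the probability of an observed ranking as a share-weighted mixture over classes, where each class has its own utilities and stage-specific scales. It is an illustration under assumed parameter values, not the estimator from the paper or the -lclogithet- code. The function names, the two classes, and all numbers are hypothetical.

```python
import math


def rank_prob(ranking, utility, scale_by_stage):
    """Probability of a full ranking under a heteroskedastic exploded logit.

    scale_by_stage maps stage -> positive scale (assumed parameterization);
    stages not listed use a scale of 1.
    """
    prob = 1.0
    for stage in range(len(ranking) - 1):
        s = scale_by_stage.get(stage + 1, 1.0)
        remaining = ranking[stage:]
        denom = sum(math.exp(s * utility[j]) for j in remaining)
        prob *= math.exp(s * utility[ranking[stage]]) / denom
    return prob


def mixture_prob(ranking, classes):
    """Unconditional ranking probability: sum over classes of share * prob.

    classes: list of (share, utility, scale_by_stage), shares summing to 1.
    """
    return sum(share * rank_prob(ranking, u, sc) for share, u, sc in classes)


def posterior_class_probs(ranking, classes):
    """Posterior class membership probabilities given the observed ranking."""
    joint = [share * rank_prob(ranking, u, sc) for share, u, sc in classes]
    total = sum(joint)
    return [j / total for j in joint]


# Two hypothetical classes with the same tastes but different ranking
# capability: class 2 becomes much noisier after the first-best choice.
tastes = {"A": 1.5, "B": 1.0, "C": 0.5, "D": 0.0}
classes = [
    (0.6, tastes, {}),                # consistent rankers
    (0.4, tastes, {2: 0.3, 3: 0.1}),  # noisy beyond the first choice
]
observed = ["A", "B", "C", "D"]
print(mixture_prob(observed, classes))
print(posterior_class_probs(observed, classes))
```

In actual estimation the shares, utilities, and scales would all be free parameters fitted by maximum likelihood (typically via EM or direct optimization), with the log of `mixture_prob` summed over respondents.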

If you're interested in using LHROL, you can download and use the -lclogithet- command, which is available as part of the replication package for my more recent paper with Denise Doiron in the Canadian Journal of Economics ([link]). I haven't uploaded it to SSC because...I somehow have failed to motivate myself to write the help file 😂 (one might say laziness in short). But the command line syntax is very similar to -lclogit2-, and if you're familiar with -lclogit2-, you'll be able to quickly figure out which option does what.

Last edited by Hong Il Yoo; 06 May 2024, 07:05.