Estimating a latent class logit with multiple observations per person using gsem or fmm.

Ron Berman

Join Date: Feb 2018

Posts: 2
#1

16 Feb 2018, 15:51

Hi everyone,

I'm trying to estimate a latent class logit model (exactly the one described in the lclogit package) as a finite mixture model, using either gsem or fmm.
I am unsure what the syntax is to tell the estimation that some observations belong to the same individual, and hence, they should be assigned the same latent class.

Here's an example:

Code:

person  y   x
1       0   12
1       1   13
2       0   14
2       1   12
2       0   15

The response variable y is binary. The predictor is x, and some persons may have multiple observations.
(This is not the real data, it is simplified).
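For anyone who wants to run the commands below, here is a minimal sketch that enters the five example rows as a Stata dataset. It assumes only the variable names person, y, and x from the listing above and says nothing about the real data.

```stata
* Enter the simplified example rows (not the real data)
clear
input person y x
1 0 12
1 1 13
2 0 14
2 1 12
2 0 15
end

* Show the rows grouped by person to confirm the repeated observations
list, sepby(person) noobs
```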

I was trying the following syntax:

Code:

gsem (1: y <- x) (2: y <-) (C[person] <- ),logit latent(C) lclass(C 2)

But adding the [person] part does not seem to affect the estimation; I get the same result as with:

Code:

gsem (1: y <- x) (2: y <-) (C <- ),logit latent(C) lclass(C 2)

With fmm, I am not even sure how to include this constraint on the latent variable.

Any idea how this should be operationalized?

Thanks,

- Ron
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#2

17 Feb 2018, 15:50

As we're asked to explain, -lclogit- is a user-written package by Daniele Pacifico which appears to be available on SSC. I am not familiar with that package. That said, by my read of the documentation, it inherently assumes that you have data on multiple choice sets per person, so one can treat choice sets as nested within person. If Stata's latent class command supported random effects, I believe you could perform an equivalent or similar analysis.

I hope someone more knowledgeable will correct me if I am wrong, but sadly, I don't believe that Stata's latent class command supports random effects. Random effects are basically continuous latent variables. The syntax that Rebecca Pope gave for a random effects IRT model (towards the end of her presentation) on simulated data was:

Code:

gsem (Theta@a -> d* H[hospital]@1, logit), ///
    variance(Theta@1) latents(Theta H)

Adapting that to your syntax:

Code:

gsem (1: y <- x, logit) (2: y <-, logit) (<- H[person], gaussian), latent(C) lclass(C 2)

You should get an error message saying that

Code:

option lclass() is not allowed with models specified with continuous latent variables
r(198)

Thus, it appears that the gsem command can't simultaneously estimate one continuous and one categorical latent variable. Before Stata 15, I am pretty sure that the gsem and sem commands could only estimate continuous latent variables. So, if you have those data, I think you are stuck with -lclogit- as written by Pacifico. That said, I could be wrong, and I'd welcome correction from Stata or anyone more knowledgeable. However, when I raised this point in a previous question about multilevel latent class analysis, nobody came forward to correct me.

Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
Ron Berman

Join Date: Feb 2018

Posts: 2
#3

17 Feb 2018, 20:20

Thanks Weiwen!

The issue appears more my inability to tell Stata "all these observations come from the same person", and less the random effect - it would be fine if all people who belonged to the same class would have had the same fixed effect.
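To make the distinction concrete, here is a sketch of the two likelihood contributions in question. The notation is assumed for illustration and is not taken from the lclogit documentation: $i$ indexes persons, $t = 1, \dots, T_i$ indexes a person's observations, $c = 1, \dots, C$ indexes classes, $\pi_c$ is the share of class $c$, $\beta_c$ are the class-specific coefficients, and $\Lambda(z) = 1 / (1 + e^{-z})$ is the logistic function.

```latex
% Person-level mixture: one class per person, shared by all of that person's rows
L_i^{\text{person}} = \sum_{c=1}^{C} \pi_c \prod_{t=1}^{T_i}
    \Lambda(x_{it}'\beta_c)^{y_{it}} \left[ 1 - \Lambda(x_{it}'\beta_c) \right]^{1 - y_{it}}

% Observation-level mixture: every row is mixed over classes separately
L_i^{\text{obs}} = \prod_{t=1}^{T_i} \sum_{c=1}^{C} \pi_c \,
    \Lambda(x_{it}'\beta_c)^{y_{it}} \left[ 1 - \Lambda(x_{it}'\beta_c) \right]^{1 - y_{it}}
```

The two expressions coincide when every person contributes a single row. With repeated rows, only the first forces all of a person's observations into the same class.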

I ended up using R's "gnml" which does it pretty quickly. It doesn't look like Stata supports it yet, or maybe I didn't look in the right place.
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#4

18 Feb 2018, 07:13

Originally posted by Ron Berman

Thanks Weiwen!

The issue appears more my inability to tell Stata "all these observations come from the same person", and less the random effect - it would be fine if all people who belonged to the same class would have had the same fixed effect.

I ended up using R's "gnml" which does it pretty quickly. It doesn't look like Stata supports it yet, or maybe I didn't look in the right place.

For my edification, can you clarify which R package? There doesn't appear to be a "gnml" package. There's a package called "gnlm", which is for generalized nonlinear regression, but I assume that's not the one you used.

And actually, if you are fine with including a person fixed effect, then read my link again. That sounds like it would be equivalent to a latent class regression, where you put the person fixed effect in the multinomial equation (not the latent class bit). That is doable in principle! If there are a lot of people, Stata would probably struggle to estimate it, though.
