
  • Estimating a latent class logit with multiple observations per person using gsem or fmm.

    Hi everyone,

    I'm trying to estimate a latent class logit model (exactly the one as described in the lclogit package) using a finite mixture model either through gsem or fmm.
    I am unsure what the syntax is to tell the estimation that some observations belong to the same individual, and hence, they should be assigned the same latent class.

    Here's an example:
    Code:
     
    person y x
    1 0 12
    1 1 13
    2 0 14
    2 1 12
    2 0 15

    The response variable y is binary. The predictor is x, and some persons may have multiple observations.
    (This is not the real data; it is simplified.)

    I was trying the following syntax:
    Code:
    gsem (1: y <- x) (2: y <-) (C[person] <- ),logit latent(C) lclass(C 2)
    But adding the [person] part does not seem to affect the estimation; it gives the same result as:
    Code:
    gsem (1: y <- x) (2: y <-) (C <- ),logit latent(C) lclass(C 2)
    With fmm, I am not even sure how to include this constraint on the latent variable.
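
To make the constraint concrete, here is a sketch of the two likelihoods involved, written for the simplified binary example above. The notation is assumed for illustration and is not taken from the lclogit documentation: π_c is the share of class c, α_c and β_c are the class-specific intercept and slope, and Λ(z) = 1/(1 + e^{-z}). In the model above, class 2 has an intercept only, so β_2 = 0.

```latex
\begin{align*}
% Person-level mixing: one class draw per person i, shared by all rows t
L_{\text{person}} &= \prod_{i=1}^{N} \sum_{c=1}^{C} \pi_c \prod_{t=1}^{T_i}
  \Lambda(\alpha_c + \beta_c x_{it})^{y_{it}}
  \bigl[1 - \Lambda(\alpha_c + \beta_c x_{it})\bigr]^{1 - y_{it}} \\
% Observation-level mixing: every row (i, t) draws its own class
L_{\text{obs}} &= \prod_{i=1}^{N} \prod_{t=1}^{T_i} \sum_{c=1}^{C} \pi_c \,
  \Lambda(\alpha_c + \beta_c x_{it})^{y_{it}}
  \bigl[1 - \Lambda(\alpha_c + \beta_c x_{it})\bigr]^{1 - y_{it}}
\end{align*}
```

The only difference is whether the sum over classes sits outside or inside the product over a person's rows. What I am after is the first form; if the estimation treats every row as its own unit, it would presumably be maximizing the second.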

    Any idea how this should be operationalized?

    Thanks,

    - Ron

  • #2
    As we're asked to explain, -lclogit- is a user-written package by Daniele Pacifico which appears to be available on SSC. I am not familiar with that package. That said, by my read of the documentation, it inherently assumes that you have data on multiple choice sets per person, so one can treat choice sets as nested within person. If Stata's latent class command supported random effects, I believe you could perform an equivalent or similar analysis.

    I hope someone more knowledgeable will correct me if I am wrong, but sadly, I don't believe that Stata's latent class command supports random effects. Random effects are basically continuous latent variables. The syntax that Rebecca Pope gave for a random effects IRT model (towards the end of her presentation) on simulated data was:

    Code:
     
     gsem (Theta@a -> d* H[hospital]@1, logit),    ///
         variance(Theta@1) latents(Theta H)
    Adapting that to your syntax:

    Code:
     
     gsem (1: y <- x, logit) (2: y <-, logit) (<- H[person], gaussian) latent(C) lclass(C 2)
    You should get an error message saying that
    Code:
    option lclass() is not allowed with models specified with continuous latent variables
    r(198)
    Thus, it appears that the gsem command can't simultaneously estimate one continuous and one categorical latent variable. Before Stata 15, I am pretty sure that the gsem and sem commands could only estimate continuous latent variables. So, if you have those data, I think you are stuck with -lclogit- as written by Pacifico. That said, I could be wrong, and I'd welcome correction from Stata or anyone more knowledgeable, although when I raised this point in a previous question about multilevel latent class analysis, nobody came forward to correct me.
    Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

    When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



    • #3
      Thanks Weiwen!

      The issue appears more my inability to tell Stata "all these observations come from the same person", and less the random effect - it would be fine if all people who belonged to the same class would have had the same fixed effect.
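
For anyone who wants to see what "all these observations come from the same person" changes numerically, below is a minimal Python sketch on the toy data from the first post. It is not Stata code and not an estimator: the class shares and coefficients are made-up values chosen only for illustration, and the function names are hypothetical. It evaluates the log-likelihood once with a single class draw per person and once with a separate draw per row, and computes one posterior class probability vector per person.

```python
import math

# Toy data from the first post: (person, y, x)
rows = [(1, 0, 12), (1, 1, 13), (2, 0, 14), (2, 1, 12), (2, 0, 15)]

# Hypothetical parameters, for illustration only (these are NOT estimates).
# Class 1 has an intercept and a slope on x; class 2 is intercept-only,
# mirroring the (1: y <- x) (2: y <-) specification in the post.
classes = [
    {"share": 0.6, "cons": -13.0, "beta": 1.0},
    {"share": 0.4, "cons": -0.5, "beta": 0.0},
]


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def obs_lik(y, x, c):
    """Probability of the observed y given x under class c (binary logit)."""
    p = logistic(c["cons"] + c["beta"] * x)
    return p if y == 1 else 1.0 - p


def person_level(rows, classes):
    """Log-likelihood and posteriors when each person draws ONE class."""
    by_person = {}
    for person, y, x in rows:
        by_person.setdefault(person, []).append((y, x))
    loglik, posteriors = 0.0, {}
    for person, obs in by_person.items():
        # share_c times the product of row likelihoods over the person's rows
        joint = [
            c["share"] * math.prod(obs_lik(y, x, c) for y, x in obs)
            for c in classes
        ]
        total = sum(joint)
        loglik += math.log(total)
        posteriors[person] = [j / total for j in joint]
    return loglik, posteriors


def observation_level(rows, classes):
    """Log-likelihood when every row draws its own class independently."""
    return sum(
        math.log(sum(c["share"] * obs_lik(y, x, c) for c in classes))
        for _, y, x in rows
    )


ll_person, post = person_level(rows, classes)
ll_obs = observation_level(rows, classes)
print("person-level log-likelihood:     ", ll_person)
print("observation-level log-likelihood:", ll_obs)
print("posterior class probabilities per person:", post)
```

The two log-likelihoods generally differ for the same parameters, which is why the grouping matters for estimation and not just for bookkeeping.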

      I ended up using R's "gnml" which does it pretty quickly. It doesn't look like Stata supports it yet, or maybe I didn't look in the right place.



      • #4
        Originally posted by Ron Berman
        Thanks Weiwen!

        The issue appears more my inability to tell Stata "all these observations come from the same person", and less the random effect - it would be fine if all people who belonged to the same class would have had the same fixed effect.

        I ended up using R's "gnml" which does it pretty quickly. It doesn't look like Stata supports it yet, or maybe I didn't look in the right place.
        For my edification, can you clarify which R package? There doesn't appear to be a "gnml" package. There's a package called "gnlm", which is for generalized nonlinear regression, but I assume that's not the one you used.

        And actually, if you are fine with including a person fixed effect, then read my link again. That sounds like it would be equivalent to a latent class regression, where you put the person fixed effect in the multinomial equation (not the latent class bit). That is doable in principle! If there are a lot of people, Stata would probably struggle to estimate it, though.
