
  • LCA for multi-level data (using gsem)

    Hi Stata users,

    I am struggling to use gsem to perform LCA and predict class membership at the group level.
    My data look like this:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int groupID float choice byte(v1 v2 v3 v4)
    2 3 2 1 2 4
    2 2 3 3 4 1
    2 3 2 2 7 4
    2 2 3 3 5 2
    2 2 3 1 4 3
    2 4 2 2 5 2
    2 3 1 3 5 4
    2 2 2 1 4 4
    2 3 2 1 8 4
    2 3 3 2 9 2
    2 2 3 1 3 2
    2 2 3 1 8 4
    2 2 1 3 3 3
    2 2 1 1 7 2
    2 3 3 1 3 1
    3 4 1 2 4 4
    3 4 2 3 6 1
    3 4 1 1 1 1
    3 4 1 2 7 2
    3 4 1 2 3 4
    3 4 3 2 4 3
    3 2 1 1 8 2
    3 4 1 3 4 3
    3 4 1 1 6 1
    3 4 3 2 4 4
    3 4 2 1 3 1
    3 4 1 1 6 3
    3 4 3 1 9 3
    3 3 1 3 2 2
    3 4 2 2 3 1
    4 4 3 2 6 4
    4 4 2 3 4 4
    4 4 3 1 2 2
    4 4 1 2 5 4
    4 4 2 1 3 4
    4 4 2 3 5 1
    4 4 3 2 6 3
    4 3 2 2 9 2
    4 4 3 2 3 2
    4 3 1 2 7 1
    4 3 2 1 8 1
    4 4 1 2 5 3
    4 4 2 1 7 4
    4 4 1 1 1 4
    4 3 3 2 9 1
    5 5 1 2 7 4
    5 5 1 2 2 3
    5 5 1 1 8 4
    5 5 3 3 3 2
    5 5 3 3 8 2
    5 5 2 3 1 3
    5 5 3 1 7 1
    5 5 2 3 1 1
    5 5 1 3 3 1
    5 5 3 3 8 4
    5 5 2 1 6 3
    5 5 2 1 1 4
    5 5 2 3 5 4
    5 5 3 3 8 1
    5 5 2 1 9 3
    end

    Using gsem, I was able to run the model, predict posterior probabilities, and assign a class to each row (for example, for a model with 2 classes):
    Code:
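    * Fit a 2-class latent class model with an ordered-logit outcome
    gsem (choice <- i.v1 i.v2 i.v3 i.v4), ologit lclass(C 2)
    * Posterior class-membership probabilities, stored in cpost1 and cpost2
    predict cpost*, classposteriorpr
    * Modal class assignment: take the class with the largest posterior probability
    egen max = rowmax(cpost*)
    gen predclass = 1 if cpost1 == max
    replace predclass = 2 if cpost2 == max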
    But what I need is a prediction at the groupID level, so that each groupID is assigned a single class (the class should not vary within a groupID).

    Any suggestions on how to approach this issue are welcome!



  • #2
    Stata can't currently do multilevel latent class analysis. Is there any chance you have - and I'm not an economist, so I may be mangling the terminology - discrete choice data, e.g. each person has a set of choices, and you have data on each of the choices, and you want to use a latent class model there? If so, that's the lclogit package (by Pacifico and Yoo, available on SSC, paper here). In that setup, I believe the vector of class membership would be constant over the person (NB: you don't get a class assigned to each observation anyway, your model produces a vector of class membership, from which you can choose to do modal class assignment; this is a frequent misunderstanding.)

    I've never run multilevel latent class analysis, but I think that even in that model, the class membership vector would vary within the individual. Consider that latent transition analysis is one example of multilevel latent class data, and in that type of model, you're modeling individuals who are transitioning among latent classes over time.
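
    To make the point about modal class assignment concrete, here is a minimal sketch. It assumes the 2-class gsem model from the first post has been fit and that cpost1 and cpost2 already exist from the predict call there. The variable names rowclass, gpost1, gpost2, and groupclass are hypothetical. Averaging the posteriors within groupID is only a post-hoc heuristic for getting one label per group; it is not a multilevel latent class model, and the fitted model still treats rows as independent.

```stata
* Assumption: cpost1 and cpost2 are the posterior class probabilities
* created earlier by -predict cpost*, classposteriorpr-.

* Row-level modal assignment: the class with the larger posterior.
generate byte rowclass = cond(cpost2 > cpost1, 2, 1) if !missing(cpost1, cpost2)

* Heuristic group-level summary (NOT a multilevel LCA):
* average each posterior within groupID, then take the modal class.
bysort groupID: egen double gpost1 = mean(cpost1)
bysort groupID: egen double gpost2 = mean(cpost2)
generate byte groupclass = cond(gpost2 > gpost1, 2, 1) if !missing(gpost1, gpost2)

* Confirm the group-level label is constant within each groupID.
bysort groupID (groupclass): assert groupclass[1] == groupclass[_N]

* Compare row-level and group-level assignments.
tabulate groupID groupclass
tabulate rowclass groupclass
```

    If the averaged posteriors for a group sit close to 0.5, the group-level label is weakly supported and is probably better reported alongside gpost1 and gpost2 than on its own.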
    Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

    When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



    • #3
      Thanks Weiwen for your reply! The choice in my data is ordered (from 1-5 'very unlikely' - 'very likely').

      It seems that the lclogit package is only suitable for binary outcomes, so it might not be suitable for my case.
