  • How to obtain marginal effects in latent class regression

    Latent class regression extends latent class (or latent profile) analysis. Say you believe that class membership is influenced by some individual characteristic, but that characteristic is not itself an indicator of the latent class. Example 50g of the SEM manual fits an LCA to 4 indicators and essentially finds that 27.9% of respondents were inclined to put society's needs above their own, and that 72.1% were selfish. If we had gender in the data and wanted to see its effect on the latent classes, we could model class membership through a multinomial regression. You could instead simply assign each person to their modal class and tabulate gender by class, but that would ignore the classification uncertainty, i.e. we aren't always sure which latent class a person belongs to.

    Now, I'm going to switch to the data for example 52, since that dataset has an additional covariate. In that example, a 3-class model is fit to the data. The 3 classes appear to correspond to people with overt diabetes, people with chemical diabetes (I think this could be described as pre-diabetes), and people without diabetes. Those should be classes 3, 2, and 1, respectively, if you run the model, which is the same order as in the manual example; I don't believe you have to set any particular random seed.

    Code:
    use http://www.stata-press.com/data/r15/gsem_lca2
    gsem (glucose insulin sspg <- _cons), lclass(C 3) lcinvariant(none) covstructure(e._OEn, unstructured)
    The example data also include patient relative weight (relwgt). It appears to be a continuous variable with a mean of 0.9 and an SD of 0.129. You might guess that higher relative weight is associated with higher odds of chemical or overt diabetes. To use relative weight in a latent class regression, you add a (C <- relwgt) equation, as in the syntax below. I'm including the output from the multinomial regression part:

    Code:
    gsem (glucose insulin sspg <- _cons) (C <- relwgt), lclass(C 3) lcinvariant(none) covstructure(e._OEn, unstructured)
    
    Output:
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    1.C          |  (base outcome)
    -------------+----------------------------------------------------------------
    2.C          |
          relwgt |   14.03413   2.819101     4.98   0.000     8.508794    19.55947
           _cons |  -14.50264   2.864154    -5.06   0.000    -20.11628   -8.889005
    -------------+----------------------------------------------------------------
    3.C          |
          relwgt |   5.186345   2.045551     2.54   0.011     1.177138    9.195552
           _cons |  -5.329615   1.930139    -2.76   0.006    -9.112617   -1.546613
    That was what I expected. But say I wanted to use margins to calculate, for example, the marginal effect of relative weight on the probability of membership in each class. I am having a lot of trouble finding the right syntax. -estat lcmean- and -estat lcprob- don't report anything about relative weight; they produce the same output as after the first latent profile model. For the record, I'm using Stata 15.1.

    The gsem postestimation entry on margins says that -classpr- is an admissible statistic for margins. It also says that "classpr defaults to the first latent class if option class() is not specified." However, this doesn't appear to work as expected.

    Code:
    margins, dydx(relwgt) predict(classpr)
    
    Expression   : Predicted probability (1.C), predict(classpr)
    dy/dx w.r.t. : relwgt
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          relwgt |  -1.640623   .2251536    -7.29   0.000    -2.081916    -1.19933
    
    
    margins, dydx(relwgt) predict(classpr class(2))
    
    Expression   : Predicted probability (2.C), predict(classpr class(2))
    dy/dx w.r.t. : relwgt
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          relwgt |   1.672888   .2361939     7.08   0.000     1.209956    2.135819
    Neither of the above looks like a plausible change in predicted probability to me. The outcome() syntax from the multinomial logit (mlogit) postestimation entry doesn't work for me either. This is doubly odd, because that syntax was referenced in an earlier discussion on how to produce a profile plot after an LCA:

    Code:
    margins, dydx(*) predict(outcome(1)) predict(outcome(2)) predict(outcome(3))
    invalid outcome() option;
    depvar 1 not found
    r(198);
    Any thoughts, or am I simply misinterpreting something?
    Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

    When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.

  • #2
    Actually, those don't look unreasonable as marginal effects to me. In that data set, the variable relwgt ranges between 0.7 and 1.2. So the maximum realizable difference in relwgt is 0.5, and either of those marginal effects * 0.5 is within the range of a change in probability.

    Now, I have to admit I'm in a bit over my head here as I have never done a latent class analysis quite like this, and I may be misunderstanding how the (C <- relwgt) equation is handled. But, with what I have said in mind, does it make sense now? Or is it still problematic?
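
    The range argument above can be checked directly. This is only a sketch: it assumes the gsem_lca2 data from post #1 are in memory, the local macro name span is purely illustrative, and the two dy/dx values are copied from the margins output in post #1.

    ```stata
    * Observed range of relative weight
    summarize relwgt
    local span = r(max) - r(min)
    display "Observed span of relwgt: " `span'

    * Linear extrapolation of each average marginal effect across the full span
    display "Class 1: " -1.640623 * `span'
    display "Class 2: "  1.672888 * `span'
    ```

    If both products stay between -1 and 1, the effects are at least arithmetically possible as changes in probability. The actual change over the range will differ, because the class probabilities are not linear in relwgt.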



    • #3
      In reply to Clyde Schechter (#2):
      Uh ... yes. It makes sense. I'd forgotten about the range of relative weight. I just ran that syntax in my own data, and I am getting marginal effects that are sensible. My understanding is that the regression part of latent class regression is interpreted like a straight-up multinomial logit regression. If you run -estat lcmean- after the commands with and without the -(C <- relwgt)- spec, you'll see very similar (not identical, but very close) marginal means.

      So, thanks for saving me from an amateur error!
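
      For anyone who wants to repeat that comparison, a minimal sketch using only the commands already shown in post #1 (the clear option on use is my addition, so that the data load even if another dataset is in memory):

      ```stata
      use http://www.stata-press.com/data/r15/gsem_lca2, clear

      * Latent profile model without the covariate
      gsem (glucose insulin sspg <- _cons), lclass(C 3) lcinvariant(none) covstructure(e._OEn, unstructured)
      estat lcmean

      * Latent class regression: class membership depends on relwgt
      gsem (glucose insulin sspg <- _cons) (C <- relwgt), lclass(C 3) lcinvariant(none) covstructure(e._OEn, unstructured)
      estat lcmean
      ```

      Each -estat lcmean- call reports the class-specific marginal means for the model fit immediately before it, so the two tables can be compared side by side.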



      • #4
        Weiwen,

        Does this help?
        Code:
        margins, at(relwgt=(.7(.1)1.2)) predict(classpr class(1))
        marginsplot, scheme(s1color) name(marg1, replace)
        
        margins, at(relwgt=(.7(.1)1.2)) predict(classpr class(2))
        marginsplot, scheme(s1color) name(marg2, replace)
        
        margins, at(relwgt=(.7(.1)1.2)) predict(classpr class(3))
        marginsplot, scheme(s1color) name(marg3, replace)
        
        graph combine marg1 marg2 marg3, rows(1) altshrink name(margcomb, replace)
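
        If the marginal effects themselves are of interest rather than the predicted probabilities, the same grid can be combined with the dydx() option from post #1. This is a sketch that assumes the latent class regression from post #1 is the active estimation result:

        ```stata
        * Marginal effect of relwgt on each class probability, evaluated along the same grid
        margins, dydx(relwgt) at(relwgt=(.7(.1)1.2)) predict(classpr class(1))
        margins, dydx(relwgt) at(relwgt=(.7(.1)1.2)) predict(classpr class(2))
        margins, dydx(relwgt) at(relwgt=(.7(.1)1.2)) predict(classpr class(3))
        ```

        Because the class probabilities are nonlinear in relwgt, the effect at each grid point can differ noticeably from the single average marginal effect.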
        Red Owl
        Stata/IC 15.1, Windows 10 (64-bit)



        • #5
          In reply to Red Owl (#4):
          That is brilliant. Thanks!



          • #6
            Weiwen,

            Here's an alternative you may find more useful because it overlays the marginsplots in a single graph.

            This uses Nicholas Winter's combomarginsplot ado program from SSC.
            Code:
            ssc install combomarginsplot, replace
            
            margins, at(relwgt=(.7(.1)1.2)) predict(classpr class(1)) saving(marg1, replace)
            marginsplot, scheme(s1color) name(marg1, replace)
            
            margins, at(relwgt=(.7(.1)1.2)) predict(classpr class(2)) saving(marg2, replace)
            marginsplot, scheme(s1color) name(marg2, replace)
            
            margins, at(relwgt=(.7(.1)1.2)) predict(classpr class(3)) saving(marg3, replace)
            marginsplot, scheme(s1color) name(marg3, replace)
            
            combomarginsplot marg1 marg2 marg3, noci plotdim(_filenumber) labels("Class 1" "Class 2" "Class 3") file1opts(mcolor(blue) lcolor(blue)) file2opts(mcolor(green) lcolor(green)) file3opts(mcolor(brown) lcolor(brown)) legend(rows(1)) scheme(s1color) name(lrcombinedmargs, replace)
            Red Owl
            Stata/IC 15.1, Windows 10 (64-bit)
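
            A possible alternative that needs no user-written program: margins accepts more than one predict() option, so all three class probabilities can be requested in one call and plotted together. This is a sketch that has not been checked against this particular model, and the graph name margall is only illustrative:

            ```stata
            * All three class probabilities in a single margins call
            margins, at(relwgt=(.7(.1)1.2)) predict(classpr class(1)) predict(classpr class(2)) predict(classpr class(3))

            * marginsplot draws one line per prediction
            marginsplot, noci scheme(s1color) name(margall, replace)
            ```

            At each value of relwgt the three predicted probabilities should sum to 1, which is a quick sanity check on the output.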
