
  • Don't Understand Why the Results of the Margins Command after Running an xtmelogit Model Are Not Predicted Probabilities in Stata.

    Hi, friends,
    I have a dataset like this,
    insheet using "https://stats.idre.ucla.edu/stat/data/hdp.csv", comma
    foreach i of varlist familyhx smokinghx sex cancerstage school {
    encode `i', gen(`i'2)
    drop `i'
    rename `i'2 `i'
    }
    ssc inst center,all replace
    center co2 il6 crp lengthofstay

    I want to fit a model as follows,

    xtmelogit remission i.married c.c_il6 c.c_crp c.c_lengthofstay i.sex i.sex#c.cancerstage c.cancerstage c.c_co2 c.cancerstage#c.c_co2 c.cancerstage#c.cancerstage#c.c_co2 c.cancerstage#c.cancerstage#i.sex|| did:, intpoints(10) or

    margins sex#married,post


    I don't understand why the results of the margins command above are not predicted probabilities ranging from 0 to 1.
    By contrast, the predicted probabilities from a regular logit model, shown in the attached snapshot, do range from 0 to 1.
    [Attached image: snapshot.PNG]


    . margins sex#married,post

    Predictive margins                                       Number of obs = 8,525

    Expression: Linear prediction, fixed portion, predict(xb)

    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
     sex#married |
       female#0  |  -1.484023   .1262418   -11.76   0.000    -1.731453   -1.236594
       female#1  |  -1.621055   .1231191   -13.17   0.000    -1.862364   -1.379746
         male#0  |  -1.423362   .1292779   -11.01   0.000    -1.676742   -1.169982
         male#1  |  -1.560394   .1263787   -12.35   0.000    -1.808091   -1.312696
    ------------------------------------------------------------------------------


    Thank you for your help!
    Last edited by smith Jason; 16 Jun 2022, 10:16.

  • #2
    I think you may be using an old version of Stata. If you are, I don't think I can help you. If you are using current Stata, note that over time, the -me- commands have changed somewhat, and the options available for predictions with -margins- have also changed.

    The command -xtmelogit- no longer exists in current Stata. It has been renamed -meqrlogit-, and -meqrlogit-'s -margins- command only predicts the linear combination, not the predicted probability.

    To get predicted probabilities in current Stata, instead use the -melogit- command. It fits the same model as -meqrlogit-, but uses a different estimation method. The logistic regression output from both commands should be the same, with perhaps tiny discrepancies in distant decimal places. The reason for having the two different commands fitting the same model is that sometimes one will encounter convergence problems but the other will converge easily. So unless your model won't converge with -melogit-, you should avoid -meqrlogit- when you need predicted probabilities from -margins-. With -melogit-, the default prediction from -margins- is -mu-, the predicted probability.



    • #3
      Thank you!



      • #4
        margins sex, at(married=(0 1)) predict(mu fixedonly) vsquish



        • #5
          xtmelogit still works in Stata 17.0, and it runs without any error.
          [Attached image: Graph.png]



          • #6
            I'm not sure you understand what you are getting from the code in #4. These are not the predictive margins for probabilities. They are predictive margins for probabilities subject to the constraint that the random effect is set to 0 for all observations. If that is what you want, then you are fine. But those predicted probabilities are generally not very meaningful or useful. And if you try to force it to give you predicted probabilities without that restriction, you get an error:
            Code:
            . margins sex#married, predict(mu)
            prediction is a function of possibly stochastic quantities other than e(b)
            r(498);
            If you want the true predictive margins for probabilities, you must use -melogit-, not -meqrlogit- (or its alias -xtmelogit-). If you do it with -melogit- you will see that the predicted probabilities you get are appreciably different from what you are seeing in the graph in #5.
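            The contrast can be sketched as follows. This is only an illustration under assumptions: it reuses the data and model from #1 with the uncentered covariates, it assumes a Stata release in which -predict- after -melogit- accepts conditional(fixedonly) (some older releases spell this option fixedonly, as in #4), and it omits -post- so that the fitted model stays active between the two -margins- calls.

            ```stata
            clear all
            insheet using "https://stats.idre.ucla.edu/stat/data/hdp.csv", comma

            * Convert the string variables to labeled numeric variables, as in #1
            foreach v of varlist familyhx smokinghx sex cancerstage school {
                encode `v', gen(`v'2)
                drop `v'
                rename `v'2 `v'
            }

            * Same fixed and random parts as the -xtmelogit- model, fitted with -melogit-
            melogit remission i.married c.il6 c.crp c.lengthofstay i.sex ///
                i.sex#c.cancerstage c.cancerstage c.co2 c.cancerstage#c.co2 ///
                c.cancerstage#c.cancerstage#c.co2 c.cancerstage#c.cancerstage#i.sex ///
                || did:, intpoints(10) or

            * Predictive margins for probabilities, averaging over the random intercept
            margins sex#married

            * Probabilities with the random intercept fixed at 0 for every observation
            * (assumed syntax for recent releases; see the note above)
            margins sex#married, predict(mu conditional(fixedonly))
            ```

            When the estimated random-intercept variance is large, the two sets of margins can be expected to differ noticeably, which is the distinction drawn above.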



            • #7
              Thank you for your kind response. I think what I want is the predicted probabilities from the "melogit" command.
              Can you show me the Stata code?
              Thanks!



              • #8
                Code:
                . clear*
                
                . insheet using "https://stats.idre.ucla.edu/stat/data/hdp.csv", comma
                (27 vars, 8,525 obs)
                
                . foreach i of varlist familyhx smokinghx sex cancerstage school {
                  2. encode `i', gen(`i'2)
                  3. drop `i'
                  4. rename `i'2 `i'
                  5. }
                
                .
                .
                . melogit remission i.married c.il6 c.crp c.lengthofstay i.sex i.sex#c.cancerstage c.cancerstage c.co2 c.cancerstage#c.co2 c.cancerstage#c
                > .cancerstage#c.co2 c.cancerstage#c.cancerstage#i.sex|| did:, intpoints(10) or
                
                Fitting fixed-effects model:
                
                Iteration 0:   log likelihood = -5008.6963  
                Iteration 1:   log likelihood = -4998.3205  
                Iteration 2:   log likelihood = -4998.3115  
                Iteration 3:   log likelihood = -4998.3115  
                
                Refining starting values:
                
                Grid node 0:   log likelihood = -3839.1375
                
                Fitting full model:
                
                Iteration 0:   log likelihood = -3839.1375  
                Iteration 1:   log likelihood = -3721.2383  
                Iteration 2:   log likelihood = -3689.8803  
                Iteration 3:   log likelihood = -3683.7563  
                Iteration 4:   log likelihood = -3683.6672  
                Iteration 5:   log likelihood = -3683.6671  
                
                Mixed-effects logistic regression               Number of obs     =      8,525
                Group variable: did                             Number of groups  =        407
                
                                                                Obs per group:
                                                                              min =          2
                                                                              avg =       20.9
                                                                              max =         40
                
                Integration method: mvaghermite                 Integration pts.  =         10
                
                                                                Wald chi2(12)     =     407.36
                Log likelihood = -3683.6671                     Prob > chi2       =     0.0000
                ---------------------------------------------------------------------------------------------------
                                        remission | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
                ----------------------------------+----------------------------------------------------------------
                                        1.married |   .8719404   .0596896    -2.00   0.045     .7624597    .9971412
                                              il6 |   .9436841    .010938    -5.00   0.000     .9224877    .9653676
                                              crp |   .9778182   .0100197    -2.19   0.029     .9583758    .9976551
                                     lengthofstay |   .8623412   .0303097    -4.21   0.000     .8049353    .9238412
                                                  |
                                              sex |
                                            male  |   .7391656   .2614543    -0.85   0.393     .3695376    1.478512
                                                  |
                                sex#c.cancerstage |
                                            male  |   1.240862   .4306092     0.62   0.534      .628544    2.449689
                                                  |
                                      cancerstage |   87.80264    202.616     1.94   0.052     .9533441    8086.591
                                              co2 |   6.638147   9.360583     1.34   0.179     .4185531    105.2793
                                                  |
                              c.cancerstage#c.co2 |     .06535    .092914    -1.92   0.055     .0040273    1.060416
                                                  |
                c.cancerstage#c.cancerstage#c.co2 |   1.578352   .5054191     1.43   0.154     .8426218    2.956482
                                                  |
                  sex#c.cancerstage#c.cancerstage |
                                          female  |   .4019234   .2080408    -1.76   0.078     .1457305    1.108501
                                            male  |    .395134   .2028919    -1.81   0.071     .1444351    1.080976
                                                  |
                                            _cons |   .0719266   .1666527    -1.14   0.256     .0007668    6.746958
                ----------------------------------+----------------------------------------------------------------
                did                               |
                                        var(_cons)|   4.391226    .444662                      3.600741     5.35525
                ---------------------------------------------------------------------------------------------------
                Note: Estimates are transformed only in the first equation to odds ratios.
                Note: _cons estimates baseline odds (conditional on zero random effects).
                LR test vs. logistic model: chibar2(01) = 2629.29     Prob >= chibar2 = 0.0000
                
                .
                . margins sex#married, post
                
                Predictive margins                                       Number of obs = 8,525
                Model VCE: OIM
                
                Expression: Marginal predicted mean, predict()
                
                ------------------------------------------------------------------------------
                             |            Delta-method
                             |     Margin   std. err.      z    P>|z|     [95% conf. interval]
                -------------+----------------------------------------------------------------
                 sex#married |
                   female#0  |   .3016272    .015434    19.54   0.000     .2713772    .3318773
                   female#1  |   .2849564    .014587    19.53   0.000     .2563664    .3135464
                     male#0  |   .3068339   .0160788    19.08   0.000     .2753201    .3383478
                     male#1  |   .2898622   .0152675    18.99   0.000     .2599384    .3197861
                ------------------------------------------------------------------------------
                I skipped over the centering of co2, il6, crp, and lengthofstay. For the purposes of running the regression and getting predictive margins in this data, it makes no difference. But I don't use the -center- command myself and didn't want to install it just for this purpose. There are good reasons to center variables and for other things you plan to do with this you may need that. If you want to add that back in (and change the corresponding variable names in the -melogit- command), you can do so and will get the same results.

                By the way, you will also notice that -melogit- runs much faster than -xtmelogit- (-meqrlogit-) in this data.
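                For anyone who wants the centered variables without installing -center-, here is a minimal sketch using only built-in commands. It assumes the hdp data are already in memory, and it uses the c_ prefix only to match the variable names in #1.

                ```stata
                * Mean-center each covariate by hand; c_<name> mirrors the names used in #1
                foreach v of varlist co2 il6 crp lengthofstay {
                    quietly summarize `v', meanonly
                    generate double c_`v' = `v' - r(mean)
                }

                * Each centered variable should now have a mean of (numerically) zero
                summarize c_co2 c_il6 c_crp c_lengthofstay
                ```

                The centered names can then replace co2, il6, crp, and lengthofstay in the -melogit- command.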



                • #9
                  Thank you very much, Professor!



                  • #10
                    By the way, if I want to compute the effect of the continuous variable "co2" while holding all the other predictors at their means, what Stata code would do this?
                    Thank you for your guidance! I really don't know how to do this.



                    • #11
                      In your regression model, the co2 variable is interacted with variable cancerstage. Because it participates in an interaction, there is no such thing as the effect of variable co2. The model stipulates that "the effect" of co2 varies according to the value of the variable cancerstage. I notice that you are treating cancerstage as a continuous variable, but it actually only takes on the values 1 through 4. So you could get the marginal effects of co2 at each value of cancerstage with all others held at their means by using:

                      Code:
                      margins, dydx(co2) at(cancerstage = (1 2 3 4)) atmeans
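                      There is also a single summary statistic for the co2 effect, evaluated with every covariate, including cancerstage, fixed at its sample mean. (Strictly speaking this is the marginal effect at the means; the average marginal effect is what the same command gives without -atmeans-.)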
                      Code:
                      margins, dydx(co2) atmeans
                      But be aware that this statistic is quite sensitive to the distribution of cancerstage in your sample and would not be expected to apply outside your sample.



                      • #12
                        Thank you! I still want to compute the predictive margins (predictive probabilities) of the effect of the continuous variable "co2".
                        So the result of your code does not seem to be what I want.
                        Can I use the command "mcp" to do this and then marginsplot?



                        • #13
                          In #10, you said you wanted the "effect" of the continuous variable co2, so I pointed out that no such thing exists and gave you code for things that were somewhat similar to that.

                          It appears in #12, you perhaps want the predicted probabilities associated with the variable co2 with everything else at their means. Since co2 is a continuous variable you have to pick out specific values for it: unlike a discrete variable it does not define its own list of specific values. Let's say you want to look at the predicted outcome probabilities associated with co2 at values of 18, 22, 26, and 30. You could get that with
                          Code:
                          margins, at(co2 = (18(4)30)) atmeans
                          Again, be cautious in interpreting this. In using an interaction model you have stipulated that these results should also depend on the variable cancerstage, and this code will instead calculate the predicted probability if everybody's value of cancer stage were the mean value for your sample. Since cancer stage, although treated as a continuous variable in your regression, is really a discrete variable, it is possible that the resulting mean value of cancer stage will be, say, 2.7, which is non-existent in the real world, hence the results would be of questionable value, probably meaningless.

                          So I would suggest instead calculating statistics conditional on the four values of cancerstage. That would be:
                          Code:
                          margins, at(co2 = (18(4)30) cancerstage = (1 2 3 4)) atmeans

                          Finally, I will caution about the use of -atmeans- in a model that includes sex and married, two dichotomous variables. Here the mean value is almost guaranteed to represent a non-existent sex and a non-existent marital status. If you think these variables are not really important, then I suppose it doesn't really matter. But, in that case, why include them in the model to begin with? I would suggest exempting these from the -atmeans- by coding this as:

                          Code:
                          margins, at(co2 = (18(4)30) cancerstage = (1 2 3 4) (asobserved) sex married) atmeans
                          After all, there really is no such thing as a person of average sex or average marital status.

                          And if you vehemently object to getting separate results for each cancer stage, and insist on a single statistic, it would make more sense to also treat it as (asobserved) than to just ignore it. After all, if you want to ignore it, then there was no reason to include the interaction in the first place. So:
                          Code:
                          margins, at(co2 = (18(4)30) (asobserved) cancerstage sex married) atmeans
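                          On the -marginsplot- question from #12, a hedged sketch of how one of these could be graphed. It assumes the -melogit- model from #8 is still the active estimation result (that is, the last -margins- was run without -post-, or the model has been refit), and the choice of which variable goes on the x axis is only a suggestion.

                          ```stata
                          * Predicted probabilities over co2 at each cancer stage,
                          * with sex and married left as observed
                          margins, at(co2 = (18(4)30) cancerstage = (1 2 3 4) (asobserved) sex married) atmeans

                          * Plot them: co2 on the x axis, one line per value of cancerstage
                          marginsplot, xdimension(co2) plotdimension(cancerstage)
                          ```

                          -marginsplot- always graphs the results of the immediately preceding -margins- command, so the two lines need to be run together.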



                          • #14
                            Thank you very much!

