Don't Understand Why the Results of the Margins Command after Running an xtmelogit Model Are Not Predicted Probabilities in Stata.

smith Jason

Join Date: Sep 2020

Posts: 378
#1

16 Jun 2022, 10:01

Hi, friends,
I have a dataset like this,
insheet using "https://stats.idre.ucla.edu/stat/data/hdp.csv", comma
foreach i of varlist familyhx smokinghx sex cancerstage school {
encode `i', gen(`i'2)
drop `i'
rename `i'2 `i'
}
ssc inst center,all replace
center co2 il6 crp lengthofstay

I want to fit a model as follows,

xtmelogit remission i.married c.c_il6 c.c_crp c.c_lengthofstay i.sex i.sex#c.cancerstage c.cancerstage c.c_co2 c.cancerstage#c.c_co2 c.cancerstage#c.cancerstage#c.c_co2 c.cancerstage#c.cancerstage#i.sex|| did:, intpoints(10) or

margins sex#married,post

I don't understand why the results of the margins command above are not predicted probabilities ranging from 0 to 1.
However, in the attached screenshot, the predicted probabilities from a regular logit model do range from 0 to 1.

. margins sex#married,post

Predictive margins                                       Number of obs = 8,525

Expression: Linear prediction, fixed portion, predict(xb)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
 sex#married |
   female#0  |  -1.484023   .1262418   -11.76   0.000    -1.731453   -1.236594
   female#1  |  -1.621055   .1231191   -13.17   0.000    -1.862364   -1.379746
     male#0  |  -1.423362   .1292779   -11.01   0.000    -1.676742   -1.169982
     male#1  |  -1.560394   .1263787   -12.35   0.000    -1.808091   -1.312696
------------------------------------------------------------------------------

Thank you for your help!

Last edited by smith Jason; 16 Jun 2022, 10:16.
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#2

16 Jun 2022, 13:04

I think you may be using an old version of Stata. If you are, I don't think I can help you. If you are using current Stata, note that over time, the -me- commands have changed somewhat, and the options available for predictions with -margins- have also changed.

The command -xtmelogit- no longer exists in current Stata. It has been renamed -meqrlogit-, and -meqrlogit-'s -margins- command only predicts the linear combination, not the predicted probability.

To get predicted probabilities in current Stata, instead use the -melogit- command. It fits the same model as -meqrlogit-, but uses a different estimation method. The logistic regression output from both commands should be the same, with perhaps tiny discrepancies in distant decimal places. The reason for having the two different commands fitting the same model is that sometimes one will encounter convergence problems but the other will converge easily. So unless your model won't converge with -melogit-, you should avoid -meqrlogit- when you need predicted probabilities from -margins-. With -melogit-, the default prediction from -margins- is -mu-, the predicted probability.
smith Jason

Join Date: Sep 2020

Posts: 378
#3

16 Jun 2022, 13:32

Thank you!
smith Jason

Join Date: Sep 2020

Posts: 378
#4

16 Jun 2022, 13:39

margins sex, at(married=(0 1)) predict(mu fixedonly) vsquish
smith Jason

Join Date: Sep 2020

Posts: 378
#5

16 Jun 2022, 13:41

xtmelogit still works in Stata 17.0, and it runs without any error.
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#6

16 Jun 2022, 14:04

I'm not sure you understand what you are getting from the code in #4. These are not the predictive margins for probabilities. They are predictive margins for probabilities subject to the constraint that the random effect is set to 0 for all observations. If that is what you want, then you are fine. But those predicted probabilities are generally not very meaningful or useful. And if you try to force it to give you predicted probabilities without that restriction, you get an error:

Code:

. margins sex#married, predict(mu)
prediction is a function of possibly stochastic quantities other than e(b)
r(498);

If you want the true predictive margins for probabilities, you must use -melogit-, not -meqrlogit- (or its alias -xtmelogit-). If you do it with -melogit- you will see that the predicted probabilities you get are appreciably different from what you are seeing in the graph in #5.
smith Jason

Join Date: Sep 2020

Posts: 378
#7

16 Jun 2022, 15:02

Thank you for your kind response. I think what I want is the predicted probabilities from the "melogit" command.
Can you show me the Stata code?
Thanks!

Clyde Schechter

Join Date: Apr 2014
Posts: 30117
#8

16 Jun 2022, 15:38

Code:

. clear*

. insheet using "https://stats.idre.ucla.edu/stat/data/hdp.csv", comma
(27 vars, 8,525 obs)

. foreach i of varlist familyhx smokinghx sex cancerstage school {
  2. encode `i', gen(`i'2)
  3. drop `i'
  4. rename `i'2 `i'
  5. }

.
.
. melogit remission i.married c.il6 c.crp c.lengthofstay i.sex i.sex#c.cancerstage c.cancerstage c.co2 c.cancerstage#c.co2 c.cancerstage#c
> .cancerstage#c.co2 c.cancerstage#c.cancerstage#i.sex|| did:, intpoints(10) or

Fitting fixed-effects model:

Iteration 0:   log likelihood = -5008.6963  
Iteration 1:   log likelihood = -4998.3205  
Iteration 2:   log likelihood = -4998.3115  
Iteration 3:   log likelihood = -4998.3115  

Refining starting values:

Grid node 0:   log likelihood = -3839.1375

Fitting full model:

Iteration 0:   log likelihood = -3839.1375  
Iteration 1:   log likelihood = -3721.2383  
Iteration 2:   log likelihood = -3689.8803  
Iteration 3:   log likelihood = -3683.7563  
Iteration 4:   log likelihood = -3683.6672  
Iteration 5:   log likelihood = -3683.6671  

Mixed-effects logistic regression               Number of obs     =      8,525
Group variable: did                             Number of groups  =        407

                                                Obs per group:
                                                              min =          2
                                                              avg =       20.9
                                                              max =         40

Integration method: mvaghermite                 Integration pts.  =         10

                                                Wald chi2(12)     =     407.36
Log likelihood = -3683.6671                     Prob > chi2       =     0.0000
---------------------------------------------------------------------------------------------------
                        remission | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
----------------------------------+----------------------------------------------------------------
                        1.married |   .8719404   .0596896    -2.00   0.045     .7624597    .9971412
                              il6 |   .9436841    .010938    -5.00   0.000     .9224877    .9653676
                              crp |   .9778182   .0100197    -2.19   0.029     .9583758    .9976551
                     lengthofstay |   .8623412   .0303097    -4.21   0.000     .8049353    .9238412
                                  |
                              sex |
                            male  |   .7391656   .2614543    -0.85   0.393     .3695376    1.478512
                                  |
                sex#c.cancerstage |
                            male  |   1.240862   .4306092     0.62   0.534      .628544    2.449689
                                  |
                      cancerstage |   87.80264    202.616     1.94   0.052     .9533441    8086.591
                              co2 |   6.638147   9.360583     1.34   0.179     .4185531    105.2793
                                  |
              c.cancerstage#c.co2 |     .06535    .092914    -1.92   0.055     .0040273    1.060416
                                  |
c.cancerstage#c.cancerstage#c.co2 |   1.578352   .5054191     1.43   0.154     .8426218    2.956482
                                  |
  sex#c.cancerstage#c.cancerstage |
                          female  |   .4019234   .2080408    -1.76   0.078     .1457305    1.108501
                            male  |    .395134   .2028919    -1.81   0.071     .1444351    1.080976
                                  |
                            _cons |   .0719266   .1666527    -1.14   0.256     .0007668    6.746958
----------------------------------+----------------------------------------------------------------
did                               |
                        var(_cons)|   4.391226    .444662                      3.600741     5.35525
---------------------------------------------------------------------------------------------------
Note: Estimates are transformed only in the first equation to odds ratios.
Note: _cons estimates baseline odds (conditional on zero random effects).
LR test vs. logistic model: chibar2(01) = 2629.29     Prob >= chibar2 = 0.0000

.
. margins sex#married, post

Predictive margins                                       Number of obs = 8,525
Model VCE: OIM

Expression: Marginal predicted mean, predict()

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
 sex#married |
   female#0  |   .3016272    .015434    19.54   0.000     .2713772    .3318773
   female#1  |   .2849564    .014587    19.53   0.000     .2563664    .3135464
     male#0  |   .3068339   .0160788    19.08   0.000     .2753201    .3383478
     male#1  |   .2898622   .0152675    18.99   0.000     .2599384    .3197861
------------------------------------------------------------------------------

I skipped over the centering of co2, il6, crp, and lengthofstay. For the purposes of running the regression and getting predictive margins in this data, it makes no difference. But I don't use the -center- command myself and didn't want to install it just for this purpose. There are good reasons to center variables and for other things you plan to do with this you may need that. If you want to add that back in (and change the corresponding variable names in the -melogit- command), you can do so and will get the same results.

By the way, you will also notice that -melogit- runs much faster than -xtmelogit- (-meqrlogit-) in this data.
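
A rough way to see why these margins (about .30) are not simply the inverse logit of the linear predictions in #1 (about -1.48): the probability is a nonlinear function of the random intercept, so averaging over the distribution of intercepts is not the same as plugging in an intercept of 0. The sketch below is only an illustration, not part of the analysis. It takes one linear prediction from #1 and var(_cons) from the output above as given, simulates normally distributed random intercepts, and ignores the averaging over covariates that -margins- performs, so it will not reproduce the table exactly. The variable names u and p, the seed, and the number of draws are arbitrary choices.

Code:

* Illustration only: -1.484023 is one linear prediction from #1,
* 4.391226 is var(_cons) from the -melogit- output above.
preserve
clear
set seed 12345
set obs 100000
* simulated random intercepts with the estimated variance
generate double u = rnormal(0, sqrt(4.391226))
* predicted probability for each simulated intercept
generate double p = invlogit(-1.484023 + u)
* the mean of p approximates the probability averaged over the random effect
summarize p
* the probability with the random effect fixed at 0, for comparison
display invlogit(-1.484023)
restore

The mean of p should land noticeably closer to 0.5 than the last number displayed, which is the same direction as the gap between the fixed-only predictions and the -melogit- margins.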


smith Jason

Join Date: Sep 2020

Posts: 378
#9

16 Jun 2022, 15:40

Thank you very much, Professor!
smith Jason

Join Date: Sep 2020

Posts: 378
#10

16 Jun 2022, 15:47

By the way, if I want to compute the effect of the continuous variable "co2" while holding all the other predictors at their means, what Stata code should I use?
Thank you for your guidance! I really don't know how to do this.
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#11

16 Jun 2022, 16:35

In your regression model, the co2 variable is interacted with variable cancerstage. Because it participates in an interaction, there is no such thing as the effect of variable co2. The model stipulates that "the effect" of co2 varies according to the value of the variable cancerstage. I notice that you are treating cancerstage as a continuous variable, but it actually only takes on the values 1 through 4. So you could get the marginal effects of co2 at each value of cancerstage with all others held at their means by using:

Code:

margins, dydx(co2) at(cancerstage = (1 2 3 4)) atmeans

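There is also a single summary statistic for the co2 effect. Note that with -atmeans-, as in the command here, it is the marginal effect at the means of the covariates; the average marginal effect proper is what the same command gives without -atmeans-: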

Code:

margins, dydx(co2) atmeans

But be aware that this statistic is quite sensitive to the distribution of cancerstage in your sample and would not be expected to apply outside your sample.
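
One practical point if you run these right after the code in #8: -margins- with the -post- option replaces the -melogit- results in memory with the margins results, so a later -margins- call needs the model results back. A minimal sketch, assuming the model has just been fit and the data have not changed in between; the name m1 is arbitrary:

Code:

* run immediately after -melogit-, before any -margins, post-
estimates store m1

* later, make the model the active estimates again
estimates restore m1
margins, dydx(co2) at(cancerstage = (1 2 3 4)) atmeans

Alternatively, simply leave off -post- unless you need the margins stored as estimation results.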
smith Jason

Join Date: Sep 2020

Posts: 378
#12

16 Jun 2022, 18:54

Thank you! What I still want to compute is the predictive margins (predicted probabilities) for the continuous variable "co2".
So the result of your code does not seem to be what I want.
Can I use the command "mcp" to do this and then marginsplot?
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#13

16 Jun 2022, 19:25

In #10, you said you wanted the "effect" of the continuous variable co2, so I pointed out that no such thing exists and gave you code for things that were somewhat similar to that.

It appears in #12, you perhaps want the predicted probabilities associated with the variable co2 with everything else at their means. Since co2 is a continuous variable you have to pick out specific values for it: unlike a discrete variable it does not define its own list of specific values. Let's say you want to look at the predicted outcome probabilities associated with co2 at values of 18, 22, 26, and 30. You could get that with

Code:

margins, at(co2 = (18(4)30)) atmeans

Again, be cautious in interpreting this. In using an interaction model you have stipulated that these results should also depend on the variable cancerstage, and this code will instead calculate the predicted probability if everybody's value of cancer stage were the mean value for your sample. Since cancer stage, although treated as a continuous variable in your regression, is really a discrete variable, it is possible that the resulting mean value of cancer stage will be, say, 2.7, which is non-existent in the real world, hence the results would be of questionable value, probably meaningless.
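
To check how far the sample mean of cancerstage is from any real stage, something like the following should do. It assumes cancerstage is the numeric, value-labeled variable created by -encode- in #1.

Code:

* the mean that -atmeans- would plug in for cancerstage
summarize cancerstage
* the stages that actually occur, with their frequencies
tabulate cancerstage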

So I would suggest instead calculating statistics conditional on the four values of cancerstage. That would be:

Code:

margins, at(co2 = (18(4)30) cancerstage = (1 2 3 4)) atmeans

Finally, I will caution about the use of -atmeans- in a model that includes sex and married, two dichotomous variables. Here the mean value is almost guaranteed to represent a non-existent sex and a non-existent marital status. If you think these variables are not really important, then I suppose it doesn't really matter. But in that case, why include them in the model to begin with? I would suggest exempting these from the -atmeans- by coding this as:

Code:

margins, at(co2 = (18(4)30) cancerstage = (1 2 3 4) (asobserved) sex married) atmeans

After all, there really is no such thing as a person of average sex or average marital status.

And if you vehemently object to getting separate results for each cancer stage and insist on a single statistic, it would make more sense to also treat it as (asobserved) than to just ignore it. After all, if you want to ignore it, then there was no reason to include the interaction in the first place. So:

Code:

margins, at(co2 = (18(4)30) (asobserved) cancerstage sex married) atmeans
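
As for plotting, -marginsplot- graphs whatever the most recent -margins- command produced, so it can follow any of the commands above. A minimal sketch, assuming the -melogit- results are still the active estimates (that is, no -margins, post- has been run since the model was fit):

Code:

* predicted probabilities over co2 at each cancer stage, then a plot of them
margins, at(co2 = (18(4)30) cancerstage = (1 2 3 4) (asobserved) sex married) atmeans
marginsplot

Stata chooses the axis layout by default. If the confidence intervals clutter the graph, -marginsplot, noci- suppresses them.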
smith Jason

Join Date: Sep 2020

Posts: 378
#14

16 Jun 2022, 19:59

Thank you very much!
