Estimating probability using margins after mixed effect logistic regression

Angeline Wong

Join Date: Sep 2023
Posts: 2

05 Sep 2023, 05:11

Dear Statalisters,

I would like to estimate the predicted probability of the outcome y (people reporting having received a certain healthcare service) for each region in each survey year.

I am using the command:

melogit y i.region age i.sex i.urban, or
margins i.region, by(year)

year#region       Margin      Std. err.   [95% conf. interval]
2011#Region A     0.176543    0.00068     0.175211    0.177875
2013#Region A     0.174814    0.000676    0.173489    0.17614
2015#Region A     0.177897    0.00068     0.176565    0.17923
2018#Region A     0.226306    0.000822    0.224695    0.227917

I also tried using:

melogit y i.region age i.sex i.urban if year==2013, or
margins i.region

I get the following:

Region            Margin      Std. err.   [95% conf. interval]
Region A          0.137029    0.001171    0.134733    0.139324

There are four regions in total, but to avoid confusion I'm only showing the results for Region A. As you can see, the two approaches give very different results for Region A in 2013, for example. How do I know which one I should be using? Thanks.

Edit: I'm using Stata 17.0.

Last edited by Angeline Wong; 05 Sep 2023, 05:47.

Andrew Musau

Join Date: Oct 2014
Posts: 10275

05 Sep 2023, 06:53

The samples in the two regressions differ. Here

melogit y i.region age i.sex i.urban if year==2013, or

you are restricting the sample to only the year 2013 whereas here

melogit y i.region age i.sex i.urban, or

you are using all the data. So the estimated coefficients differ across the regressions. The margins are just the predicted values based on these estimates. See the following for how to calculate the margins "by hand":

Code:

sysuse auto, clear
*SUBSAMPLE
regress mpg weight i.foreign if rep78==3
margins foreign if rep78==3, by(rep78) atmeans
di (_b[1.foreign]*1)+_b[_cons]+ (_b[weight]*3299)

*FULL SAMPLE
regress mpg weight i.foreign
margins foreign if rep78==3, by(rep78) atmeans
di (_b[1.foreign]*1)+_b[_cons]+ (_b[weight]*3299)

Res.:

Code:

. *SUBSAMPLE

.
. regress mpg weight i.foreign if rep78==3

      Source |       SS           df       MS      Number of obs   =        30
-------------+----------------------------------   F(2, 27)        =     27.92
       Model |  335.255272         2  167.627636   Prob > F        =    0.0000
    Residual |  162.111394        27  6.00412572   R-squared       =    0.6741
-------------+----------------------------------   Adj R-squared   =    0.6499
       Total |  497.366667        29  17.1505747   Root MSE        =    2.4503

------------------------------------------------------------------------------
         mpg | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      weight |  -.0051142   .0007429    -6.88   0.000    -.0066385   -.0035899
             |
     foreign |
    Foreign  |  -2.991368   1.831882    -1.63   0.114     -6.75008    .7673443
       _cons |   36.60429   2.600289    14.08   0.000     31.26893    41.93964
------------------------------------------------------------------------------

.
. margins foreign if rep78==3, by(rep78) atmeans

Adjusted predictions                                        Number of obs = 30
Model VCE: OLS

Expression: Linear prediction, predict()
Over:       rep78
At: weight    = 3299 (mean)
    0.foreign =   .9 (mean)
    1.foreign =   .1 (mean)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     foreign |
   Domestic  |   19.73247   .4834206    40.82   0.000     18.74057    20.72437
    Foreign  |    16.7411   1.708312     9.80   0.000     13.23594    20.24627
------------------------------------------------------------------------------

.
. di (_b[1.foreign]*1)+_b[_cons]+ (_b[weight]*3299)
16.741102

.
.
.
. *FULL SAMPLE

.
. regress mpg weight i.foreign

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     69.75
       Model |   1619.2877         2  809.643849   Prob > F        =    0.0000
    Residual |  824.171761        71   11.608053   R-squared       =    0.6627
-------------+----------------------------------   Adj R-squared   =    0.6532
       Total |  2443.45946        73  33.4720474   Root MSE        =    3.4071

------------------------------------------------------------------------------
         mpg | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      weight |  -.0065879   .0006371   -10.34   0.000    -.0078583   -.0053175
             |
     foreign |
    Foreign  |  -1.650029   1.075994    -1.53   0.130      -3.7955    .4954422
       _cons |    41.6797   2.165547    19.25   0.000     37.36172    45.99768
------------------------------------------------------------------------------

.
. margins foreign if rep78==3, by(rep78) atmeans

Adjusted predictions                                        Number of obs = 30
Model VCE: OLS

Expression: Linear prediction, predict()
Over:       rep78
At: weight    = 3299 (mean)
    0.foreign =   .9 (mean)
    1.foreign =   .1 (mean)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     foreign |
   Domestic  |   19.94627   .4726151    42.20   0.000      19.0039    20.88863
    Foreign  |   18.29624   .9591353    19.08   0.000     16.38377     20.2087
------------------------------------------------------------------------------

.
. di (_b[1.foreign]*1)+_b[_cons]+ (_b[weight]*3299)
18.296236

.

Notice that the only differences in the calculation of the estimated margins are the values of the coefficients.
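
For reference, the hard-coded 3299 in the display commands is the mean of weight among cars with rep78==3. A minimal sketch that computes that mean instead of typing it in; the scalar names mean_w and byhand are illustrative only and not part of the original code:

```stata
sysuse auto, clear
regress mpg weight i.foreign

* mean of weight in the subgroup that margins evaluates at (rep78==3)
quietly summarize weight if rep78==3 & e(sample)
scalar mean_w = r(mean)

* linear prediction for a foreign car at that mean weight
scalar byhand = _b[_cons] + _b[1.foreign] + _b[weight]*mean_w
display byhand
```

This should reproduce the Foreign margin from the full-sample fit. Adding the same if rep78==3 restriction to the regress line should give the subsample figure instead.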

"How do I know which one I should be using?"

You probably want to use the coefficients estimated from the full sample.

Last edited by Andrew Musau; 05 Sep 2023, 06:55.

Angeline Wong

05 Sep 2023, 09:59

I see, thank you for the explanation. Each year in my data is a separate cross-section, so I'm thinking it would make more sense to restrict the sample to each individual year when estimating margins, rather than pooling the years, right?

Some further questions regarding what you've posted, Andrew: what is the difference between the default option (predict) and the atmeans option that you're using here? I assume that atmeans would not be the option to go for since I'm using a logistic regression?

Also, if I want to assess the trends of the years for each region, I read on the guide that I can use the contrast option, but what is the method used by Stata to assess the trend?

Code:

. margins i.rep78, by(year) contrast

Contrasts of predictive margins                             Number of obs = 59
Model VCE: OIM

Expression: Predicted mean, predict()
Over:       year

------------------------------------------------
             |         df        chi2     P>chi2
-------------+----------------------------------
  rep78@year |
       2005  |          2       16.24     0.0003
       2006  |          2        9.37     0.0092
       2008  |          2       10.76     0.0046
       2009  |          2       14.06     0.0009
       2010  |          2       18.17     0.0001
       2011  |          2       14.04     0.0009
       2012  |          2       18.16     0.0001
       2013  |          2        9.22     0.0100
       2014  |          2       10.91     0.0043
       2015  |          2        9.00     0.0111
       2016  |          2       12.16     0.0023
       2017  |          2        6.59     0.0370
       2018  |          2        4.96     0.0836
       2020  |          2        2.82     0.2446
       2021  |          2        2.99     0.2246
       2022  |          2        2.41     0.2998
       2023  |          2        3.89     0.1431
      Joint  |          5       18.16     0.0028
------------------------------------------------

Sorry for the mess in the copied output earlier, I didn't realize that I could use the code option.

Andrew Musau

05 Sep 2023, 23:30

For the regression, use the full sample. For the margins, you may compute these by year if such comparisons are of interest.

"What is the difference between the default option (predict) and the atmeans that you're using here? I assume that atmeans would not be the option to go for since I'm using a logistic regression?"

Richard Williams illustrates the calculations here and outlines the merits and demerits of each.
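
As a rough illustration of the distinction (a sketch only, not a substitute for the material referred to above): by default, margins averages the predicted probabilities over the observed covariate values, whereas atmeans evaluates a single prediction with the other covariates held at their means. In a nonlinear model such as logit the two generally differ. The by-hand steps and all scalar and matrix names below are illustrative assumptions and do not appear in the original posts.

```stata
webuse lbw, clear
logit low age smoke i.race

* default: average adjusted predictions (mean of predicted probabilities)
margins race
matrix aap = r(b)

* atmeans: one prediction with age and smoke held at their sample means
margins race, atmeans
matrix apm = r(b)

* by hand, atmeans: plug the covariate means into the linear predictor
quietly summarize age if e(sample)
scalar m_age = r(mean)
quietly summarize smoke if e(sample)
scalar m_smoke = r(mean)
scalar apm_black = invlogit(_b[_cons] + _b[age]*m_age + _b[smoke]*m_smoke + _b[2.race])

* by hand, default: treat everyone as race==2, predict, then average
preserve
replace race = 2
predict double p_black, pr
quietly summarize p_black
scalar aap_black = r(mean)
restore

display "average adjusted prediction, black: " aap_black
display "adjusted prediction at means, black: " apm_black
```

The two displayed values should match the black rows of the corresponding margins tables.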

"Also, if I want to assess the trends of the years for each region, I read on the guide that I can use the contrast option, but what is the method used by Stata to assess the trend?"

If I understand you correctly, you are asking which test Stata is performing. It is just a standard Wald test, which you can replicate using the test command.

Code:

webuse lbw, clear
logit low age smoke i.race
margins race, contrast

*USING TEST
margins race, post
test (1.race=2.race) (1.race=3.race)

Res.:

Code:

. margins race, contrast

Contrasts of predictive margins                            Number of obs = 189
Model VCE: OIM

Expression: Pr(low), predict()

------------------------------------------------
             |         df        chi2     P>chi2
-------------+----------------------------------
        race |          2        9.07     0.0107
------------------------------------------------

. 
. 
. 
. *USING TEST

. 
. margins race, post

Predictive margins                                         Number of obs = 189
Model VCE: OIM

Expression: Pr(low), predict()

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        race |
      white  |   .2151541   .0405659     5.30   0.000     .1356464    .2946619
      black  |   .4129277   .0922105     4.48   0.000     .2321984     .593657
      other  |   .4231076   .0616934     6.86   0.000     .3021908    .5440244
------------------------------------------------------------------------------

. 
. test (1.race=2.race) (1.race=3.race)

 ( 1)  1bn.race - 2.race = 0
 ( 2)  1bn.race - 3.race = 0

           chi2(  2) =    9.07
         Prob > chi2 =    0.0107
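
To make the Wald test less of a black box, the statistic can also be rebuilt from the posted margins b and their covariance matrix V as W = (Rb)' (RVR')^(-1) (Rb), where each row of R encodes one restriction. A minimal sketch; the matrix and scalar names (R, d, W, chi2_test) are illustrative choices and not part of the original code:

```stata
webuse lbw, clear
logit low age smoke i.race
margins race, post

* reference result from the built-in Wald test
test (1.race=2.race) (1.race=3.race)
scalar chi2_test = r(chi2)

* the same statistic from first principles
* b is the 1 x 3 row vector of margins, V their 3 x 3 covariance matrix
matrix b = e(b)
matrix V = e(V)
* rows of R: white minus black, white minus other
matrix R = (1, -1, 0 \ 1, 0, -1)
matrix d = R*b'
matrix W = d' * inv(R*V*R') * d

display "chi2(2) = " W[1,1]
display "Prob > chi2 = " chi2tail(2, W[1,1])
```

The displayed values should agree with the output of test, and hence with margins race, contrast.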
