marginal means from binreg and CI

Dimitriy V. Masterov

Join Date: Mar 2014
Posts: 609

marginal means from binreg and CI

17 Jul 2024, 23:32

I have two questions about calculating marginal means after binreg. Here's a toy example:

Code:

. clear

. input t tries num_correct

              t       tries  num_corr~t
  1. 1 2     1
  2. 2 1 1
  3. 3 3 1
  4. 4 1     1
  5. 5 1 0
  6. end

. tsset t

Time variable: t, 1 to 5
        Delta: 1 unit

. // nwest gallant anderson kernels
. binreg num_correct c.t, n(tries) vce(hac nwest 1)

Iteration 1:   deviance =  4.203517
Iteration 2:   deviance =  4.198491
Iteration 3:   deviance =  4.198491

Generalized linear models                         Number of obs   =          5
Optimization     : MQL Fisher scoring             Residual df     =          3
                   (IRLS EIM)                     Scale parameter =          1
Deviance         =    4.1984907                   (1/df) Deviance =   1.399497
Pearson          =  3.170055626                   (1/df) Pearson  =   1.056685

Variance function: V(u) = u*(1-u/tries)           [Binomial]
Link function    : g(u) = ln(u/(tries-u))         [Logit]

HAC kernel (lags): Newey–West (1)                 BIC             =   -.629823

------------------------------------------------------------------------------
             |                 HAC
 num_correct | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
           t |  -.3041968    .267244    -1.14   0.255    -.8279855    .2195918
       _cons |   .8359291   .6811638     1.23   0.220    -.4991274    2.170986
------------------------------------------------------------------------------

. preserve

.         replace tries = 1
(2 real changes made)

.         margins, at(t=(6(1)10))

Adjusted predictions                                         Number of obs = 5
Model VCE: HAC Newey–West 1

Expression: Predicted mean num_correct, predict()
1._at: t =  6
2._at: t =  7
3._at: t =  8
4._at: t =  9
5._at: t = 10

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   .2710599   .1909681     1.42   0.156    -.1032308    .6453506
          2  |   .2152697   .2077495     1.04   0.300    -.1919119    .6224513
          3  |    .168311   .2092049     0.80   0.421     -.241723     .578345
          4  |   .1299001   .1989235     0.65   0.514    -.2599829    .5197831
          5  |   .0992095     .18105     0.55   0.584     -.255642     .454061
------------------------------------------------------------------------------

.        
.         quietly margins, at(t=(6(1)10)) predict(xb)

.         transform_margins logistic(@)
----------------------------------------------
             |         b         ll         ul
-------------+--------------------------------
         _at |
          1  |  .2710599    .052972   .7119886
          2  |  .2152697   .0240371   .7534184
          3  |   .168311   .0106992   .7910949
          4  |  .1299001   .0047194   .8245746
          5  |  .0992095   .0020728   .8537953
----------------------------------------------

.        
.         replace tries = 2
(5 real changes made)

.         margins, at(t=(6(1)10))

Adjusted predictions                                         Number of obs = 5
Model VCE: HAC Newey–West 1

Expression: Predicted mean num_correct, predict()
1._at: t =  6
2._at: t =  7
3._at: t =  8
4._at: t =  9
5._at: t = 10

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   .5421198   .3819363     1.42   0.156    -.2064615    1.290701
          2  |   .4305394   .4154991     1.04   0.300    -.3838238    1.244903
          3  |   .3366219   .4184097     0.80   0.421    -.4834461     1.15669
          4  |   .2598002   .3978471     0.65   0.514    -.5199657    1.039566
          5  |    .198419      .3621     0.55   0.584    -.5112839    .9081219
------------------------------------------------------------------------------

.         quietly margins, at(t=(6(1)10)) predict(xb)

.         transform_margins logistic(@)
----------------------------------------------
             |         b         ll         ul
-------------+--------------------------------
         _at |
          1  |  .2710599    .052972   .7119886
          2  |  .2152697   .0240371   .7534184
          3  |   .168311   .0106992   .7910949
          4  |  .1299001   .0047194   .8245746
          5  |  .0992095   .0020728   .8537953
----------------------------------------------

. restore

My two questions are:

1) Is there a better way to calculate marginal means for various values of trials than using replace?
2) Since the delta method CIs include impossible values, I tried to use Jeff Pitblado's transform_margins command. The MMs match, but I cannot get it to work for any value of trials other than 1. The last set of margins looks just like the penultimate one, whereas the two delta method ones behave as I would expect: second is the first times 2.

I am using Stata 17 and version 1.0.0 31jul2015 of transform_margins. Stata code is

Code:

clear
input t tries num_correct
1 2    1
2 1 1
3 3 1
4 1    1
5 1 0
end
tsset t
// nwest gallant anderson kernels
binreg num_correct c.t, n(tries) vce(hac nwest 1)
preserve
    replace tries = 1
    margins, at(t=(6(1)10))
    
    quietly margins, at(t=(6(1)10)) predict(xb)
    transform_margins logistic(@)
    
    replace tries = 2
    margins, at(t=(6(1)10))    
    quietly margins, at(t=(6(1)10)) predict(xb)
    transform_margins logistic(@)
restore

Last edited by Dimitriy V. Masterov; 17 Jul 2024, 23:33. Reason: binreg

Tags: binreg, margins

Andrew Musau

Join Date: Oct 2014
Posts: 10180

18 Jul 2024, 05:37

Originally posted by Dimitriy V. Masterov View Post

2) Since the delta method CIs include impossible values, I tried to use Jeff Pitblado's transform_margins command. The MMs match, but I cannot get it to work for any value of trials other than 1. The last set of margins looks just like the penultimate one, whereas the two delta method ones behave as I would expect: second is the first times 2.

The linear predictions [option -predict(xb)-] do not depend on the values of the variable "tries", as it is not one of the covariates in the model. You can see this by deleting the quietly command in your code. Jeff's command picks up the estimates from the last margins command that was executed.

1) Is there a better way to calculate marginal means for various values of trials than using replace?

Use the -over()- option of margins.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(t tries num_correct)
1 2 1
2 1 1
3 3 1
4 1 1
5 1 0
end

tsset t
*nwest gallant anderson kernels
binreg num_correct c.t, n(tries) vce(hac nwest 1)
margins, at(t=(6(1)10)) over(tries) noatlegend

Res.:

Code:

. margins, at(t=(6(1)10)) over(tries) noatlegend

Adjusted predictions                                         Number of obs = 5
Model VCE: HAC Newey–West 1

Expression: Predicted mean num_correct, predict()
Over:       tries

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
   _at#tries |
        1 1  |   .2710599   .1909681     1.42   0.156    -.1032308    .6453506
        1 2  |   .5421198   .3819363     1.42   0.156    -.2064615    1.290701
        1 3  |   .8131797   .5729044     1.42   0.156    -.3096923    1.936052
        2 1  |   .2152697   .2077495     1.04   0.300    -.1919119    .6224513
        2 2  |   .4305394   .4154991     1.04   0.300    -.3838238    1.244903
        2 3  |   .6458091   .6232486     1.04   0.300    -.5757358    1.867354
        3 1  |    .168311   .2092049     0.80   0.421     -.241723     .578345
        3 2  |   .3366219   .4184097     0.80   0.421    -.4834461     1.15669
        3 3  |   .5049329   .6276146     0.80   0.421    -.7251691    1.735035
        4 1  |   .1299001   .1989235     0.65   0.514    -.2599829    .5197831
        4 2  |   .2598002   .3978471     0.65   0.514    -.5199657    1.039566
        4 3  |   .3897004   .5967706     0.65   0.514    -.7799486    1.559349
        5 1  |   .0992095     .18105     0.55   0.584     -.255642     .454061
        5 2  |    .198419      .3621     0.55   0.584    -.5112839    .9081219
        5 3  |   .2976285     .54315     0.55   0.584    -.7669259    1.362183
------------------------------------------------------------------------------

.

Comment

Dimitriy V. Masterov

Join Date: Mar 2014
Posts: 609

18 Jul 2024, 18:55

Thanks again, Andrew Musau! The over() approach is clever. It does get tricky when you need to integrate over the remaining covariates (which I don't have here) or extrapolate to unobserved values.

Here's what I wound up doing. To fix (2), I used

Code:

 
 transform_margins logistic(@)*x

to get the expected count with x tries.

The output from the three approaches looks like this:

Click image for larger version

Name: Screen Shot 2024-07-18 at 5.54.30 PM.png
Views: 1
Size: 125.0 KB
ID: 1759186

The transform_margins with rescaling seems to give the best CIs, but all three give the same predictions.

Here's my code:

Code:

cls
clear

// Install ado-files
capture net install transform_margins, from("http://www.stata.com/users/jpitblado")
capture ssc install combomarginsplot
net install grc1leg,from("http://www.stata.com/users/vwiggins/")

// Input the data
input t tries num_correct
1 2    1
2 1 1
3 3 1
4 1    1
5 1 0
end

// Set the data as time series, with 't' as the time variable
tsset t

// Run a binomial regression with HAC (Heteroskedasticity and Autocorrelation Consistent) standard errors
// Note: Using Newey-West kernel with 1 lag, which specifies the maximum lag to be considered in the autocorrelation structure.
//          Could use also use gallant or anderson kernels (not sure if one is better with small samples)
binreg num_correct c.t, n(tries) vce(hac nwest 1) nolog
margins, at(t=(6(1)10)) over(tries) saving(M, replace)
combomarginsplot M, name(g1, replace) nodraw title("margins, over()", nospan) offset(.1) ytitle("Number Correct") 

// Loop over Tries
forvalues k = 1(1)3 {
    replace tries = `k'    
    di "`k' Tries"
    
    /* (A) Get predictions with delta-method CIs (which could be wonky) */
    // Calculate predicted num_correct for t=6 to t=10 with tries = k
    margins, at(t=(6(1)10)) saving(m`k', replace)
    
    /* (B) Get the same predictions with Realistic CIs */
    // Calculate linear predicted values for t=6 to t=10 with tries = k
    quietly margins, at(t=(6(1)10)) predict(xb)
    capture matrix drop n`k'
    transform_margins logistic(@)*`k', matrix(n`k')
    
    svmat n`k', names(col)
    rename (b ll ul) =`k'
}
    

combomarginsplot m1 m2 m3, ytitle("") name(g2, replace) nodraw title("margins+replace", nospan) offset(.1) 

gen t1 = _n + 4.9
gen t2 = _n + 5
gen t3 = _n + 5.1

tw ///
(connected b1 t1) ///
(connected b2 t2) ///
(connected b3 t3) ///
(rcap ll1 ul1 t1, lcolor(tt_blue)) ///
(rcap ll2 ul2 t2, lcolor(tt_yellow)) ///
(rcap ll3 ul3 t3, lcolor(tt_green)) ///
, ytitle("") name(g3, replace) xtitle("t")  ///
title("transform_margins+rescaling", size(*.85) nospan) nodraw

grc1leg g1 g2 g3, legendfrom(g1) ycommon xcommon name(g3, replace) span rows(1) // altshrink

Announcement

marginal means from binreg and CI

Comment

Comment