Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • marginal means from binreg and CI

    I have two questions about calculating marginal means after binreg. Here's a toy example:

    Code:
    . clear
    
    . input t tries num_correct
    
                  t       tries  num_corr~t
      1. 1 2     1
      2. 2 1 1
      3. 3 3 1
      4. 4 1     1
      5. 5 1 0
      6. end
    
    . tsset t
    
    Time variable: t, 1 to 5
            Delta: 1 unit
    
    . // nwest gallant anderson kernels
    . binreg num_correct c.t, n(tries) vce(hac nwest 1)
    
    Iteration 1:   deviance =  4.203517
    Iteration 2:   deviance =  4.198491
    Iteration 3:   deviance =  4.198491
    
    Generalized linear models                         Number of obs   =          5
    Optimization     : MQL Fisher scoring             Residual df     =          3
                       (IRLS EIM)                     Scale parameter =          1
    Deviance         =    4.1984907                   (1/df) Deviance =   1.399497
    Pearson          =  3.170055626                   (1/df) Pearson  =   1.056685
    
    Variance function: V(u) = u*(1-u/tries)           [Binomial]
    Link function    : g(u) = ln(u/(tries-u))         [Logit]
    
    HAC kernel (lags): Newey–West (1)                 BIC             =   -.629823
    
    ------------------------------------------------------------------------------
                 |                 HAC
     num_correct | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
               t |  -.3041968    .267244    -1.14   0.255    -.8279855    .2195918
           _cons |   .8359291   .6811638     1.23   0.220    -.4991274    2.170986
    ------------------------------------------------------------------------------
    
    . preserve
    
    .         replace tries = 1
    (2 real changes made)
    
    .         margins, at(t=(6(1)10))
    
    Adjusted predictions                                         Number of obs = 5
    Model VCE: HAC Newey–West 1
    
    Expression: Predicted mean num_correct, predict()
    1._at: t =  6
    2._at: t =  7
    3._at: t =  8
    4._at: t =  9
    5._at: t = 10
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             _at |
              1  |   .2710599   .1909681     1.42   0.156    -.1032308    .6453506
              2  |   .2152697   .2077495     1.04   0.300    -.1919119    .6224513
              3  |    .168311   .2092049     0.80   0.421     -.241723     .578345
              4  |   .1299001   .1989235     0.65   0.514    -.2599829    .5197831
              5  |   .0992095     .18105     0.55   0.584     -.255642     .454061
    ------------------------------------------------------------------------------
    
    .        
    .         quietly margins, at(t=(6(1)10)) predict(xb)
    
    .         transform_margins logistic(@)
    ----------------------------------------------
                 |         b         ll         ul
    -------------+--------------------------------
             _at |
              1  |  .2710599    .052972   .7119886
              2  |  .2152697   .0240371   .7534184
              3  |   .168311   .0106992   .7910949
              4  |  .1299001   .0047194   .8245746
              5  |  .0992095   .0020728   .8537953
    ----------------------------------------------
    
    .        
    .         replace tries = 2
    (5 real changes made)
    
    .         margins, at(t=(6(1)10))
    
    Adjusted predictions                                         Number of obs = 5
    Model VCE: HAC Newey–West 1
    
    Expression: Predicted mean num_correct, predict()
    1._at: t =  6
    2._at: t =  7
    3._at: t =  8
    4._at: t =  9
    5._at: t = 10
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             _at |
              1  |   .5421198   .3819363     1.42   0.156    -.2064615    1.290701
              2  |   .4305394   .4154991     1.04   0.300    -.3838238    1.244903
              3  |   .3366219   .4184097     0.80   0.421    -.4834461     1.15669
              4  |   .2598002   .3978471     0.65   0.514    -.5199657    1.039566
              5  |    .198419      .3621     0.55   0.584    -.5112839    .9081219
    ------------------------------------------------------------------------------
    
    .         quietly margins, at(t=(6(1)10)) predict(xb)
    
    .         transform_margins logistic(@)
    ----------------------------------------------
                 |         b         ll         ul
    -------------+--------------------------------
             _at |
              1  |  .2710599    .052972   .7119886
              2  |  .2152697   .0240371   .7534184
              3  |   .168311   .0106992   .7910949
              4  |  .1299001   .0047194   .8245746
              5  |  .0992095   .0020728   .8537953
    ----------------------------------------------
    
    . restore
    My two questions are:

    1) Is there a better way to calculate marginal means for various values of trials than using replace?
    2) Since the delta method CIs include impossible values, I tried to use Jeff Pitblado's transform_margins command. The MMs match, but I cannot get it to work for any value of trials other than 1. The last set of margins looks just like the penultimate one, whereas the two delta method ones behave as I would expect: second is the first times 2.

    I am using Stata 17 and version 1.0.0 31jul2015 of transform_margins. Stata code is

    Code:
    clear
    input t tries num_correct
    1 2    1
    2 1 1
    3 3 1
    4 1    1
    5 1 0
    end
    tsset t
    // nwest gallant anderson kernels
    binreg num_correct c.t, n(tries) vce(hac nwest 1)
    preserve
        replace tries = 1
        margins, at(t=(6(1)10))
        
        quietly margins, at(t=(6(1)10)) predict(xb)
        transform_margins logistic(@)
        
        replace tries = 2
        margins, at(t=(6(1)10))    
        quietly margins, at(t=(6(1)10)) predict(xb)
        transform_margins logistic(@)
    restore
    Last edited by Dimitriy V. Masterov; 17 Jul 2024, 23:33. Reason: binreg

  • #2
    Originally posted by Dimitriy V. Masterov View Post
    2) Since the delta method CIs include impossible values, I tried to use Jeff Pitblado's transform_margins command. The MMs match, but I cannot get it to work for any value of trials other than 1. The last set of margins looks just like the penultimate one, whereas the two delta method ones behave as I would expect: second is the first times 2.
    The linear predictions [option -predict(xb)-] do not depend on the values of the variable "tries", as it is not one of the covariates in the model. You can see this by deleting the quietly command in your code. Jeff's command picks up the estimates from the last margins command that was executed.


    1) Is there a better way to calculate marginal means for various values of trials than using replace?
    Use the -over()- option of margins.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(t tries num_correct)
    1 2 1
    2 1 1
    3 3 1
    4 1 1
    5 1 0
    end
    
    tsset t
    *nwest gallant anderson kernels
    binreg num_correct c.t, n(tries) vce(hac nwest 1)
    margins, at(t=(6(1)10)) over(tries) noatlegend
    Res.:

    Code:
    . margins, at(t=(6(1)10)) over(tries) noatlegend
    
    Adjusted predictions                                         Number of obs = 5
    Model VCE: HAC Newey–West 1
    
    Expression: Predicted mean num_correct, predict()
    Over:       tries
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
       _at#tries |
            1 1  |   .2710599   .1909681     1.42   0.156    -.1032308    .6453506
            1 2  |   .5421198   .3819363     1.42   0.156    -.2064615    1.290701
            1 3  |   .8131797   .5729044     1.42   0.156    -.3096923    1.936052
            2 1  |   .2152697   .2077495     1.04   0.300    -.1919119    .6224513
            2 2  |   .4305394   .4154991     1.04   0.300    -.3838238    1.244903
            2 3  |   .6458091   .6232486     1.04   0.300    -.5757358    1.867354
            3 1  |    .168311   .2092049     0.80   0.421     -.241723     .578345
            3 2  |   .3366219   .4184097     0.80   0.421    -.4834461     1.15669
            3 3  |   .5049329   .6276146     0.80   0.421    -.7251691    1.735035
            4 1  |   .1299001   .1989235     0.65   0.514    -.2599829    .5197831
            4 2  |   .2598002   .3978471     0.65   0.514    -.5199657    1.039566
            4 3  |   .3897004   .5967706     0.65   0.514    -.7799486    1.559349
            5 1  |   .0992095     .18105     0.55   0.584     -.255642     .454061
            5 2  |    .198419      .3621     0.55   0.584    -.5112839    .9081219
            5 3  |   .2976285     .54315     0.55   0.584    -.7669259    1.362183
    ------------------------------------------------------------------------------
    
    .

    Comment


    • #3
      Thanks again, Andrew Musau! The over() approach is clever. It does get tricky when you need to integrate over the remaining covariates (which I don't have here) or extrapolate to unobserved values.

      Here's what I wound up doing. To fix (2), I used
      Code:
       
       transform_margins logistic(@)*x
      to get the expected count with x tries.

      The output from the three approaches looks like this:
      Click image for larger version

Name:	Screen Shot 2024-07-18 at 5.54.30 PM.png
Views:	1
Size:	125.0 KB
ID:	1759186


      The transform_margins with rescaling seems to give the best CIs, but all three give the same predictions.

      Here's my code:

      Code:
      cls
      clear
      
      // Install ado-files
      capture net install transform_margins, from("http://www.stata.com/users/jpitblado")
      capture ssc install combomarginsplot
      net install grc1leg,from("http://www.stata.com/users/vwiggins/")
      
      // Input the data
      input t tries num_correct
      1 2    1
      2 1 1
      3 3 1
      4 1    1
      5 1 0
      end
      
      // Set the data as time series, with 't' as the time variable
      tsset t
      
      // Run a binomial regression with HAC (Heteroskedasticity and Autocorrelation Consistent) standard errors
      // Note: Using Newey-West kernel with 1 lag, which specifies the maximum lag to be considered in the autocorrelation structure.
      //          Could use also use gallant or anderson kernels (not sure if one is better with small samples)
      binreg num_correct c.t, n(tries) vce(hac nwest 1) nolog
      margins, at(t=(6(1)10)) over(tries) saving(M, replace)
      combomarginsplot M, name(g1, replace) nodraw title("margins, over()", nospan) offset(.1) ytitle("Number Correct") 
      
      // Loop over Tries
      forvalues k = 1(1)3 {
          replace tries = `k'    
          di "`k' Tries"
          
          /* (A) Get predictions with delta-method CIs (which could be wonky) */
          // Calculate predicted num_correct for t=6 to t=10 with tries = k
          margins, at(t=(6(1)10)) saving(m`k', replace)
          
          /* (B) Get the same predictions with Realistic CIs */
          // Calculate linear predicted values for t=6 to t=10 with tries = k
          quietly margins, at(t=(6(1)10)) predict(xb)
          capture matrix drop n`k'
          transform_margins logistic(@)*`k', matrix(n`k')
          
          svmat n`k', names(col)
          rename (b ll ul) =`k'
      }
          
      
      combomarginsplot m1 m2 m3, ytitle("") name(g2, replace) nodraw title("margins+replace", nospan) offset(.1) 
      
      gen t1 = _n + 4.9
      gen t2 = _n + 5
      gen t3 = _n + 5.1
      
      tw ///
      (connected b1 t1) ///
      (connected b2 t2) ///
      (connected b3 t3) ///
      (rcap ll1 ul1 t1, lcolor(tt_blue)) ///
      (rcap ll2 ul2 t2, lcolor(tt_yellow)) ///
      (rcap ll3 ul3 t3, lcolor(tt_green)) ///
      , ytitle("") name(g3, replace) xtitle("t")  ///
      title("transform_margins+rescaling", size(*.85) nospan) nodraw
      
      grc1leg g1 g2 g3, legendfrom(g1) ycommon xcommon name(g3, replace) span rows(1) // altshrink

      Comment

      Working...
      X