
  • Why do Poisson and Negative Binomial Regressions yield the same result?

    I am using count data models because I have a count dependent variable taking the values 0, 1, 2, 3, 4.

    There is no overdispersion present, and no assumption of the Poisson model is violated. Both the Pearson and Deviance goodness-of-fit tests favor Poisson regression over NB regression.
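
    In Stata, these checks can be run along the following lines (a minimal sketch; depvar stands in for the actual outcome variable, and the dummies are assumed to come from a categorical dnewoccposition variable, since the full specification is not shown here):

    Code:
    . poisson depvar i.dnewoccposition
    . estat gof        // reports both the Deviance and Pearson goodness-of-fit tests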

    However, I was still interested in comparing the results of the two regressions to see whether anything changes. Surprisingly, the results are identical.

    My question is: why do the Poisson and negative binomial regressions give the same results? Is it because no assumption of the Poisson model is violated? What should be done in this case?

    Poisson results:

                         Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    dnewoccposition3  .0989237    .028555    3.46   0.001      .042957    .1548904
    dnewoccposition4  .1154875   .0467075    2.47   0.013     .0239424    .2070326
    dnewoccposition5  .1718835   .0493519    3.48   0.000     .0751555    .2686115

    NB results:

                         Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    dnewoccposition3  .0989237    .028555    3.46   0.001      .042957    .1548904
    dnewoccposition4  .1154875   .0467075    2.47   0.013     .0239424    .2070326
    dnewoccposition5  .1718835   .0493519    3.48   0.000     .0751555    .2686115

    Many thanks in advance!

    Best,
    Mehrzad

  • #2
    Mehrzad:
    Yes, it is because there is no violation of the Poisson requirements (basically, equidispersion).
    I can replicate the behavior in the following toy example:
    Code:
    . use http://www.stata-press.com/data/r16/airline.dta
    
    . poisson injuries i.airline
    
    Iteration 0:   log likelihood = -16.084082 
    Iteration 1:   log likelihood = -15.980494 
    Iteration 2:   log likelihood = -15.980112 
    Iteration 3:   log likelihood = -15.980112 
    
    Poisson regression                              Number of obs     =          9
                                                    LR chi2(8)        =      31.86
                                                    Prob > chi2       =     0.0001
    Log likelihood = -15.980112                     Pseudo R2         =     0.4992
    
    ------------------------------------------------------------------------------
        injuries |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         airline |
              2  |  -.4519851   .4834938    -0.93   0.350    -1.399616    .4956453
              3  |  -.4519851   .4834938    -0.93   0.350    -1.399616    .4956453
              4  |   .5465437   .3788676     1.44   0.149    -.1960232    1.289111
              5  |  -.2006707   .4494666    -0.45   0.655    -1.081609    .6802676
              6  |  -1.011601   .5838742    -1.73   0.083    -2.155973    .1327715
              7  |  -1.299283   .6513389    -1.99   0.046    -2.575884   -.0226821
              8  |  -2.397895   1.044466    -2.30   0.022    -4.445011   -.3507797
              9  |  -1.299283   .6513389    -1.99   0.046    -2.575884   -.0226821
                 |
           _cons |   2.397895   .3015113     7.95   0.000     1.806944    2.988847
    ------------------------------------------------------------------------------
    
    . nbreg injuries i.airline
    
    Fitting Poisson model:
    
    Iteration 0:   log likelihood = -16.084082 
    Iteration 1:   log likelihood = -15.980494 
    Iteration 2:   log likelihood = -15.980112 
    Iteration 3:   log likelihood = -15.980112 
    
    Fitting constant-only model:
    
    Iteration 0:   log likelihood = -27.260001 
    Iteration 1:   log likelihood = -26.029258 
    Iteration 2:   log likelihood =  -26.02905 
    Iteration 3:   log likelihood =  -26.02905 
    
    Fitting full model:
    
    Iteration 0:   log likelihood = -22.032816 
    Iteration 1:   log likelihood = -18.466959 
    Iteration 2:   log likelihood = -17.617989 
    Iteration 3:   log likelihood = -16.121217 
    Iteration 4:   log likelihood = -15.980749 
    Iteration 5:   log likelihood =  -15.98014 
    Iteration 6:   log likelihood = -15.980111 
    Iteration 7:   log likelihood = -15.980111  (not concave)
    Iteration 8:   log likelihood =  -15.98011 
    
    Negative binomial regression                    Number of obs     =          9
                                                    LR chi2(8)        =      20.10
    Dispersion     = mean                           Prob > chi2       =     0.0100
    Log likelihood =  -15.98011                     Pseudo R2         =     0.3861
    
    ------------------------------------------------------------------------------
        injuries |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         airline |
              2  |  -.4519853   .4834938    -0.93   0.350    -1.399616    .4956452
              3  |  -.4519853   .4834938    -0.93   0.350    -1.399616    .4956452
              4  |   .5464595   .3788735     1.44   0.149    -.1961189    1.289038
              5  |  -.2006707   .4494666    -0.45   0.655    -1.081609    .6802676
              6  |  -1.011601   .5838742    -1.73   0.083    -2.155973    .1327715
              7  |  -1.299283    .651339    -1.99   0.046    -2.575884   -.0226821
              8  |  -2.397895   1.044466    -2.30   0.022    -4.445011   -.3507797
              9  |  -1.299283    .651339    -1.99   0.046    -2.575884   -.0226821
                 |
           _cons |   2.397895   .3015114     7.95   0.000     1.806944    2.988847
    -------------+----------------------------------------------------------------
        /lnalpha |  -19.25422   1404.153                     -2771.344    2732.835
    -------------+----------------------------------------------------------------
           alpha |   4.35e-09   6.10e-06                             0           .
    ------------------------------------------------------------------------------
    LR test of alpha=0: chibar2(01) = 4.7e-06              Prob >= chibar2 = 0.499
    
    .
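
    The reason the two fits coincide can be seen from the NB2 variance function (a standard result, added here for clarity): Var(y|x) = mu + alpha*mu^2, which collapses to the Poisson variance Var(y|x) = mu as alpha -> 0. Since -nbreg- estimates alpha at about 4.35e-09, essentially zero, the coefficients, standard errors, and log likelihood all coincide with the Poisson fit, and the LR test of alpha=0 is not rejected.
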
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      To add to Carlo's helpful reply, let me emphasize that Poisson regression produces consistent estimators of the mean parameters under any variance/mean relationship. Use -glm- with the fam(poisson) option and you will be able to estimate the coefficient in the variance/mean relationship. I bet you'll find underdispersion. In this case the negative binomial, fortunately, converges to the Poisson estimates, but that is not guaranteed. In more complicated settings, such as panel data, I've seen -xtnbreg- hang up and fail to converge because the data are underdispersed. I've never seen this happen with the more robust -xtpoisson- estimator.

      You should rarely use -nbreg-. In fact, I can't think of a case where I'd use it over Poisson regression unless I'm really interested in computing probabilities. For estimating mean effects, Poisson is much preferred. And always use a robust variance matrix. In cases of underdispersion, it actually reduces the standard errors!
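
      To see the last claim heuristically (a standard quasi-MLE calculation, added for clarity): suppose the conditional variance is proportional to the mean, Var(y|x) = sigma^2*E(y|x) = sigma^2*mu. For the Poisson quasi-MLE the expected information is A = sum_i mu_i*x_i*x_i', while the outer product of the score is B = sum_i Var(y_i|x_i)*x_i*x_i' = sigma^2*A. The robust sandwich variance is then A^(-1)*B*A^(-1) = sigma^2*A^(-1), exactly sigma^2 times the nominal MLE variance A^(-1). Under underdispersion, sigma^2 < 1, so the robust standard errors shrink relative to the defaults.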

      Below is an example using a fertility data set (FERTIL2, distributed with my introductory econometrics book). The estimate of the dispersion parameter, based on the Pearson residuals, is 0.7859815 (the (1/df) Pearson statistic below). The -nbreg- estimates would not converge due to the underdispersion. You can see that the robust standard errors are smaller than the usual MLE standard errors.

      Code:
      . glm children age c.age#c.age educ electric urban, fam(poisson) vce(robust)
      
      Iteration 0:   log pseudolikelihood = -6700.6931  
      Iteration 1:   log pseudolikelihood = -6586.2189  
      Iteration 2:   log pseudolikelihood = -6585.9102  
      Iteration 3:   log pseudolikelihood = -6585.9102  
      
      Generalized linear models                         Number of obs   =      4,358
      Optimization     : ML                             Residual df     =      4,352
                                                        Scale parameter =          1
      Deviance         =  4086.463495                   (1/df) Deviance =   .9389852
      Pearson          =  3420.591324                   (1/df) Pearson  =   .7859815
      
      Variance function: V(u) = u                       [Poisson]
      Link function    : g(u) = ln(u)                   [Log]
      
                                                        AIC             =     3.0252
      Log pseudolikelihood = -6585.910155               BIC             =  -32382.29
      
      ------------------------------------------------------------------------------
                   |               Robust
          children |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               age |   .3665522   .0092319    39.70   0.000     .3484579    .3846465
                   |
       c.age#c.age |  -.0044698   .0001429   -31.28   0.000    -.0047499   -.0041897
                   |
              educ |  -.0259419   .0025496   -10.18   0.000     -.030939   -.0209449
          electric |  -.1512171   .0332707    -4.55   0.000    -.2164264   -.0860077
             urban |  -.0761091   .0206833    -3.68   0.000    -.1166476   -.0355707
             _cons |  -5.710309   .1461743   -39.07   0.000    -5.996805   -5.423812
      ------------------------------------------------------------------------------
      
      . glm children age c.age#c.age educ electric urban, fam(poisson)
      
      Iteration 0:   log likelihood = -6700.6931  
      Iteration 1:   log likelihood = -6586.2189  
      Iteration 2:   log likelihood = -6585.9102  
      Iteration 3:   log likelihood = -6585.9102  
      
      Generalized linear models                         Number of obs   =      4,358
      Optimization     : ML                             Residual df     =      4,352
                                                        Scale parameter =          1
      Deviance         =  4086.463495                   (1/df) Deviance =   .9389852
      Pearson          =  3420.591324                   (1/df) Pearson  =   .7859815
      
      Variance function: V(u) = u                       [Poisson]
      Link function    : g(u) = ln(u)                   [Log]
      
                                                        AIC             =     3.0252
      Log likelihood   = -6585.910155                   BIC             =  -32382.29
      
      ------------------------------------------------------------------------------
                   |                 OIM
          children |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               age |   .3665522   .0096188    38.11   0.000     .3476996    .3854047
                   |
       c.age#c.age |  -.0044698   .0001422   -31.44   0.000    -.0047484   -.0041911
                   |
              educ |  -.0259419   .0028118    -9.23   0.000    -.0314529    -.020431
          electric |  -.1512171   .0347648    -4.35   0.000    -.2193548   -.0830794
             urban |  -.0761091   .0215737    -3.53   0.000    -.1183927   -.0338255
             _cons |  -5.710309   .1596396   -35.77   0.000    -6.023197   -5.397421
      ------------------------------------------------------------------------------
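
      To make the dispersion estimate concrete, here is a small follow-up sketch (added for illustration, not part of the original post) that reproduces the (1/df) Pearson statistic by hand after either -glm- fit above; the residual degrees of freedom, 4,352, are copied from the output header:

      Code:
      . predict double muhat, mu                          // fitted conditional means
      . generate double pr2 = (children - muhat)^2/muhat  // squared Pearson residuals
      . quietly summarize pr2
      . display r(sum)/4352                               // should reproduce .7859815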




      • #4
        Dear Carlo and Jeff,

        Many thanks for the helpful and prompt replies; they helped me understand the issue much better.

        Best regards,
        Mehrzad



        • #5
          Dear Professor Jeff Wooldridge,

          Regarding your comment, "For estimating mean effects, Poisson is much preferred. And always use a robust variance matrix. In cases of underdispersion, it actually reduces the standard errors!": could you please explain mathematically how this comes about? That would be immensely helpful for understanding how vce(robust) affects the standard error estimates under a Poisson model.

          Many thanks in advance!
