
  • different results for negative binomial regression for dispersion(mean/constant)

    Hi all, when I run a negative binomial regression on my data, the results with the dispersion set to mean are completely different from the results with the dispersion set to constant. The constant-dispersion results are consistent with the literature, while the mean-dispersion results are not.
    Can anyone tell me how to choose the right dispersion, and by what rules?
    And why do the results differ?

    Another question: my dependent variable is a count, while my independent variables are ratios. Is negative binomial regression the right choice here, or should I use multiple regression instead?

  • #2
    You'll have a much better chance of getting a useful response if you show us output. For example, "totally different" can mean different things to different people. If I see the results I can probably make some suggestions.

    JW



    • #3
      Thanks a lot for your reply. Below is the output for both the mean- and constant-dispersion negative binomial regressions.

      [Attached image: mean.PNG (nbreg output, dispersion(mean))]

      [Attached image: cons.PNG (nbreg output, dispersion(constant))]



      • #4
        Hello Ess,


        With regard to NB regression, the NB2 (also called "quadratic" or "traditional") is, as the last name implies, by far the most widely used NB regression, whereas the NB1 (or "linear") tends to be used less often.

        The main difference relates to the dispersion parameter and the dispersion function: in the NB2 the variance is quadratic in the mean (mean + alpha*mean^2), whereas in the NB1 it is linear in the mean (mean + delta*mean).


        To select between the two models (apart from the fact that the NB2 is, well, "traditional"), you may use the AIC and BIC, as well as the results of - linktest -.


        Below is a toy example, taken from the Stata manual.

        Code:
        . webuse rod93
        
        . generate logexp=ln(exposure)
        
        . list in 1/5
        
             +----------------------------------------------------+
             |    cohort   age_mos   deaths   exposure     logexp |
             |----------------------------------------------------|
          1. | 1941-1949       0.5      168      278.4   5.629059 |
          2. | 1941-1949       2.0       48      538.8   6.289344 |
          3. | 1941-1949       4.5       63      794.4   6.677587 |
          4. | 1941-1949       9.0       89    1,550.8   7.346526 |
          5. | 1941-1949      18.0      102    3,006.0   8.008366 |
             +----------------------------------------------------+
        
        . nbreg deaths i.cohort, offset(logexp) nolog vsquish
        
        Negative binomial regression                    Number of obs     =         21
                                                        LR chi2(2)        =       0.40
        Dispersion     = mean                           Prob > chi2       =     0.8171
        Log likelihood =  -131.3799                     Pseudo R2         =     0.0015
        
        ------------------------------------------------------------------------------
              deaths |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
              cohort |
          1960-1967  |  -.2676187   .7237203    -0.37   0.712    -1.686084    1.150847
          1968-1976  |  -.4573957   .7236651    -0.63   0.527    -1.875753    .9609618
               _cons |  -2.086731    .511856    -4.08   0.000     -3.08995   -1.083511
              logexp |          1  (offset)
        -------------+----------------------------------------------------------------
            /lnalpha |   .5939963   .2583615                      .0876171    1.100376
        -------------+----------------------------------------------------------------
               alpha |   1.811212   .4679475                       1.09157    3.005295
        ------------------------------------------------------------------------------
        LR test of alpha=0: chibar2(01) = 4056.27              Prob >= chibar2 = 0.000
        
        . estat ic
        
        Akaike's information criterion and Bayesian information criterion
        
        -----------------------------------------------------------------------------
               Model |        Obs  ll(null)  ll(model)      df         AIC        BIC
        -------------+---------------------------------------------------------------
                   . |         21 -131.5819  -131.3799       4    270.7598   274.9379
        -----------------------------------------------------------------------------
                       Note: N=Obs used in calculating BIC; see [R] BIC note.
        
        . linktest, nolog vsquish
        
        Negative binomial regression                    Number of obs     =         21
                                                        LR chi2(2)        =       6.52
        Dispersion     = mean                           Prob > chi2       =     0.0383
        Log likelihood = -105.29768                     Pseudo R2         =     0.0301
        
        ------------------------------------------------------------------------------
              deaths |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                _hat |  -1.204959   .8014458    -1.50   0.133    -2.775764    .3658458
              _hatsq |   .0932243   .0737771     1.26   0.206    -.0513761    .2378247
               _cons |   8.041267   2.090668     3.85   0.000     3.943633     12.1389
        -------------+----------------------------------------------------------------
            /lnalpha |  -1.516523   .3238047                     -2.151169   -.8818777
        -------------+----------------------------------------------------------------
               alpha |   .2194736   .0710666                      .1163481    .4140048
        ------------------------------------------------------------------------------
        LR test of alpha=0: chibar2(01) = 252.25               Prob >= chibar2 = 0.000
        
        . nbreg deaths i.cohort, offset(logexp) dispersion(constant) nolog vsquish
        
        Negative binomial regression                    Number of obs     =         21
                                                        LR chi2(2)        =       1.16
        Dispersion     = constant                       Prob > chi2       =     0.5598
        Log likelihood = -139.66914                     Pseudo R2         =     0.0041
        
        ------------------------------------------------------------------------------
              deaths |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
              cohort |
          1960-1967  |  -.3180474   .4372491    -0.73   0.467     -1.17504    .5389452
          1968-1976  |   .1368621   .4420315     0.31   0.757    -.7295038    1.003228
               _cons |  -3.914143   .3613023   -10.83   0.000    -4.622282   -3.206003
              logexp |          1  (offset)
        -------------+----------------------------------------------------------------
            /lndelta |   4.741895   .3590761                      4.038118    5.445671
        -------------+----------------------------------------------------------------
               delta |   114.6512   41.16851                      56.71952    231.7527
        ------------------------------------------------------------------------------
        LR test of delta=0: chibar2(01) = 4039.69              Prob >= chibar2 = 0.000
        
        . estat ic
        
        Akaike's information criterion and Bayesian information criterion
        
        -----------------------------------------------------------------------------
               Model |        Obs  ll(null)  ll(model)      df         AIC        BIC
        -------------+---------------------------------------------------------------
                   . |         21 -140.2493  -139.6691       4    287.3383   291.5164
        -----------------------------------------------------------------------------
                       Note: N=Obs used in calculating BIC; see [R] BIC note.
        
        . linktest, nolog vsquish
        
        Negative binomial regression                    Number of obs     =         21
                                                        LR chi2(2)        =       6.05
        Dispersion     = mean                           Prob > chi2       =     0.0487
        Log likelihood = -105.53742                     Pseudo R2         =     0.0278
        
        ------------------------------------------------------------------------------
              deaths |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                _hat |  -.6493917   .5734559    -1.13   0.257    -1.773345    .4745613
              _hatsq |   .0601398    .077084     0.78   0.435    -.0909422    .2112217
               _cons |   5.895926   .9812203     6.01   0.000      3.97277    7.819083
        -------------+----------------------------------------------------------------
            /lnalpha |  -1.491153   .3221859                     -2.122626   -.8596807
        -------------+----------------------------------------------------------------
               alpha |   .2251128   .0725282                      .1197168    .4232972
        ------------------------------------------------------------------------------
        LR test of alpha=0: chibar2(01) = 277.68               Prob >= chibar2 = 0.000
        As you can see:

        The coefficients differ between the two models, but the statistical significance is similar.

        Both the AIC and BIC favour the NB2 model.

        The - linktest - didn't help much in this specific case, but keep an eye on it, because in general it is very useful.


        To end, a final caveat: this is just a toy example. Beware that NB regression usually won't "behave" well in small samples.

        Hopefully that helps!
        Last edited by Marcos Almeida; 15 Feb 2017, 14:07.
        Best regards,

        Marcos



        • #5
          Thanks a lot for your full answers. So, to conclude: the negative binomial with constant dispersion is the traditional one and the one to use?

          I also tested the AIC and BIC; they are lower for the NB2 than for the NB1. Does this mean the NB2 is better?
          I have another question: my dependent variable includes a lot of zeros, so should I use the zero-inflated NB model?
          Its results also differ from the NB2's.
          Last edited by Ess Ami; 15 Feb 2017, 23:13.



          • #6
            Hello Emi,

            The so-called "traditional" model is the NB2, the very first one whose output I shared in #4.
            Best regards,

            Marcos



            • #7
              Originally posted by Marcos Almeida:
              Hello Emi,

              The so called "traditional" is the NB2 model, the very first one whose output I share in #4.
              Do you mean this one?

              Code:
              Negative binomial regression                    Number of obs     =         21
                                                              LR chi2(2)        =       0.40
              Dispersion     = mean                           Prob > chi2       =     0.8171
              Log likelihood =  -131.3799                     Pseudo R2         =     0.0015

              where Dispersion = mean?
              But in my dataset the AIC and BIC were lower for the NB where the dispersion is constant, so is there something wrong with my steps?
              Can I use zero-inflated NB or multiple regression instead of NB, given that my dependent variable is a count?
              Last edited by Ess Ami; 16 Feb 2017, 05:10.



              • #8
                To be clear, I'll divide your query into three topics.

                As you see in #4, under NB1, the option - dispersion - is set to constant.

                The AIC and BIC of the NB2 do not necessarily need to be lower than those of the NB1.

                With regard to zero-inflated count models, before rushing into them, I kindly suggest starting with a close look at the Stata manual and the literature.
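
                Should a zero-inflated NB model indeed turn out to be warranted, the syntax parallels - nbreg -. A minimal sketch, using hypothetical variables y, x1, and x2 (see - help zinb - for the details):

                Code:
                  . zinb y x1 x2, inflate(x1 x2)

                The - inflate() - option specifies the variables modelling the excess zeros; they need not be the same as the main covariates.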

                Best regards,

                Marcos



                • #9
                  I get you now, thanks a lot. But when I used the NB1, its results were also consistent with the literature, and it had the lower AIC and BIC. You just said they do not necessarily have to be lower, so by what criteria do I judge whether the NB1 or the NB2 is best for my data analysis, to make sure my results are right?
                  Last edited by Ess Ami; 16 Feb 2017, 22:42.



                  • #10
                    Hello Ess,

                    What I meant by saying "the AIC and BIC of the NB2 do not necessarily need to be lower than those of the NB1" was: you may compare the AIC and BIC of both models; one model doesn't always have to be "better" than the other.

                    That said, in general, the model with the lower AIC and BIC is considered "better" than the other. I also pointed to another criterion, the - linktest -.
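
                    For the record, Stata can put the AIC and BIC of both models side by side. A minimal sketch, with hypothetical variables y and x:

                    Code:
                      . quietly nbreg y x, dispersion(mean)      // NB2
                      . estimates store nb2
                      . quietly nbreg y x, dispersion(constant)  // NB1
                      . estimates store nb1
                      . estimates stats nb2 nb1

                    The - estimates stats - command displays a single table with the log likelihood, AIC, and BIC of each stored model.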

                    Previous literature and consistency of results are logically very important criteria as well.

                    As I remarked, the NB model may not perform well in small samples. Overfitting can also be a matter of concern in any model.

                    That said, as you know, no model will perform well if misspecified: for example, if the data ideally call for a zero-inflated NB model.

                    Hopefully that helps.
                    Best regards,

                    Marcos



                    • #11
                      Really, thanks a lot, Almeida, for your helpful comments.
                      Best wishes



                      • #12
                        You seem to pretty clearly be interested in effects on the expectation, E(y|x). In that case, I would use Poisson regression with robust standard errors, as it is a fully robust estimator for estimating the conditional mean.

                        Code:
                        glm y x1 x2 ... xk, fam(poisson) robust
                        Having said that, because the NB1 and NB2 models have the same number of parameters, you can simply compare the log likelihoods. In your case, the NB1 fits the data better. But that doesn't mean it produces robust estimates of the mean effects. Poisson regression does.
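
                        Since both models store their log likelihood in e(ll), the comparison can be read off directly. A sketch with hypothetical variables y and x:

                        Code:
                          . quietly nbreg y x, dispersion(mean)
                          . display "NB2 log likelihood = " e(ll)
                          . quietly nbreg y x, dispersion(constant)
                          . display "NB1 log likelihood = " e(ll)

                        The model with the larger (less negative) log likelihood fits better, given that both use the same number of parameters.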

