How to choose between Poisson Regression Model and Negative Binomial Model?

HAN YUE

Join Date: Jun 2018

Posts: 9
#1


24 Jun 2018, 06:42

The dependent variable is an integer count, so I think I should use either a Poisson regression model or a negative binomial model. First, I tried the Poisson regression model; the results are as follows:

Then I tried the negative binomial model; the results are as follows:

As we can see, the LR test of alpha=0 is significant, so I should use the negative binomial model. However, the pseudo R2 of the negative binomial model (0.0393) is smaller than that of the Poisson regression model (0.1254), which seems to say that the Poisson model fits better than the negative binomial model. The results look contradictory. Could somebody help me with this question? Which model should I use? Thank you very much!

Last edited by HAN YUE; 24 Jun 2018, 06:48.
HAN YUE

Join Date: Jun 2018

Posts: 9
#2

24 Jun 2018, 06:48

Help please!
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#3

24 Jun 2018, 07:21

Han:
you do not say whether you ran -estat gof- after -poisson- and, if so, what Stata gave you back.
If -estat gof- shows signs of overdispersion, the -poisson- standard errors are too small, whatever the pseudo R2 value, and you should switch to -nbreg- (as seems to be the case with your data).
On top of that, an improved specification of -nbreg- may also increase the pseudo R2 value.
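
To make the -estat gof- check concrete, here is a minimal sketch on simulated data. The variables y and x, the seed, and the overdispersion value 0.8 are assumptions chosen for illustration; none of this is Han's data. With counts drawn from a gamma-Poisson mixture, the goodness-of-fit tests after -poisson- should reject, and -nbreg- should report an alpha clearly above zero.

```stata
* Simulated overdispersed counts (hypothetical data, for illustration only)
clear
set seed 12345
set obs 1000
generate x = rnormal()
* Gamma multiplier with mean 1 and variance 0.8 makes y negative binomial
generate y = rpoisson(exp(1 + 0.5*x) * rgamma(1/0.8, 0.8))

* Poisson fit, then deviance and Pearson goodness-of-fit tests
* Small p-values here argue against the Poisson assumption
poisson y x
estat gof

* Negative binomial fit; the footer reports the LR test of alpha=0
nbreg y x
```

If the -estat gof- p-values were large instead, there would be little reason to leave -poisson-.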

Kind regards,
Carlo
(Stata 19.0)
Richard Williams

Join Date: Apr 2014

Posts: 5008
#4

24 Jun 2018, 07:23

I don't view them as contradictory. An incorrect model with a higher R^2 is not preferable to a correct model with a lower R^2. The SEs are also higher with nbreg, as they should be. I would use nbreg.
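
One way to see why the two pseudo R2 values are not comparable: Stata's pseudo R2 is 1 - ll/ll_0, where ll_0 is the log likelihood of the constant-only model of the same family. The sketch below redoes that arithmetic with the figures reported in #6. The constant-only Poisson log likelihood is not printed there, so it is backed out from LR chi2 = 2*(ll - ll_0); treat it as an inference from the output, not a reported number. The scalar names are made up for this example.

```stata
* Figures copied from the -poisson- and -nbreg- output in #6
scalar ll_p   = -12401.47
scalar chi2_p = 3555.51
scalar ll_nb  = -5684.6005
scalar ll0_nb = -5917.4483

* Implied constant-only Poisson log likelihood, from LR chi2 = 2*(ll - ll_0)
scalar ll0_p = ll_p - chi2_p/2

* McFadden pseudo R2, each relative to its own constant-only model
scalar r2_p  = 1 - ll_p/ll0_p
scalar r2_nb = 1 - ll_nb/ll0_nb

* LR statistic for alpha=0 compares the two full models directly
scalar lr_alpha = 2*(ll_nb - ll_p)

display "Poisson pseudo R2  = " %6.4f r2_p
display "nbreg pseudo R2    = " %6.4f r2_nb
display "LR test of alpha=0 = " %9.2f lr_alpha
```

Each pseudo R2 is measured against its own baseline (about -14179.2 for Poisson, -5917.4 for negative binomial). On the common log-likelihood scale, the negative binomial fit (-5684.6) is far above the Poisson fit (-12401.5), which is exactly what the LR test of alpha=0 picks up.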

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#5

24 Jun 2018, 07:43

Carlo and Richard gave excellent advice.

Please read the FAQ carefully.

There are recommendations about sharing data/command/output correctly. You will find advice to avoid snapshots. You will also find tips about posting an informative message. Finally, sending a second message, this Sunday, within 8 minutes of starting a thread can be taken as a bump. Please read the FAQ about (not) bumping.

With regard to your question, if I understood correctly, it relates to core knowledge about count models, and the information is widely available in textbooks, web pages, manuals, etc. I truly believe that it can be found within a couple of minutes.

That being said, R-squared doesn’t imply good model fit.

You may wish to check the dispersion parameter with the -glm- command. You may also make comparisons between different negative binomial models. But, as said before, you'll find this information easily in the material mentioned above.
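
As a rough sketch of the -glm- check, again on simulated data (y, x, the seed, and the stored-estimate names pois and nb are assumptions for illustration): after -glm- with a Poisson family, the "(1/df) Pearson" figure in the header is the Pearson dispersion, and a value well above 1 points to overdispersion. The second half stores a Poisson and a negative binomial fit and lists their AIC and BIC side by side.

```stata
* Simulated overdispersed counts (hypothetical data, for illustration only)
clear
set seed 12345
set obs 1000
generate x = rnormal()
generate y = rpoisson(exp(1 + 0.5*x) * rgamma(1/0.8, 0.8))

* Poisson GLM; the header line "(1/df) Pearson" is the dispersion estimate
glm y x, family(poisson) link(log)
display "Pearson dispersion = " e(dispers_p)

* Fit both count models and compare information criteria (lower is better)
poisson y x
estimates store pois
nbreg y x
estimates store nb
estimates stats pois nb
```

The same -estimates store- and -estimates stats- pattern can be used to compare alternative -nbreg- specifications fitted to the same sample.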

Please don’t skip the important step of learning the basic theory behind the model you are using; otherwise you will run into obstacles and misinterpretations, not to mention misspecified models.

Best regards,

Marcos
HAN YUE

Join Date: Jun 2018

Posts: 9
#6

24 Jun 2018, 08:00

I am sorry, I am new to regression models, so I don't know exactly how to choose a valid model. Here are the commands and the output; the data are attached.

. import excel "d:\data1.xlsx", sheet("Sheet1") firstrow

. poisson y x1 x2 x3 x4 x5 x6 x7

Iteration 0:   log likelihood = -13480.539
Iteration 1:   log likelihood = -12439.367
Iteration 2:   log likelihood = -12401.522
Iteration 3:   log likelihood =  -12401.47
Iteration 4:   log likelihood =  -12401.47

Poisson regression                              Number of obs     =      1,760
                                                LR chi2(7)        =    3555.51
                                                Prob > chi2       =     0.0000
Log likelihood =  -12401.47                     Pseudo R2         =     0.1254

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .0310712   .0033885     9.17   0.000     .0244299    .0377125
          x2 |   -.000045   2.06e-06   -21.79   0.000     -.000049   -.0000409
          x3 |   .6493568   .0343303    18.91   0.000     .5820706     .716643
          x4 |   .0000211   4.11e-07    51.36   0.000     .0000203    .0000219
          x5 |   1.621192   .1638862     9.89   0.000     1.299981    1.942403
          x6 |  -.5360483   .1012103    -5.30   0.000    -.7344168   -.3376797
          x7 |  -2.034442   .1365581   -14.90   0.000    -2.302091   -1.766793
       _cons |   2.599774   .0553757    46.95   0.000     2.491239    2.708308
------------------------------------------------------------------------------

. nbreg y x1 x2 x3 x4 x5 x6 x7

Fitting Poisson model:

Iteration 0: log likelihood = -13480.539
Iteration 1: log likelihood = -12439.367
Iteration 2: log likelihood = -12401.522
Iteration 3: log likelihood = -12401.47
Iteration 4: log likelihood = -12401.47

Fitting constant-only model:

Iteration 0: log likelihood = -5921.2343
Iteration 1: log likelihood = -5917.4492
Iteration 2: log likelihood = -5917.4483
Iteration 3: log likelihood = -5917.4483

Fitting full model:

Iteration 0: log likelihood = -5784.0523
Iteration 1: log likelihood = -5708.8315
Iteration 2: log likelihood = -5684.6472
Iteration 3: log likelihood = -5684.6005
Iteration 4: log likelihood = -5684.6005

Negative binomial regression                    Number of obs     =      1,760
                                                LR chi2(7)        =     465.70
Dispersion     = mean                           Prob > chi2       =     0.0000
Log likelihood = -5684.6005                     Pseudo R2         =     0.0393

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .0261243   .0120297     2.17   0.030     .0025465    .0497022
          x2 |  -.0000276   5.42e-06    -5.10   0.000    -.0000382    -.000017
          x3 |   .6587957   .1114276     5.91   0.000     .4404015    .8771898
          x4 |   .0000763   4.83e-06    15.81   0.000     .0000669    .0000858
          x5 |   1.015735   .6312224     1.61   0.108    -.2214377    2.252909
          x6 |   .4501755   .3480728     1.29   0.196    -.2320346    1.132386
          x7 |   .8477843   .4359345     1.94   0.052    -.0066316      1.7022
       _cons |   1.495091   .1813951     8.24   0.000     1.139563    1.850619
-------------+----------------------------------------------------------------
    /lnalpha |  -.1735496   .0374455                     -.2469414   -.1001577
-------------+----------------------------------------------------------------
       alpha |   .8406755   .0314795                      .7811864    .9046947
------------------------------------------------------------------------------
LR test of alpha=0: chibar2(01) = 1.3e+04              Prob >= chibar2 = 0.000
Attached Files

data1.xlsx (99.7 KB, 1 view)
Richard Williams

Join Date: Apr 2014

Posts: 5008
#7

24 Jun 2018, 08:03

Here is a brief overview of count models if you need it. I am sure there are many others on the web.

https://www3.nd.edu/~rwilliam/stats3/CountModels.pdf

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#8

24 Jun 2018, 08:22

Han:
as per the FAQ, please do not post attachments; instead, share what you typed and what Stata gave you back via CODE delimiters. Thanks.
Please also note that, due to the risk of active contents, most listers (me too) do not download spreadsheets from potentially unsafe sources. Thanks.
That said, the -estat gof- outcome after -poisson- (that you did not report in your last post) should have pointed you to the right regression model.

Kind regards,
Carlo
(Stata 19.0)