Poisson Model versus Negative Binomial Model with Equidispersion

Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#16

01 Sep 2017, 04:37

Thank you for sharing the information requested in #13.

This is clearly a case of underdispersion, since the Pearson dispersion statistic is quite low.

If I were to speculate about the reason for the underdispersion, the tabulation in #11 shows that around 50% of the counts are concentrated in just two values (3 and 4).

I should add that I have never dealt with an underdispersed model myself, so my advice from here on is based only on the literature.

That said, scaling the SEs may be helpful.
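As a minimal sketch of what scaling the SEs could look like, the block below fits a Poisson GLM and rescales the standard errors by the Pearson dispersion through the scale(x2) option of -glm-. The data are simulated and purely hypothetical: the names y, x1 and x2, the coefficients, and the upper bound of 8 (borrowed from the glm call in #18) are assumptions, not the poster's data.

```stata
* Simulated, hypothetical data: a count bounded at 8 is underdispersed relative to Poisson
clear
set seed 12345
set obs 500
generate double x1 = rnormal()
generate double x2 = rnormal()
generate y = rbinomial(8, invlogit(-0.2 + 0.3*x1 - 0.2*x2))

* Poisson GLM; scale(x2) requests scaling by the Pearson chi-squared dispersion
* (here "x2" is the option keyword for chi-squared, not the covariate x2)
glm y x1 x2, family(poisson) link(log) scale(x2)

* Pearson dispersion statistic; values below 1 point to underdispersion
display "Pearson dispersion = " e(dispers_p)
```

With underdispersed data the scaled standard errors should come out smaller than the unscaled Poisson ones, since the scaling factor is the square root of a dispersion below 1.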

By the way, I gather that the cluster-robust vce estimation is the reason you (surprisingly) "succeeded" in fitting a negative binomial model to the underdispersed data.

Finally, I recommend using a generalized Poisson model instead.

For this, you may wish to install the Stata Journal command -gpoisson-, written by Zhao Yang and James W. Hardin.

Hopefully that helps.

Best regards,

Marcos
Joao Santos Silva

Join Date: Apr 2014

Posts: 3010
#17

01 Sep 2017, 12:30

Dear Elio,

Here are some additional comments.

1) If you multiply y by a positive constant k, the mean of y is multiplied by k and the variance by k². So, if y does not have a natural scale, like counts do, we can always choose a scale that gives the desired amount of over-, under-, or equi-dispersion.

2) I would say that the best way to model your data is binomial pseudo ML. For that you can either rescale your data and use fractional regression, or use the glm command with a logit link and the binomial family.
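Both points can be sketched in Stata. For point 1, Var(ky)/E(ky) = k·Var(y)/E(y), so dividing y by 8 divides the variance-to-mean ratio by 8. For point 2, the two routes are shown side by side. Everything here is an illustration on simulated data: the variable names, the coefficients and the upper bound of 8 (taken from the glm call in #18) are assumptions, and -fracreg- requires Stata 14 or later.

```stata
* Simulated, hypothetical data with a known upper bound of 8
clear
set seed 12345
set obs 500
generate double x1 = rnormal()
generate double x2 = rnormal()
generate y = rbinomial(8, invlogit(-0.2 + 0.3*x1 - 0.2*x2))

* Point 1: rescaling y by k = 1/8 rescales the variance-to-mean ratio by 1/8
generate double y_frac = y/8
quietly summarize y
scalar ratio_y = r(Var)/r(mean)
quietly summarize y_frac
scalar ratio_frac = r(Var)/r(mean)
display "Var/mean of y   = " ratio_y
display "Var/mean of y/8 = " ratio_frac

* Point 2, route (a): fractional logit on the rescaled outcome
fracreg logit y_frac x1 x2

* Point 2, route (b): binomial GLM with a logit link on the original outcome
glm y x1 x2, family(binomial 8) link(logit) vce(robust)
```

The two routes maximize objective functions that differ only by a constant factor, so the coefficient estimates should coincide; route (b) keeps the outcome on its original scale.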

Best wishes,

Joao
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2156
#18

01 Sep 2017, 13:05

Elio: I concur with Joao that you should use binomial regression because you have a natural, known upper bound. The binomial distribution is a member of the linear exponential family, and so it has the same consistency properties as Poisson regression. You should not use negative binomial regression for the reasons previously stated. It has no robustness properties when the variance-mean relationship is violated.

I discuss binomial regression in Section 18.3.2 of my MIT Press book, "Econometric Analysis of Cross Section and Panel Data," 2e.

In Stata, using logit for the probability function:

Code:

glm y x1 x2 ... xk, fam(bin 8) link(logit) vce(robust)
margins, dydx(*)

The -margins- command will give the average partial effects on the mean of y.
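For reference, the mean function and the average partial effect implied by the command above can be written out as follows. This is a sketch under stated assumptions: $x_j$ is taken to be continuous, and the symbols $\Lambda$, $\beta$ and $N$ (the sample size) are notation introduced here, not taken from the post.

```latex
% Conditional mean with a known upper bound of 8 and a logit link
E(y \mid x) = 8\,\Lambda(x\beta), \qquad \Lambda(z) = \frac{e^{z}}{1 + e^{z}}

% Partial effect of a continuous regressor x_j on the mean of y
\frac{\partial E(y \mid x)}{\partial x_j} = 8\,\beta_j\,\Lambda(x\beta)\,\bigl[1 - \Lambda(x\beta)\bigr]

% Average partial effect: the partial effect averaged over the N observations
\widehat{\mathrm{APE}}_j = \frac{1}{N} \sum_{i=1}^{N} 8\,\hat\beta_j\,\Lambda(x_i\hat\beta)\,\bigl[1 - \Lambda(x_i\hat\beta)\bigr]
```

Because the default prediction after -glm- is the expected value of y, the effects are on the 0 to 8 scale of y rather than on the probability scale.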

JW
Elio Bolliger

Join Date: Aug 2017

Posts: 9
#19

04 Sep 2017, 01:08

Dear all,

I feel honored by all the advice and answers you have given me, and I really appreciate the time you have invested in explaining the issues related to the question and the models used.

Marcos Almeida: Thanks for the answer. I will have a look at the command written by Zhao Yang and James W. Hardin.

Joao Santos Silva: I am thankful for your explanations regarding the scaling and for your advice, including the model you suggested in your answer.

Jeff Wooldridge: Thank you very much for your contribution and for confirming Joao's advice. I will gladly study the section on binomial regression in your book, "Econometric Analysis of Cross Section and Panel Data."

Best wishes,
Elio