Poisson Regression or Negative Binomial models?

Joe Ward

Join Date: Jun 2015

Posts: 45
#1

16 Aug 2018, 07:12

Dear Stata List -

I would like to compare mortality rates by region using Poisson regression, with the number of live births as the population at risk. I have the number of deaths and the population for each of 12 regions.

I am using the following command

xi: poisson deaths ib10.region, exposure(livebirths) irr

Region 10 is the reference group (it has the lowest mortality).

I am unsure whether I need to use a negative binomial model instead. I know that in Poisson models the mean should equal the variance, but I am unsure how to check this. Should this be the mean and variance of the mortality rate in the whole population?

Further, when I run the negative binomial model, I get exactly the same result using

xi: nbreg deaths ib10.region, dispersion(mean) exposure(livebirths) irr

Any suggestions as to why this is the case?

Best Wishes

Joe
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17726
#2

16 Aug 2018, 15:22

Joe:
if you detect overdispersion after -poisson-, you should switch to -nbreg-. You probably mean that the point estimates of the coefficients are the same under both models. However, overdispersion affects the standard errors, whereas the coefficients do not differ.
As an aside, please note that -xi:- prefix is redundant for Stata built-in commands.
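
A minimal sketch of how overdispersion could be checked here, assuming the variable names from the original post (deaths, region, livebirths). Both -estat gof- and the likelihood-ratio test of alpha = 0 printed by -nbreg- are standard Stata output; how much weight to give them with this dataset is a judgment call.

```stata
* Fit the Poisson model (factor-variable notation works without xi:)
poisson deaths ib10.region, exposure(livebirths) irr

* Deviance and Pearson goodness-of-fit tests; small p-values suggest
* that the Poisson variance assumption is too restrictive
estat gof

* nbreg prints a likelihood-ratio test of alpha = 0 below the coefficient table;
* failing to reject it means the negative binomial reduces to Poisson
nbreg deaths ib10.region, dispersion(mean) exposure(livebirths) irr
```

With only a handful of observations these tests have little power, so they are a rough guide rather than a decision rule.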

Kind regards,
Carlo
(Stata 19.0)
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#3

17 Aug 2018, 12:14

If you search this forum, you'll find some disagreement on when to use poisson or xtpoisson and when to use nbreg or xtnbreg. I think some other folks have suggested that poisson may still be preferable even with overdispersion. I don't claim sufficient understanding to offer an opinion. By the way, it sounds to me like you have panel data, so an xt estimator might be appropriate.
John Mullahy

Join Date: Dec 2016

Posts: 752
#4

18 Aug 2018, 06:20

Joe: Three comments:

1. The data structure you describe is analogous to the famous von Bortkiewicz "deaths by mule kick" data that are often used to motivate why a negative binomial model might be preferred to Poisson (deaths across army divisions over time). Of course this all depends on the particulars of the data, so there can be no one-size-fits-all recommendation here. http://www.datavis.ca/courses/grcat/grc1.html

2. If your only concern is with the parameter estimates or marginal effects associated with the conditional mean, the Poisson and NB should deliver approximately the same results. On the other hand if your interests were in marginal effects for each cell's conditional probability (Pr(y=0|x), Pr(y=1|x), etc.) then Poisson and NB differences may be more prominent.

3. The mean=variance result in Poisson regression models refers to the conditional mean and conditional variance (E[y|x]=var[y|x]), not to the marginal mean and variance (E[y]=var[y]). This bears on your question, "Should this be the mean and variance of the mortality rate in the whole population?"
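
To spell out point 3, here is a short derivation using the law of total variance. It assumes y is the death count and writes mu(x) for the conditional mean, a symbol introduced here for illustration only.

```latex
\begin{aligned}
\operatorname{Var}(y)
  &= \mathrm{E}\bigl[\operatorname{Var}(y \mid x)\bigr]
   + \operatorname{Var}\bigl(\mathrm{E}[y \mid x]\bigr) \\
  &= \mathrm{E}[\mu(x)] + \operatorname{Var}\bigl(\mu(x)\bigr)
   \qquad \text{since } \operatorname{Var}(y \mid x) = \mathrm{E}[y \mid x] = \mu(x) \\
  &\ge \mathrm{E}[\mu(x)] = \mathrm{E}[y].
\end{aligned}
```

So even when the Poisson assumption holds exactly within each region, the marginal variance of the counts exceeds their marginal mean whenever the regional means differ. The comparison is also not meaningful for a rate: rescaling a rate by a constant c (say, per 1,000 births) multiplies its mean by c and its variance by c squared, so which of the two is larger depends on the units chosen.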
Joe Ward

Join Date: Jun 2015

Posts: 45
#5

20 Aug 2018, 08:35

Dear all

Thank you all for these very useful comments! To clarify, the data are cross-sectional, i.e. mortality outcomes in 2016 in different regions. To check whether the mean of the infant mortality rate (imr) equals its variance, I am using

sum imr, detail

Mean 3.857144
Variance .8864732

So the data are not overdispersed; in fact, the variance is much smaller than the mean. Does this mean I need to use the negative binomial model? It sounds as if there is some disagreement about this.

The outputs from

xi: poisson deaths ib10.region, exposure(livebirths) irr
and
xi: nbreg deaths ib10.region, dispersion(mean) exposure(livebirths) irr

are exactly the same (estimates and standard errors), so I think I must be doing something wrong with the commands. I have attached the dataset if you would like to take a look:

https://www.dropbox.com/s/k1oqfmf4fom22p4/imr.dta?dl=0
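
One possible explanation for the identical output, offered as an assumption rather than something confirmed in the thread: if the dataset holds a single row per region, then 12 region parameters are fitted to 12 observations, the model is saturated, and there is no residual variation from which -nbreg- could estimate overdispersion. Its alpha would then be estimated at essentially zero and the results would coincide with those from -poisson-. A quick check, assuming the posted file has been saved in the working directory as imr.dta:

```stata
* Assumes the posted dataset was saved locally as imr.dta (hypothetical path)
use imr.dta, clear

* How many rows are there in total, and how many per region?
count
tabulate region

poisson deaths ib10.region, exposure(livebirths) irr

* Residual degrees of freedom: observations minus estimated parameters.
* A value of 0 means the model is saturated.
display "residual df = " e(N) - e(rank)
```

If the residual degrees of freedom are 0, the fitted counts reproduce the observed counts exactly, and a dispersion test on these data cannot distinguish the two models.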

Thank you all again

Best Wishes
Joe
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17726
#6

21 Aug 2018, 02:15

Joe:
if your data are underdispersed, you can follow the advice in https://www.stata.com/bookstore/modeling-count-data/ (Chapter 8).
As an aside, please note that -xi:- prefix is redundant for Stata built-in commands.

Kind regards,
Carlo
(Stata 19.0)
Joe Ward

Join Date: Jun 2015

Posts: 45
#7

03 Sep 2018, 03:55

Hi Carlo

Thanks for this - I'll read through the book suggestion.

Best Wishes

Joe