  • Poisson Regression or Negative Binomial models?

    Dear Stata List -

    I would like to compare mortality rates (with the number of live births as the population) by region, using Poisson regression. I have data on the number of deaths for 12 regions and the population of each region.

    I am using the following command

    xi: poisson deaths ib10.region, exposure(livebirths) irr

    region 10 is the region I am using as the reference group (it has the lowest mortality)

    I am unsure whether I need to use a negative binomial model instead, though. I know that in Poisson models the mean should be equal to the variance, but I am unsure how to check this. Should this be the mean and variance of the mortality rate in the whole population?

    Further, when I ran the negative binomial model, I got exactly the same result using

    xi: nbreg deaths ib10.region, dispersion(mean) exposure(livebirths) irr

    Any suggestions as to why this is the case?

    Best Wishes

    Joe

  • #2
    Joe:
    if you detect overdispersion after -poisson-, you should switch to -nbreg-. You probably mean that the point estimates of the coefficients are the same in both models. Overdispersion affects the standard errors, whereas the coefficients do not differ.
    As an aside, please note that the -xi:- prefix is redundant for Stata built-in commands.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      If you search this forum, you'll find some disagreement on when to use poisson or xtpoisson and when to use nbreg or xtnbreg. I think some other folks have suggested that poisson may still be preferable even with overdispersion. I don't claim sufficient understanding to offer an opinion. By the way, it sounds to me like you have panel data, so an xt estimator might be appropriate.



      • #4
        Joe: Three comments:

        1. The data structure you describe is analogous to the famous von Bortkiewicz "deaths by mule kick" data that are often used to motivate why a negative binomial model might be preferred to Poisson (deaths across army divisions over time). Of course this all depends on the particulars of the data, so there can be no one-size-fits-all recommendation here. http://www.datavis.ca/courses/grcat/grc1.html

        2. If your only concern is with the parameter estimates or marginal effects associated with the conditional mean, the Poisson and NB should deliver approximately the same results. On the other hand if your interests were in marginal effects for each cell's conditional probability (Pr(y=0|x), Pr(y=1|x), etc.) then Poisson and NB differences may be more prominent.

        3. The mean=variance result in Poisson regression models is with respect to the conditional mean and conditional variance (E[y|x]=var[y|x]), not with respect to the marginal mean and variance (E[y]=var[y]). This pertains to your question, "Should this be the mean and variance of the mortality rate in the whole population?"
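
Point 3 can be made concrete with a small calculation. The sketch below is in Python rather than Stata, and the region means are made-up numbers, not Joe's data. It assumes each region's count is exactly Poisson given its own mean and that regions are weighted equally, then applies the law of total variance to get the pooled (marginal) moments.

```python
from statistics import fmean, pvariance

# Hypothetical conditional means E[y|region] for four regions (illustrative only).
region_means = [20.0, 35.0, 50.0, 95.0]


def marginal_moments(means):
    """Marginal mean and variance of y when y | region ~ Poisson(mean)
    and every region is equally likely."""
    marginal_mean = fmean(means)
    # Law of total variance: Var[y] = E[Var[y|x]] + Var[E[y|x]].
    # For a Poisson count, Var[y|x] = E[y|x], so the first term is the mean of the means.
    marginal_var = fmean(means) + pvariance(means)
    return marginal_mean, marginal_var


mean_y, var_y = marginal_moments(region_means)
print(mean_y, var_y)
```

With these numbers the marginal variance (837.5) is far above the marginal mean (50), even though mean = variance holds within every region. Unequal marginal moments therefore say little about whether the Poisson assumption holds conditionally.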



        • #5
          Dear all

          Thank you all for these very useful comments! To clarify: the data are cross-sectional, i.e. mortality outcomes in 2016 in different regions. With regard to determining whether the mean of the infant mortality rate (imr) is equal to the variance, I am using

          sum imr, detail

          Mean 3.857144
          Variance .8864732

          So the data are not overdispersed; rather, the variance is much smaller than the mean. Would this imply that I need to use the negative binomial model? It sounds as if there is some disagreement about this.
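
One caution when reading these two numbers: imr is a rate, and whether the variance of a rate is above or below its mean depends on the units the rate is expressed in. A minimal Python sketch (not Stata) with made-up rates, assuming nothing about the actual imr values:

```python
from statistics import fmean, variance

# Hypothetical regional rates per 1,000 live births (illustrative only).
rates_per_1000 = [2.5, 3.0, 3.5, 4.0, 4.5, 5.0]


def mean_and_variance(values):
    """Sample mean and sample variance (n - 1 denominator)."""
    return fmean(values), variance(values)


def rescale(values, factor):
    """Express the same rates in different units."""
    return [v * factor for v in values]


m1, v1 = mean_and_variance(rates_per_1000)
m2, v2 = mean_and_variance(rescale(rates_per_1000, 100))  # per 100,000

# Rescaling multiplies the mean by 100 but the variance by 100**2.
print(v1 < m1, v2 > m2)
```

The same rates have variance below the mean per 1,000 and variance above the mean per 100,000. The Poisson mean=variance property concerns the death counts given the covariates, so it is not settled by this comparison.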

          The outputs from

          xi: poisson deaths ib10.region, exposure(livebirths) irr
          and
          xi: nbreg deaths ib10.region, dispersion(mean) exposure(livebirths) irr

          are exactly the same (estimates and standard errors), so I think I must be doing something wrong with the commands. I have attached the dataset if you would like to take a look:

          https://www.dropbox.com/s/k1oqfmf4fom22p4/imr.dta?dl=0
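
One possible explanation, offered as an assumption rather than a diagnosis of the attached file: if the data hold one row per region and the only covariate is the region indicator, the model has as many parameters as observations (it is saturated). The Poisson fit then reproduces every observed count, leaving no residual variation for a dispersion parameter to pick up, so a negative binomial fit would be expected to collapse toward the Poisson fit. The Python sketch below (not Stata) uses made-up counts and the closed-form saturated solution to show the idea.

```python
# Hypothetical data, one row per region: (deaths, live births). Not the attached file.
regions = {
    "region 1": (48, 10_000),
    "region 2": (35, 8_000),
    "region 10": (30, 12_000),  # reference region
}


def saturated_poisson(data, reference):
    """Closed-form Poisson MLE when each region has its own parameter:
    every fitted rate equals the observed rate deaths / livebirths."""
    ref_deaths, ref_births = data[reference]
    ref_rate = ref_deaths / ref_births
    results = {}
    for name, (deaths, births) in data.items():
        rate = deaths / births
        fitted = rate * births  # fitted count equals the observed count
        pearson = (deaths - fitted) ** 2 / fitted
        results[name] = {"irr": rate / ref_rate, "fitted": fitted, "pearson": pearson}
    return results


fit = saturated_poisson(regions, "region 10")
pearson_chi2 = sum(r["pearson"] for r in fit.values())
print({name: round(r["irr"], 3) for name, r in fit.items()}, pearson_chi2)
```

The Pearson statistic is zero up to floating-point rounding, which is what a saturated fit looks like. If the real file has several rows per region, this reasoning does not apply.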

          Thank you all again

          Best Wishes
          Joe



          • #6
            Joe:
            if your data are underdispersed, you can follow the advice in https://www.stata.com/bookstore/modeling-count-data/ (Chapter 8).
            As an aside, please note that the -xi:- prefix is redundant for Stata built-in commands.
            Kind regards,
            Carlo
            (Stata 19.0)



            • #7
              Hi Carlo

              Thanks for this - I'll read through the book suggestion.

              Best Wishes

              Joe
