  • Negative binomial regression: Variance of counts versus "model dispersion"

    Dear Statalisters,

    I have been trying to properly understand the intuition and technicalities behind the so-called “NB1” and “NB2” negative binomial regression models (Cameron and Trivedi, 1986 and 2013), which Stata implements with the commands "nbreg, dispersion(constant)" and "nbreg, dispersion(mean)", respectively.

    One hurdle I face is understanding the meaning of the terms “dispersion for the jth observation” and “model dispersion”, which are used interchangeably in the Stata manual for nbreg (“Introduction to negative binomial regression”) and in this Stata FAQ page https://www.stata.com/support/faqs/s...ance-function/

    More specifically, it seems to be the case that:

    FOR MODEL NB2 (“mean dispersion”), the variance of event counts is Var(y_j) = mu_j * (1 + alpha * mu_j) and the so-called "model dispersion" or "dispersion for the jth observation" is (1 + alpha* mu_j)

    FOR MODEL NB1 (“constant dispersion”), the variance of event counts is Var(y_j) = mu_j * (1 + delta) and the so-called "model dispersion" or "dispersion for the jth observation" is (1 + delta)

    So while I understand how the variance of counts is derived, I do not understand what those "dispersion" terms exactly refer to. What is the "dispersion for the jth observation" or "model dispersion", and how is it derived? Cameron and Trivedi do not seem to talk about it at all, in their book.

    It must be an important concept, since those terms give the two model options their names in the Stata literature ("mean" and "constant" dispersion models). On the other hand, the variance of observed counts depends on the mean mu_j in BOTH models, so neither of them seems to have "constant dispersion" in that sense.

    Thank you very much in advance to those who will help!

    Zelda

  • #2
    Dear Zelda,

    You can interpret the "model dispersion" or "dispersion for the jth observation" as the over-dispersion factor, when comparing with a Poisson distribution where the variance is equal to the mean.
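    To illustrate this reading, here is a minimal sketch in Python. It is not Stata output; the function names and the values alpha = delta = 0.5 are made up for illustration. It divides each model's variance by the mean mu_j, which is the variance a Poisson model would give, and prints the resulting factor for a few values of mu_j.

```python
def nb2_variance(mu, alpha):
    """NB2 ("mean dispersion"): Var(y_j) = mu_j * (1 + alpha * mu_j)."""
    return mu * (1 + alpha * mu)


def nb1_variance(mu, delta):
    """NB1 ("constant dispersion"): Var(y_j) = mu_j * (1 + delta)."""
    return mu * (1 + delta)


def dispersion_factor(variance, mu):
    """Ratio of the model variance to the Poisson variance (which equals mu)."""
    return variance / mu


# Illustrative parameter values only (assumptions, not estimates from any data).
alpha, delta = 0.5, 0.5

for mu in (1.0, 2.0, 4.0):
    nb2 = dispersion_factor(nb2_variance(mu, alpha), mu)
    nb1 = dispersion_factor(nb1_variance(mu, delta), mu)
    print(f"mu={mu}: NB2 factor={nb2}, NB1 factor={nb1}")
```

    With these values the NB2 factor grows with mu_j (1.5, 2.0, 3.0) while the NB1 factor stays at 1.5, even though the variance itself depends on mu_j in both models.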

    Best wishes,

    Joao
    Last edited by Joao Santos Silva; 20 Jan 2018, 04:33.

    • #3
      Dear Joao,

      thank you very much for your reply!

      Your concise answer is very useful because it has simplified my thinking a lot. In practice, the NB2 model lets the extra-Poisson dispersion factor (1 + alpha * mu_j) vary with the mean mu_j of each observation, while the NB1 model restricts the extra-Poisson dispersion factor (1 + delta) to be the same across all observations.

      On top of the extra-Poisson dispersion, the count dispersion of both models also includes the original Poisson-dispersion mu_j (which is the original Poisson parameter, mean = variance), and this makes a lot of sense.

      What I am still missing is the "formal" explanation as to why you need to multiply together the two types of dispersion to obtain the variance of counts... but probably it is not that important.

      Thank you very much again!

      Zelda

      • #4
        Dear Zelda,

        For that, please see the nbreg entry in the Stata manual, especially page 1626 (or a book on count data). Notice that there are different ways of obtaining these models as generalizations of the Poisson, and in different cases the interpretation of the over-dispersion will be different. If you really want to know all the details, please check this book.
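        As a rough sketch of one such route, the Poisson-gamma mixture, the product form follows from the law of total variance. The notation below is an assumption made for illustration (nu_j is a unit-mean gamma heterogeneity term, mu_j^* a gamma-distributed Poisson mean), so please check it against the manual entry rather than taking it as the definitive derivation.

```latex
% NB2: y_j | \nu_j ~ Poisson(\mu_j \nu_j), with E(\nu_j) = 1 and Var(\nu_j) = \alpha
\begin{align*}
\operatorname{Var}(y_j)
  &= E\!\left[\operatorname{Var}(y_j \mid \nu_j)\right]
   + \operatorname{Var}\!\left(E[y_j \mid \nu_j]\right) \\
  &= E[\mu_j \nu_j] + \operatorname{Var}(\mu_j \nu_j) \\
  &= \mu_j + \alpha \mu_j^2
   = \mu_j \,(1 + \alpha \mu_j).
\end{align*}

% NB1: y_j | \mu_j^* ~ Poisson(\mu_j^*), with E(\mu_j^*) = \mu_j and Var(\mu_j^*) = \delta \mu_j
\begin{align*}
\operatorname{Var}(y_j)
  &= E[\mu_j^*] + \operatorname{Var}(\mu_j^*) \\
  &= \mu_j + \delta \mu_j
   = \mu_j \,(1 + \delta).
\end{align*}
```

        In both cases the variance is a sum of the Poisson part mu_j and the part due to the mixing distribution; the product only appears after factoring mu_j out of that sum.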

        Best wishes,

        Joao

        • #5
          Great, I will check those references!

          Thank you very much again for your time!

          Best wishes,

          Zelda
