
  • Analysis of GLM Negative Binomial Coefficients

    I have opted to use GLM regression with a negative binomial distribution family, using the following command:

    Code:
    glm (DV) (IVs), family(nbinomial) link(log)


    I cannot find a tutorial on how to interpret the coefficients when the family is nbinomial. Do I interpret the model the same way as with nbreg?
    For my first IV, the coefficient is: 0.877***

    One tutorial has told me that this means that for every one-unit increase in the IV, there is an 88% increase in the DV.
    I have also read that for every one-unit increase in the IV, the expected log count of the DV increases by 0.88.
    Is either of these correct? Is there a better way to analyse this?

    When running nbreg, I know that you have the option of , irr.
    Say the coefficient is 2.19487; does this mean that for every percent change in the IV, there is a 2.2% increase in the DV?
    Is the eform option for glm the same as irr? I.e.,

    Code:
    glm (DV) (IVs), family(nbinomial) link(log) eform

    I would much prefer to use GLM nbinomial to nbreg.

    Thank you.

  • #2
    I would much prefer to use GLM nbinomial to nbreg.
    Why? It is the exact same model, and the PDF documentation explicitly advises using nbreg if you do not need a link other than the log. Concerning the interpretation, you might find this helpful.
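
    As a quick arithmetic check on the numbers in #1 (a minimal sketch; only the 0.877 coefficient comes from the original post): under the log link, the coefficient is the change in the log of the expected count, so exp(b) is the factor change in the expected count for a one-unit increase in the IV.

    Code:
    display exp(0.877)     // about 2.40: the expected count is multiplied by 2.40
    display exp(0.877) - 1 // about 1.40, i.e. roughly a 140% increase, not 88%

    So the second reading is essentially the correct one (strictly, it is the log of the expected count that increases by 0.877 per unit); the 88% figure would only be approximately right for coefficients close to zero, where exp(b) - 1 is close to b.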

    Best
    Daniel



    • #3
      I understand that the models are very similar, but the data are more supportive of my other model in terms of significance when using GLM. While the coefficients are similar, the p-values vary quite substantially. Do you interpret nbreg and GLM with nbinomial the same way?



      • #4
        Since the models are the same, you interpret them in the same way. It seems odd that the same models would give substantially different answers, though. I would not trust these standard errors unless I fully understood why the differences arise and what exactly is going on. Otherwise, picking the results that you seek does not seem justified.
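
        One way to see where the two fits diverge (a minimal sketch, with hypothetical variable names y, x1, and x2) is to store both sets of estimates and tabulate them side by side:

        Code:
        nbreg y x1 x2
        estimates store nb
        glm y x1 x2, family(nbinomial) link(log)
        estimates store g
        estimates table nb g, b se

        If the coefficients agree but the standard errors do not, that already narrows down where the discrepancy is coming from.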

        Best
        Daniel



        • #5
          A different maximizer is used for -nbreg- and -glm-, so convergence is assessed differently.

          Code:
          use http://www.ats.ucla.edu/stat/stata/notes/lahigh, clear
          
          generate female = (gender == 1)
          
          nbreg daysabs mathnce langnce female, irr trace
          glm daysabs mathnce langnce female, eform fam(nbinom) trace

          Taking the model with the 'best' p-values is highly discouraged. The default maximizer for -nbreg- has been specifically chosen to deal with issues inherent in that model, whereas -glm- is, well, general.
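
          If the maximizer is the suspect, one diagnostic (a sketch; the tolerance values here are arbitrary) is to tighten the convergence criteria on the -glm- fit and see whether its estimates move toward the -nbreg- ones:

          Code:
          glm daysabs mathnce langnce female, eform fam(nbinom) ///
              tolerance(1e-10) ltolerance(1e-12)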
          Andrew Lover



          • #6
            nbreg and glm, fam(nbinomial) are not the same. The former estimates a shape parameter, so it is using a two-parameter family where x*b is the second "parameter." The glm option actually implements the Geometric distribution; it sets the variance to E(y|x) + [E(y|x)]^2.
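
            In Stata the ancillary parameter of glm can be set explicitly, which makes the distinction concrete. A sketch (hypothetical variable names, and it assumes nbreg leaves its dispersion estimate in e(alpha)):

            Code:
            * family(nbinomial) alone fixes k = 1, the Geometric case; passing
            * nbreg's estimated alpha makes the two variance functions agree
            nbreg y x1 x2
            local a = e(alpha)
            glm y x1 x2, family(nbinomial `a') link(log)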

            There are pros and cons of each. Because the Geometric distribution is in the linear exponential family, the glm estimator is fully robust to distributional misspecification (except for the mean, of course). So the Geometric is like the Poisson in that regard.

            I actually prefer glm with either the poisson or nbinomial option because of the robustness. However, one needs to report robust standard errors: assuming that either the Poisson or Geometric variance is correct is much too restrictive. In fact, I would tend to report the robust standard errors for nbreg, too. The downside to doing this is that you are admitting your estimates might be inconsistent.

            nbreg, of course, is more efficient if the full NB distribution holds. But it is not robust. I would actually try glm with the poisson and nbinomial options and robust standard errors; hopefully they will be close. If they are not, the mean is likely misspecified. Oh, and it's possible, especially for the Geometric family, that the robust standard errors are actually smaller than the nonrobust ones.
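
            A minimal sketch of that comparison (hypothetical variable names again; both fits report robust standard errors, as suggested):

            Code:
            * both quasi-MLEs are consistent when the conditional mean is right,
            * so the two sets of coefficient estimates should then be close
            glm y x1 x2, family(poisson)   link(log) vce(robust)
            glm y x1 x2, family(nbinomial) link(log) vce(robust)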



            • #7
              Thanks for highlighting that difference, Jeff. Perhaps this should be edited? http://www.ats.ucla.edu/stat/stata/dae/nbreg.htm
              - You can also run a negative binomial model using the glm command with the log link and the binomial family.
              - You will need to use the glm command to obtain the residuals to check other assumptions of the negative binomial model (see Cameron and Trivedi (1998) and Dupont (2002) for more information).
              I've generally followed Dupont's suggestion of running via glm to get residuals, and assumed the observed differences were due to the maximizer. Maybe now with -margins- it's less relevant? (See the sketch after the code below.)

              Code:
              use http://www.ats.ucla.edu/stat/stata/notes/lahigh, clear
              
              generate female = (gender == 1)
              
              nbreg daysabs mathnce langnce female, irr vce(robust)
              
              glm daysabs mathnce langnce female, eform fam(nbinom) vce(robust)
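
              On the -margins- point, a sketch of what that replacement workflow might look like for the model above:

              Code:
              glm daysabs mathnce langnce female, fam(nbinom) vce(robust)
              margins, dydx(*)   // average marginal effects on the expected count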
              Andrew Lover



              • #8
                Hi Andrew: Yes, that description is misleading and should be changed. On the first point, the estimation methods are different, as I described above. The example you gave is an interesting one because the estimates are so similar. I think it's because the estimated alpha is relatively close to unity, although I would've expected to see some larger differences.

                Here's another one to try where alpha is estimated to be close to zero, and the coefficient estimates are notably different. (I tend to focus on the coefficients themselves.)

                Code:
                use http://fmwww.bc.edu/ec-p/data/wooldridge/fertil2, clear
                glm children age electric radio tv bicycle educ, family(nbinomial) link(log) robust
                nbreg children age electric radio tv bicycle educ, vce(robust)

                Getting the residuals from glm can be very misleading, as glm sets alpha equal to unity, so those residuals will have heteroskedasticity (at a minimum) if alpha is far from unity. I think they should be computed "by hand" after using nbreg if you are interested in testing the assumptions of the underlying model for nbreg.
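
                A sketch of that "by hand" computation for the fertil2 example above (it assumes nbreg stores its dispersion estimate in e(alpha); the NB2 variance is mu + alpha*mu^2):

                Code:
                nbreg children age electric radio tv bicycle educ
                predict mu, n                 // fitted conditional mean
                generate rp = (children - mu) / sqrt(mu + e(alpha)*mu^2)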

