  • xtgee - nbinomial family distribution - estimates diverging (missing predictions)

    Dear Statalisters,

    I am trying to analyze an unbalanced panel dataset with appr. 10,000 firm-year observations by running a GEE regression. I am using Stata 17.

    My dependent variable "event_count" is a count variable that appears to be overdispersed (its variance is much larger than its mean). Hence, I am specifying a negative binomial family and the negative binomial link function.
    The independent variable is binary (0 or 1). As the code below shows, my data also contain a set of control variables, some of which are binary or categorical.

    Code:
    xtset turnover_id fiscal_year
    
    xtgee event_count i.award_win i.inside_ceo age i.ind_div_num i.fiscal_year i.dual_ceo ln_pred_tenure i.ceo_dismissal i.male_ceo ln_ceo_so board_size_0101 pct_ind_directors_0101 pre_succ_roa_indadj ln_assets_tot_0101 i.retained_ceo i.successor_tenure event_count_lag2, corr(ar) family(nbinomial 0.2686892) link(nbinomial) vce(robust)
    The value of 0.2686892 represents the dispersion parameter "α" that was obtained by running nbreg.

    However, after running the command, I receive the error message "estimates diverging (missing predictions) r(430)". I suspect that this might happen due to the large number of categorical variables, but I am not sure about that.

    Interestingly, Stata converges to a solution when I specify a log link function or a Poisson distribution.

    Does anyone know (i) whether there is a way to reach convergence with the nbinomial link function and (ii) whether it is problematic to combine the nbinomial family with a log link function?
    Many thanks in advance!

    Best regards,
    Bono
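
    On question (ii): nbreg itself models ln(μ) = xb, so family(nbinomial) combined with link(log) is the usual NB2 specification rather than an unusual pairing. On question (i), one plausible but unconfirmed reason for the divergence is the shape of the nbinomial link. A sketch, assuming the link has the canonical negative binomial form with a constant c > 0 (c = 1/α in the canonical parameterization; the symbols η and c are introduced here for illustration only):

    ```latex
    \eta = x\beta = \ln\!\left( \frac{\mu}{\mu + c} \right)
    \quad\Longleftrightarrow\quad
    \mu = \frac{c\, e^{\eta}}{1 - e^{\eta}},
    \qquad \mu > 0 \ \text{only if}\ \eta < 0
    ```

    Under this form, any observation whose linear predictor becomes non-negative during iteration has no valid fitted mean, which would be consistent with the "missing predictions" message. The log link has no such restriction because exp(xb) is positive for every xb.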

  • #2
    Negative binomial models can be difficult to fit. I would suggest a Poisson model (log link, Poisson family) with robust standard errors instead. The model is more likely to be estimated and you’ll have consistent point estimates.
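
    A minimal sketch of this suggestion, reusing the variable names from the first post. The corr(ar 1) working correlation is an assumption carried over from the original command; only the family and link change. The /// line continuations assume the code is run from a do-file.

    ```stata
    * Declare the panel structure (same as in the original post)
    xtset turnover_id fiscal_year

    * Poisson GEE: Poisson family, log link, robust standard errors
    xtgee event_count i.award_win i.inside_ceo age i.ind_div_num i.fiscal_year ///
        i.dual_ceo ln_pred_tenure i.ceo_dismissal i.male_ceo ln_ceo_so ///
        board_size_0101 pct_ind_directors_0101 pre_succ_roa_indadj ///
        ln_assets_tot_0101 i.retained_ceo i.successor_tenure event_count_lag2, ///
        family(poisson) link(log) corr(ar 1) vce(robust)
    ```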

    • #3
      Dear Leonardo,

      thank you for the reply!
      Indeed, a Poisson model seems more likely to converge to a solution than the negative binomial model. However, I wonder to what extent the specification of a Poisson model is problematic, given the overdispersion of my dependent variable. The mean value of event_count is ~7.32 and the std. dev. is ~11.03. Is it nevertheless reasonable to define a Poisson model (as you suggested)?
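
      The overdispersion quoted above can be checked with a short sketch. It looks only at the unconditional mean and variance of event_count, so it is a rough diagnostic and not a test of overdispersion conditional on the covariates.

      ```stata
      * Unconditional mean and variance of the outcome
      summarize event_count

      * Variance-to-mean ratio; values well above 1 indicate overdispersion
      display r(Var) / r(mean)

      * The same ratio from the figures quoted above (about 16.6)
      display 11.03^2 / 7.32
      ```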

      • #4
        It’s reasonable to do.

        • #5
          Dear Leonardo,
          Could you please elaborate on why a Poisson model can be a reasonable approach despite overdispersion? I will have to justify my methodological approach in detail and would therefore like to understand why it is statistically acceptable to ignore this kind of overdispersion. Many thanks!

          • #6
            Think of it as not using any assumptions of the Poisson distribution. It’s just like we use linear regression without a normal distribution. The Poisson quasi-MLEs are completely robust as long as you’re not using a random effects approach. GEE is fine. The Poisson GEE is more robust than NegBin. The latter actually assumes the variance is correctly specified, I think. There’s a way to make it robust. I’ll have to look more closely. No question about the robustness of Poisson GEE. It only requires the mean to be correct.
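
            The robustness argument can be made concrete with the GEE estimating equations. This is a sketch in standard textbook notation, none of which appears in the thread: for firm i, y_i is the vector of outcomes, μ_i(β) the vector of conditional means, D_i the derivative of μ_i with respect to β, and V_i the working covariance matrix.

            ```latex
            \sum_{i=1}^{N} D_i(\beta)^{\top} V_i^{-1} \bigl( y_i - \mu_i(\beta) \bigr) = 0,
            \qquad
            \mu_i(\beta) = \exp(X_i \beta),
            \qquad
            D_i(\beta) = \frac{\partial \mu_i(\beta)}{\partial \beta^{\top}}
            ```

            If E(y_i | X_i) = μ_i(β_0), every term in the sum has expectation zero at β_0 whatever V_i is, so a wrong working variance or correlation costs efficiency but not consistency. The vce(robust) option then provides standard errors that do not rely on V_i being correct.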

            • #7
              Thank you very much for the input! As far as I understand, the nbinomial model is similar to Poisson models, but incorporates an additional term to address the excess variance. That's why I incorporated the dispersion parameter "α" in the code above.

              Could you specify what you mean by a "correct mean"? Is there a way to check if this requirement is met for the Poisson GEE?
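
              The "additional term" can be written out explicitly. A sketch of the two variance functions, assuming the NB2 (mean-dispersion) parameterization that nbreg uses by default, with μ denoting E(y|x):

              ```latex
              \text{Poisson: } \operatorname{Var}(y \mid x) = \mu,
              \qquad
              \text{NB2: } \operatorname{Var}(y \mid x) = \mu + \alpha \mu^{2}
              ```

              Purely as an illustration, plugging in α = 0.2686892 and the sample mean 7.32 gives 7.32 + 0.2686892 × 7.32² ≈ 21.7. Both models share the same mean function when a log link is used; they differ only in the assumed variance.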

              • #8
                You’re estimating a model for E(y|x). All methods you’d apply assume this is correct. The point I was making about the Poisson is that’s all that’s required. Depending on how NegBin is implemented, it may require more.

                • #9
                  Jeff Wooldridge Thank you very much for the input! It has helped me to understand the issue better. I gather from the discussion that a Poisson GEE might also be appropriate for my regression problem.

                  • #10
                    You're estimating a model for E(y|x) and so any method assumes that is correctly specified. The point I was making is that you don't need to assume anything else for Poisson regression. It assumes an exponential form, exp(x*b). As usual, you should make the functional form flexible enough (squares, interactions) and you have to argue E(y|x) is interesting to begin with. But that's always the case.

                    The way "GEE" is done, as a two-step feasible GLS estimator, both Poisson and NegBin would need only the mean to be correctly specified. I'm not sure about NegBin, though, as it might be estimating the variance parameter (alpha) along with the mean parameters. So it might be sensitive to variance misspecification. Poisson is not.
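
                    The advice on flexible functional form (squares, interactions) can be sketched with Stata's factor-variable notation. The covariate list is shortened for readability, and the particular choice of a quadratic in age and an interaction of award_win with ln_assets_tot_0101 is hypothetical, not something proposed in the thread. The /// continuations assume a do-file.

                    ```stata
                    * Poisson GEE with a quadratic in age and an award-by-size interaction
                    xtgee event_count i.award_win##c.ln_assets_tot_0101 c.age##c.age ///
                        i.fiscal_year board_size_0101, ///
                        family(poisson) link(log) corr(ar 1) vce(robust)

                    * Joint Wald test of the added terms
                    testparm c.age#c.age i.award_win#c.ln_assets_tot_0101
                    ```

                    If the added terms are jointly insignificant and the coefficients of interest barely move, that supports the simpler specification of the mean.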
