  • 2SRI vs Poisson (IV for non-linear model/NB model)

    Hi, I am looking for advice on the best way to instrument in a negative binomial (NB) model.

    My underlying model is negative binomial, as the dependent variable (DV) is overdispersed. The DV is a count (firm births in a spatial unit), and the independent variables are the stocks of firms in concentric circles going out from each gridded unit. Essentially, I am trying to understand how firm births are driven by existing firms located nearby.

    I want to instrument for endogeneity, as there are likely unobserved factors driving the stock of existing firms (e.g., a neighborhood becoming trendy over time), so I am using historical births as an instrument. I also have planning area fixed effects. What is the difference between 2SRI (Terza et al., 2008) and the ivpoisson command in Stata, and which is more suitable for my case? I also wonder whether ivpoisson is suitable at all, given that it cannot handle fixed effects (although I have read elsewhere on Statalist that the Poisson estimator is fully robust to any kind of over- or underdispersion).

    Can anyone please advise? Thank you! Any help would be very much appreciated.

    Best regards,
    Jasmine

  • #2
    From your description, it sounds like the data are at the spatial-unit level (which is different from planning area), with the dependent variable being the number of firm births in each grid cell. If that is the case, these are not unit fixed effects, but simply planning area dummies, and including them should not be problematic. More generally, there are no issues with unconditional fixed-effects Poisson models, where the fixed effects are included as dummy variables.

    Regarding Poisson: it is true that Poisson quasi–maximum likelihood (QMLE) is often used even when the data are overdispersed. The key point is that the Poisson estimator remains consistent as long as the conditional mean is correctly specified, even if the variance is misspecified (i.e., over- or under-dispersion). One simply needs to compute robust standard errors. In other words, the estimator is robust to variance misspecification, but this does not mean the distributional assumption becomes irrelevant.
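
    The consistency claim above can be checked with a small simulation. The sketch below is illustrative only: the data are simulated (nothing here comes from the original poster's data), the names `poisson_qmle`, `sandwich_se`, `area`, and `frailty` are hypothetical, and the sandwich formula omits the finite-sample corrections that packaged software usually applies, so the numbers would differ slightly from Stata's. It draws negative binomial counts (Poisson with mean-one gamma heterogeneity), fits Poisson by Newton-Raphson, and compares model-based with cluster-robust standard errors.

```python
import numpy as np

def poisson_qmle(X, y, max_iter=50, tol=1e-10):
    """Poisson quasi-ML via Newton-Raphson. Assumes column 0 of X is a constant."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())              # start from the intercept-only fit
    for _ in range(max_iter):
        mu = np.exp(X @ beta)
        hessian = X.T @ (X * mu[:, None])   # X' diag(mu) X
        step = np.linalg.solve(hessian, X.T @ (y - mu))
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def sandwich_se(X, y, beta, cluster):
    """Model-based and cluster-robust SEs (no finite-sample correction)."""
    mu = np.exp(X @ beta)
    bread = np.linalg.inv(X.T @ (X * mu[:, None]))
    scores = X * (y - mu)[:, None]          # per-observation score contributions
    _, codes = np.unique(cluster, return_inverse=True)
    cluster_scores = np.zeros((codes.max() + 1, X.shape[1]))
    np.add.at(cluster_scores, codes, scores)  # sum scores within each cluster
    meat = cluster_scores.T @ cluster_scores
    se_naive = np.sqrt(np.diag(bread))
    se_robust = np.sqrt(np.diag(bread @ meat @ bread))
    return se_naive, se_robust

# Simulated setup (all values are assumptions chosen for illustration).
rng = np.random.default_rng(0)
n, n_areas = 20000, 100
area = rng.integers(0, n_areas, size=n)     # hypothetical planning-area id
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([0.5, 0.3])
alpha = 1.0                                 # NB overdispersion parameter
frailty = rng.gamma(shape=1 / alpha, scale=alpha, size=n)  # mean-one heterogeneity
y = rng.poisson(np.exp(X @ beta_true) * frailty)           # NB counts, Var > mean

beta_hat = poisson_qmle(X, y)
se_naive, se_robust = sandwich_se(X, y, beta_hat, area)
print("estimates:", beta_hat)
print("model-based SE:", se_naive)
print("cluster-robust SE:", se_robust)
```

    Because the conditional mean is still exp(x'b), the point estimates should land close to the true values despite the overdispersion, while the model-based standard errors come out too small relative to the robust ones.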

    Finally, it is important to distinguish this from the incidental parameters problem (IPP). The IPP arises in unconditional fixed-effects nonlinear models, such as logit with many unit-specific parameters, which is why estimators like conditional logit were developed. Poisson is special because the unconditional fixed-effects Poisson estimator (i.e., Poisson with dummy variables for the fixed effects) does not suffer from the incidental parameters problem. This property does not generally extend to other count models such as the negative binomial.


    • #3
      Hi Andrew, thank you so much for your response; it is very helpful. My data have over 110 grid cells per planning area dummy, so I think the IPP may actually be asymptotically negligible.

      1. Would you advise just using robust standard errors with Poisson?
      2. In either case, do you have specific advice on whether to instrument with the control function approach (2SRI) or with 'ivpoisson', and on when one approach is preferable to the other? Thank you!!


      • #4
        In linear models, IV and the control function approach are equivalent. However, in nonlinear models such as Poisson, the control function approach typically requires additional distributional assumptions (often joint normality of the errors). For that reason, I would generally recommend using the GMM estimator in this setting. And yes, you should use cluster-robust standard errors as you have clustered data (to handle both within-cluster correlation and variance misspecification).
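
        To make the contrast concrete, the two approaches can be written side by side. This is a sketch in notation assumed here rather than taken from the thread: y is the count outcome, x the regressors (including the endogenous firm stock x_e), z the instruments together with the exogenous regressors, and v the first-stage error. The GMM estimator only needs one of the two moment conditions to hold, whereas the control function approach (2SRI plugs the estimated first-stage residual in for v) relies on the stated form of the conditional mean given v.

```latex
% GMM, additive error: the residual y - exp(x'beta) is orthogonal to z
\mathbb{E}\!\left[\, z \bigl( y - \exp(x'\beta) \bigr) \right] = 0

% GMM, multiplicative error: the ratio residual is orthogonal to z
\mathbb{E}\!\left[\, z \Bigl( \frac{y}{\exp(x'\beta)} - 1 \Bigr) \right] = 0

% Control function / 2SRI: first stage, then outcome mean with v included
x_e = z'\pi + v, \qquad
\mathbb{E}\!\left[\, y \mid x, v \,\right] = \exp\!\bigl( x'\beta + \rho\, v \bigr)
```

        In Stata these correspond, as far as I am aware, to the additive and multiplicative options of ivpoisson gmm and to ivpoisson cfunction, respectively; the planning area dummies would simply enter x and z as exogenous regressors.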
        Last edited by Andrew Musau; 12 Mar 2026, 11:22.


        • #5
          Okay, thank you Andrew!
