
  • Correcting standard errors for a fixed effects Poisson model

    Hi all,

    I'm looking at the effect of fire occurrence on the number of visits to National Forests and National Parks. I've divided these parks into spatial units, with multiple units from each park. This gave me panel data with 8 years and 2,500 IDs, and I intend to use fixed effects. My dependent variable is count data and is very overdispersed. I've yet to do a formal overdispersion test after accounting for the fixed effects, but the unconditional mean is 5.06 and the standard deviation is 34.01, so I'm presuming the data will remain overdispersed even after accounting for the fixed effects (I will of course check this formally soon).
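A minimal Python sketch with simulated counts from a gamma-Poisson mixture (the numbers are hypothetical, not my actual visit data) shows the kind of variance-to-mean check I have in mind:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical overdispersed counts: a gamma-Poisson mixture (i.e. a
# negative binomial draw), with the gamma chosen so the mean is about 5.
lam = rng.gamma(shape=0.25, scale=20.0, size=10_000)
y = rng.poisson(lam)

# Under a pure Poisson DGP this ratio would be close to 1;
# values far above 1 indicate unconditional overdispersion.
ratio = y.var() / y.mean()
print(y.mean(), y.var(), ratio)
```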

    I've read Cameron and Trivedi's book on count data, and the default approach seems to be estimating a Poisson fixed effects model by maximum likelihood and correcting the standard errors. I have a few questions about this:

    1) I'm a little unclear about how to correct the standard errors. They indicate, at least for cross-section data with a Poisson regression, that there is a robust sandwich standard-error correction that can easily be implemented in Stata. However, I want to correct not only for overdispersion but also for i) the fact that an individual unit's observations over time will be correlated, and ii) the likely spatial correlation, which suggests I should adjust for correlation between units in the same park. I don't know how to correct the standard errors for all of these simultaneously. I think the book mentions panel-robust standard errors that deal with the first two, but I don't know how to incorporate the third.

    2) It seems the benefit of a negative binomial model over Poisson is that both give consistent estimators, but neg bin is more efficient if the data are overdispersed. Consequently, I was thinking of doing a fixed effects negative binomial model. I've read that the procedure implemented in Stata isn't truly doing fixed effects, and some researchers suggest just doing a normal neg bin with individual dummies for each unit. There seems to be no clear consensus on whether this results in an incidental parameters problem, with one paper (Allison and Waterman) indicating it doesn't and that it is superior to Poisson fixed effects. Does anyone have a take on this?

    Thank you so much!

  • #2
    Mansi:
    the deeply missed Joe Hilbe, who was for many years a valued contributor to this forum and the author of leading textbooks, in addition to being a very kind and approachable person off the list, left two comprehensive textbooks on the topics you're interested in, which can well complement the other sources you quote. (For the future, as the FAQ reminds us, please provide full references. By chance, I know the textbooks/papers you refer to, but others on this forum may not, simply because they have never challenged themselves with count data models. Thanks.)
    1) https://www.stata.com/bookstore/modeling-count-data/
    2) https://www.stata.com/bookstore/nega...al-regression/.

    As far as I can remember, cluster-robust standard errors correct for apparent overdispersion, whereas -nbreg- is the way to go when you have detected real overdispersion (as is often the case with -poisson-).
    However, you can still use cluster-robust standard errors with -nbreg- to take autocorrelation into account.
    You're right that -fe- in -xtpoisson- is actually conditional -fe- (incidental parameter bias, you know...); if you're looking for -fe- in the -xtreg- fashion, you should run a pooled -poisson- with -i.panelid- among the predictors.
    Kind regards,
    Carlo
    (Stata 16.0 SE)



    • #3
      There are so many problems with the FE NegBin approach that it should probably never be used. When the model was originally proposed by Hausman, Hall, and Griliches (1984, Econometrica), it was thought that it allows two forms of heterogeneity. In my 1999 Journal of Econometrics paper I showed that, in fact, the model collapses to depend on only one heterogeneity parameter.

      Below is the list of shortcomings of the NB approach. The Poisson suffers none of them. I showed in my 1999 Journal of Econometrics paper that the Poisson FE estimator is completely robust to every failure of the Poisson assumptions -- except, of course, for having the correct conditional mean. The FENB in the panel data case does not come close to nesting the Poisson assumptions unless the heterogeneity is zero.

      Here's a summary of why I tell people to avoid FE NegBin. I hope it helps. There's no need to test for overdispersion -- and with short T panel data, that's not so easy anyway. Because the Poisson FE is completely robust, you wouldn't change anything if you did find it.

      1. FE NegBin imposes a very specific overdispersion of the form (1 + c(i)) where the mean effect is c(i). Why this would ever be true is beyond me.
      a. There's only one source of heterogeneity.
      b. Does not nest the Poisson except in the uninteresting case.
      c. The Poisson estimator allows any kind of variance-mean relationship. Some units can be overdispersed, some underdispersed. The same unit can exhibit both depending on the covariate values.
      2. FE NegBin imposes conditional independence.
      a. Serial correlation is not allowed.
      b. The Poisson FE allows any kind of serial correlation. One just needs to cluster the standard errors.
      3. FE NegBin is not known to be robust to failure of any of its assumptions.
      4. Time constant variables do not drop out in the FENB estimation!
      a. This is what people mean when they say it's not a "true" FE procedure.
      5. The actual estimation of the FENB model often fails to converge, very likely because of the weird overdispersion it requires for every unit i in the cross section.
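For readers who want to see the mechanics of point 2b, here is a minimal, self-contained Python sketch (simulated data; all names are hypothetical). It fits a dummy-variable Poisson by Newton-Raphson -- which for the Poisson, unlike the NegBin, coincides numerically with the conditional FE estimator -- and then computes a unit-clustered sandwich variance. It illustrates the formulas, not Stata's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, T = 50, 8
unit = np.repeat(np.arange(n_units), T)
x = rng.normal(size=n_units * T)
alpha = rng.uniform(0.5, 1.5, size=n_units)          # unit fixed effects
beta_true = 0.3
y = rng.poisson(np.exp(alpha[unit] + beta_true * x))

# Design: x plus a dummy for every unit (no common intercept). For the
# Poisson this dummy-variable MLE equals the conditional FE estimator,
# so there is no incidental-parameters problem in the slope.
X = np.column_stack([x, (unit[:, None] == np.arange(n_units)).astype(float)])

b = np.zeros(1 + n_units)
b[1:] = np.log(y.reshape(n_units, T).mean(axis=1) + 0.1)  # warm start
for _ in range(50):
    mu = np.exp(X @ b)
    step = np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (y - mu))
    b += step
    if np.max(np.abs(step)) < 1e-10:
        break

# Cluster-robust sandwich: sum the scores within each unit BEFORE the
# outer product, so arbitrary within-unit serial correlation is allowed.
mu = np.exp(X @ b)
scores = X * (y - mu)[:, None]
meat = np.zeros((X.shape[1], X.shape[1]))
for g in range(n_units):
    sg = scores[unit == g].sum(axis=0)
    meat += np.outer(sg, sg)
bread = np.linalg.inv(X.T @ (X * mu[:, None]))
V = bread @ meat @ bread
beta_hat, se_beta = b[0], np.sqrt(V[0, 0])
print(beta_hat, se_beta)
```

The key line is summing the scores within each unit before taking the outer products; that is what buys robustness to any serial correlation pattern.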



      • #4
        Dear Carlo Lazzaro and Jeff Wooldridge,

        Thank you so much for your help and advice! I really appreciate it. What I'm gathering from this discussion is that I don't even need to test for overdispersion -- FE NegBin is definitely not the way to go, and as long as I cluster the standard errors, FE Poisson should work just fine.

        I think I'm still a little unsure on these three fronts:

        (i) Carlo Lazzaro, you mentioned that xtpoisson is a conditional fixed effects model rather than a pure fixed effects model. What is the difference between the two? I'm not able to find helpful references to explain that at a glance.

        (ii) I think I'm still uncertain about how to correct standard errors in the FE Poisson model. I see that I can use vce(cluster park_unit) to cluster standard errors at the park level to account for some autocorrelation between units in the same park. However, I can't use vce(robust) at the same time. I'm wondering how I can have standard errors that are both robust to misspecification (such as the presumed equality of the conditional mean and variance) and clustered at the appropriate level. In the worst-case scenario, I would at least want them to be robust to misspecification and clustered at the unit level (so that observations from the same unit across time are not assumed to be independent). Perhaps the vce(robust) option in a panel model like xtpoisson already clusters at the unit level, but I'm not sure.

        (iii) For whatever reason, the AIC and BIC for an xtnbreg model I tried (just to see what would happen) are lower than for the xtpoisson model. Given the problems Jeff Wooldridge mentioned above, should I ignore this entirely, or does it deserve some explanation/attention?

        Thank you so much! I'm very grateful for your support.



        • #5
          Mansi:
          among many other entries on incidental parameter bias, the following one (and related references) can be helpful: http://methods.johndavidpoe.com/2016...rameters-bias/

          -cluster()-ing standard errors at the park level does not necessarily take correlation among observations belonging to the same panel into account.
          That said, if your panel units are actually clustered within parks, you may want to consider a mixed model design (see -help meglm-).
          Last edited by Carlo Lazzaro; 22 Jul 2019, 09:51.
          Kind regards,
          Carlo
          (Stata 16.0 SE)



          • #6
            Dear Carlo Lazzaro,

            Thank you so much! The link was helpful in understanding what is going on. I'll also look into using a mixed model to take into account the spatial correlation.

            Related to point (ii) about clustering standard errors: I read some sources online suggesting that the vce(robust) option for xtpoisson actually clusters at the unit level by default, so those errors are corrected for misspecification as well as for correlation among observations belonging to the same panel. I also read other sources suggesting that vce(cluster unit_id) not only clusters standard errors at the unit level but is also robust to misspecification. Lastly, I read something that suggested using xtpqml instead of xtpoisson to get clustered errors that are also robust to problems like overdispersion.

            Stata documentation for the vce types above isn't very helpful, so I would love any help I can get to figure out which of the above are true.

            Thank you so much!



            • #7
              Mansi: Here are some more thoughts. It appears you don't have panel data but grouped data. Is that correct? It doesn't change much. (Edit: I reread your initial post and see that you do have true panel data, possibly in addition to clusters at the park level.)

              1. If you want to cluster at the level at which you are allowing fixed effects, vce(robust) and using cluster(unit_id) with xtpqml should deliver essentially the same answer. Both are robust to any misspecification of the Poisson distribution and to within-cluster correlation.
              2. xtpqml has one advantage: it allows clustering at a higher level than the fixed effects. So, if you think you have data clustered at, say, the school level, but FEs are included at the student level, you can cluster at the school level. I'm not sure why Stata 15 did not allow this. (I haven't checked with Stata 16).
              3. I wouldn't recommend a multilevel model because then you'll be back to assuming the covariates are independent of the heterogeneity. It is not worth the tradeoff. It's possible to allow both, but it's pretty advanced and not done by Stata (as far as I know). Instead, use the Poisson FE and cluster at the appropriate level.
              4. It's not very informative to compare measures of fit based on log likelihoods. Of course the NegBin could fit better than the Poisson if you compare full distributional specifications. But it's very likely that both are wrong because of the very precise variance/mean relationship and the serial independence they impose. That's why you use FE Poisson: for estimating the conditional mean, you have full robustness. This is a common mistake. In the cross-sectional case, NegBin effectively adds a parameter, so it will fit better than the Poisson in terms of log likelihood. But the Poisson is fully robust to any distributional misspecification.
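To make points 1-2 concrete: changing the clustering level changes only the "meat" of the sandwich variance, i.e. which rows of the score matrix are summed before the outer product; the "bread" (the inverse Hessian) is unchanged. A toy numpy sketch with made-up scores (the group structure -- 8 units nested in 2 parks -- is hypothetical):

```python
import numpy as np

def cluster_meat(scores, groups):
    """Sum per-observation scores within each cluster, then add the outer products."""
    k = scores.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        sg = scores[groups == g].sum(axis=0)
        meat += np.outer(sg, sg)
    return meat

rng = np.random.default_rng(1)
scores = rng.normal(size=(24, 2))    # stand-in per-observation scores, 2 parameters
unit = np.repeat(np.arange(8), 3)    # 8 units observed 3 times each
park = unit // 4                     # 4 units per park -> 2 parks

meat_het = cluster_meat(scores, np.arange(24))  # singleton clusters: plain robust
meat_unit = cluster_meat(scores, unit)          # unit-level clustering
meat_park = cluster_meat(scores, park)          # coarser, park-level clustering
print(np.diag(meat_het), np.diag(meat_unit), np.diag(meat_park))
```

Clustering at a level coarser than the fixed effects (as xtpqml allows) is just the third call: the coarser grouping sums more scores together before the outer product.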
              Last edited by Jeff Wooldridge; 22 Jul 2019, 21:05.



              • #8
                Hi Mansi: to answer your research questions, I think you can first focus on the linear model with fixed effects. Is there a specific reason why you do not want to use a linear model? (e.g. is misspecification a first-order issue in your estimation?) Hope it helps.



                • #9
                  Dear Jeff Wooldridge,

                  Thank you so much for your advice! That's very helpful. I hadn't quite understood the mechanics of how AIC/BIC measure fit and to what extent that translates to unbiased coefficients, so I really appreciate the clarity your comment gave me. What I'm gathering from this discussion is that I can use xtpoisson if I decide to cluster just at the unit level, and xtpqml if I want to cluster at the park level. One last clarifying question if I use xtpqml: the standard errors should be 'robust' irrespective of whether I cluster at the level of the fixed effects or at a higher level, right? I'm guessing this is the case but just want to be sure; the xtpqml documentation isn't very helpful on this front.

                  Thank you so much again!

                  Regards,
                  Mansi





                  • #10
                    Dear Long Hong,

                    Thank you for your question! I chose to use a count data model because my dependent variable can only be a non-negative integer and the linear model doesn't take into account the limited support for this variable. I'm also estimating the linear fixed effects model, but I think a Poisson model makes more sense for recreation count data.

                    Regards,
                    Mansi



                    • #11
                      Mansi:
                      another possible shortcoming of using OLS with a count dependent variable is that the predicted values can be negative (which does not make sense for a regressand that can take on non-negative integer values only).
                      Kind regards,
                      Carlo
                      (Stata 16.0 SE)



                      • #12
                        Hi Mansi, I see your point. My understanding from reading your research question is that it is more about program evaluation than about prediction. If it were about prediction, then you would indeed care about the underlying data generating process, as well as whether the prediction gives you a negative value.

                        A simple linear model could buy you many things, e.g. (1) nice interpretation of the coefficients, (2) simple standard errors, and (3) most importantly, an answer to your research question that is not too different from a Poisson model's if your sample size is big enough. I personally think a linear model is a good first step towards answering your research question, and a Poisson model can be a nice robustness check. This is a tradeoff you may have to make because, as discussed above, a Poisson FE model does not seem to be an easy job.

                        Hope it helps.



                        • #13
                          Generally, I agree that estimating a linear model is a good starting point. And for cases like binary or corner solution responses, there's a clear tradeoff in assumptions because, with small T, one usually must use a correlated random effects approach or do a bias adjustment to the dummy variable approach. But with a count outcome (or nonnegative, unbounded outcomes generally), Poisson FE has significant advantages. It simply replaces the linear functional form for the mean with an exponential functional form. No other assumptions are needed, just as in the linear case. The coefficients in the exponential model are easy to interpret because they act as percentage effects. Estimation is straightforward because the Poisson FE quasi-log likelihood is concave.
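On the percentage-effect reading of the coefficients: under the exponential mean, a coefficient b implies an exact effect of 100*(exp(b) - 1) percent per unit change in the covariate, and 100*b is a good approximation when b is small. A quick check (the value 0.12 is just an illustrative number, not an estimate from this thread):

```python
import numpy as np

beta = 0.12                        # hypothetical Poisson FE coefficient
approx_pct = 100 * beta            # rough reading: about a 12% increase
exact_pct = 100 * np.expm1(beta)   # exact under E[y|x] = exp(xb): about 12.75%
print(approx_pct, exact_pct)
```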

                          If the linear model estimated by FE and the exponential estimated by Poisson FE give qualitatively different estimates I would trust the latter. I've seen cases where the linear model gives coefficients of the opposite and counterintuitive sign, in a statistically significant way. Even for program evaluation I would go with Poisson FE. I'd use Poisson in the cross sectional case with a nonnegative outcome, too, and exploit doubly robust estimation.



                          • #14
                            Dear Jeff Wooldridge,

                            Thank you so much for this explanation! That's really helpful -- particularly the point about how to interpret the results if linear and Poisson models give different results.

                            P.S.: Your work has been really important to my learning of econometrics. Thank you for all you do!



                            • #15
                              Dear Jeff Wooldridge,

                              Actually, one last question. You mentioned in one post that the Poisson model is robust to the failure of every Poisson assumption except the correct specification of the conditional mean. I think I'm a little confused about what correct specification of the conditional mean would mean. How can I test whether or not that is the case?

                              Thank you!
                              Mansi

