Hausman test for negative binomial fixed effects and random effects

Alex Maretta

Join Date: Aug 2020

Posts: 9
#1


09 Aug 2020, 10:36

Hello everybody!

I am trying to model the relationship between the number of patents and oil prices. I have panel count data. I first estimated a Poisson random effects model in Stata and found evidence of overdispersion.

That's why I proceeded to estimate a negative binomial regression. I estimated both fixed and random effects and wanted to compare them with the Hausman test, but got the message that the data fail to meet the asymptotic assumptions of the Hausman test. Are there any other ways I can compare the FE and RE negative binomial (NB) models and choose the one that best fits the data?

Thank you for answering in advance.

Jeff Wooldridge

Join Date: Apr 2014

Posts: 2199
#2

09 Aug 2020, 13:56

You shouldn't use either one of them. First, you only have 7 groups. RE methods are intended for settings with a large number of groups and relatively few time periods per group. All asymptotics, including for the Hausman test, are justified only as N gets large with T fixed -- hardly a good approximation to your situation.

Second, random effects methods make very strong exogeneity assumptions. The estimates above differ by so much that RE is rejected by eyeball.

Finally, fixed effects NB has severe problems. In my view, it should never be used. I can send you slides I use to teach this method, but it is inferior to FE Poisson in every way -- regardless of what you might have been taught or read. Use the fully robust fixed effects Poisson estimator. The problem you have is that clustering is not justified for N = 7 and T = 20.

What I would do is just use pooled Poisson regression and include the 6 group dummy variables. You cannot really cluster, but you can make the standard errors robust to violations of the Poisson assumption. If t is your time variable, it is easy to include a linear time trend:

Code:

poisson y x1 x2 ... xK i.id t, vce(robust)

Putting in a full set of time dummies is probably too much. Maybe add c.t#c.t for a quadratic trend.
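To make the suggestion concrete, here is a minimal sketch on simulated data. The variable names (id, t, x1, y), the data-generating process, and the seed are assumptions made up for illustration; they are not the poster's data.

```stata
* Hypothetical simulated panel: 7 groups observed for 20 periods each
clear
set seed 12345
set obs 7
generate id = _n
generate a = rnormal(0, 0.5)          // group-specific intercept
expand 20                             // 20 rows per group
bysort id: generate t = _n            // time index 1..20
generate x1 = rnormal() + 0.5*a       // regressor correlated with the group effect
generate y = rpoisson(exp(0.5 + 0.3*x1 + 0.02*t + a))

* Pooled Poisson with group dummies and a linear trend, robust (not clustered) SEs
poisson y x1 i.id t, vce(robust)

* Same model with a quadratic trend added
poisson y x1 i.id t c.t#c.t, vce(robust)
```

Here i.id generates the 6 group dummies (one group is the base category), and vce(robust) requests standard errors that do not rely on the Poisson variance assumption.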

JW
Alex Maretta

Join Date: Aug 2020

Posts: 9
#3

10 Aug 2020, 14:25

Dear Jeff,

Thank you for your answer! I have a few more questions and I would be grateful if you answer the main ones:
It is strange, but I have never seen any information stating that RE methods require a large number of groups and few time periods per group. Could you recommend some articles or books on this topic so that I can better understand this assumption and reference these sources in the future?

How do we deal with the overdispersion problem in Poisson? By using fully robust standard errors?

What is the logic of including 6 group dummies? Is it because we control for country-specific effects this way?

By the way, I estimated my regression first using FE Poisson and then using Poisson with dummies. The estimation output is completely different for these models: Renewablesshare even has a different sign in both models. Is it because of unjustified clustering, so we should not use the FE Poisson?

And lastly, how do we interpret the fact that the dummies for the 2nd and 3rd countries are not significant at the 5% significance level?

Sorry for the very long question list, but I am just trying to get it all right and not to make false conclusions. Thank you in advance!

Alex Maretta

Join Date: Aug 2020

Posts: 9
#4

10 Aug 2020, 14:44

I also wanted to add that Poisson with 6 dummies gives a counterintuitive result for the sign of the variable called Renewables share in electricity production. What can explain this?
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2199
#5

10 Aug 2020, 20:32

Alex: As per the FAQ, please put your Stata output between code delimiters by clicking on "#" in the above panel. Your output is so blurry I can't read it.

I guess many books on RE methods don't bother with asymptotics. In my MIT Press book, I only study the small T, N getting large case, and I perhaps never explicitly say the statistical properties are essentially unknown with small N.

If you do the Poisson regression with dummies and FE Poisson estimation properly, you will get the same estimates. The standard errors will differ. The "robust" ones reported by xtpoisson are cluster robust and so they cannot be trusted with your small N. That's why I suggested using "poisson" with the vce(robust) option.

The Poisson estimators are robust to any kind of under- or overdispersion for estimating the conditional mean. That's the most you can hope for with your data structure. And, yes, we account for any variance/mean relationship by computing the robust standard errors.
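The equivalence of the two sets of point estimates can be checked on simulated data. The sketch below uses made-up variable names (country, year, x1, y) and an invented data-generating process, so it only illustrates the mechanics.

```stata
* Hypothetical simulated panel: 7 countries, 20 years
clear
set seed 2020
set obs 7
generate country = _n
generate a = rnormal(0, 0.5)          // country-specific effect
expand 20
bysort country: generate year = 2000 + _n
generate x1 = rnormal() + 0.5*a
generate y = rpoisson(exp(0.5 + 0.3*x1 + 0.02*(year - 2000) + a))

* Conditional fixed effects Poisson
xtset country year
xtpoisson y x1 year, fe
scalar b_fe = _b[x1]

* Pooled Poisson with country dummies, robust (not clustered) SEs
poisson y x1 year i.country, vce(robust)
scalar b_dummy = _b[x1]

display "FE Poisson: " b_fe "   Poisson with dummies: " b_dummy
```

The two coefficients on x1 should agree up to the optimizer's convergence tolerance; only the reported standard errors differ.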

JW
Mohieddine Rahmouni

Join Date: Aug 2020

Posts: 21
#6

11 Aug 2020, 02:36

Jeff Wooldridge

Dear Jeff,

You wrote: "If you do the Poisson regression with dummies and FE Poisson estimation properly, you will get the same estimates. The standard errors will differ. The 'robust' ones reported by xtpoisson are cluster robust and so they cannot be trusted with your small N. That's why I suggested use 'poisson' with the vce(robust) option."

I have the same problem with T large and N small. So, I will just use the pooled Poisson regression with dummy variables. I would like to cite your recommendation in my paper. Could you please give me references that I can cite in my work?

Thanks,

Mohieddine
Alex Maretta

Join Date: Aug 2020

Posts: 9
#7

11 Aug 2020, 15:04

Dear Jeff,

Good day once again. Thank you for your answer!
I got the following output using Poisson with dummies and robust standard errors. I tried to make the output less blurry. I hope you can see it better now.

I wanted to ask you the following questions:
How do we check whether we have overcome the problem of overdispersion?

How do we interpret the fact that the dummies for the 2nd and 3rd countries are not significant at the 5% significance level?

What is the minimum number of groups (in my case, countries) needed to use clustered standard errors? If you could, please advise some literature on this topic that I can use as a reference.

You wrote: "I guess many books on RE methods don't bother with asymptotics. In my MIT Press book, I only study the small T, N getting large case, and I perhaps never explicitly say the statistical properties are essentially unknown with small N."

Is there any way I can state in my work that I cannot use RE because it requires a large number of groups and few time periods per group? Why is this not written anywhere, so that I could reference it? Is it considered common knowledge?

Thank you in advance!
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2199
#8

18 Aug 2020, 13:13

The Poisson estimator overcomes the "problem" of overdispersion by ignoring it in estimation -- because it doesn't matter for estimating the mean function -- but fixes the standard errors. You don't have to do anything or test anything.
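As an illustration, the sketch below simulates overdispersed counts and fits the same Poisson model with default and robust standard errors. The variable names, the gamma mixing used to create overdispersion, and the seed are assumptions for this example only.

```stata
* Hypothetical simulated panel with overdispersed counts
clear
set seed 777
set obs 7
generate id = _n
generate a = rnormal(0, 0.5)
expand 20
bysort id: generate t = _n
generate x1 = rnormal()
generate mu = exp(1 + 0.3*x1 + 0.02*t + a)
* Gamma mixing with mean 1 and variance 1: Var(y|x) = mu + mu^2, mean stays mu
* (max() guards against a draw below rpoisson's lower bound on the mean)
generate y = rpoisson(max(mu*rgamma(1, 1), 1e-6))

* Default SEs assume variance = mean
poisson y x1 i.id t
scalar b_default = _b[x1]
scalar se_default = _se[x1]

* Robust SEs allow any variance/mean relationship
poisson y x1 i.id t, vce(robust)
scalar b_robust = _b[x1]
scalar se_robust = _se[x1]

display "b: " b_default " vs " b_robust
display "se default: " se_default "   se robust: " se_robust
```

The coefficient is the same in both runs because vce(robust) changes only the variance estimate, not the point estimates. With overdispersion of this kind the robust standard error will typically be the larger of the two.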

In a nonlinear model I'd have at least 30 groups before you cluster, and you can't have very many observations per cluster. Christian Hansen has a 2007 Journal of Econometrics paper that discusses the linear case. And RE is a feasible GLS estimator, whose only usable properties are asymptotic -- which means as N increases in this case.

Sometimes variables are insignificant. Why should countries 2 and 3 necessarily differ from country 1? I can't answer that.
Alex Maretta

Join Date: Aug 2020

Posts: 9
#9

18 Aug 2020, 13:23

Thank you, it is very helpful!

The logic that you explain is clear, but I don't understand the following: every regression (pooled, FE and RE Poisson, FE and RE negative binomial) yields a positive regression coefficient for Renewables share. And it seems reasonable that, as the Renewables share in electricity production increases, the number of patents in clean energy increases. So why does Poisson with dummies give us the illogical sign for this variable?
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2199
#10

18 Aug 2020, 13:43

Part of your statement cannot be correct. Did you always include the year time trend? If so,

Code:

xtset country
xtpoisson y x1 ... xK year, fe

and

Code:

poisson y x1 ... xK year i.country

will produce the same betahats. What happens if you include the country dummies in the RE Poisson? I bet something similar.

Controlling for systematic differences across countries is crucial. If you don't have country fixed effects the results just aren't believable. If the sign changes, it basically shows you had a spurious result. Unfortunately, that works against you in this case.

What do you think happens if you run a regression of murders on size of the police force at the county level? You'll get a positive relationship because counties with lots of murders typically respond by having larger police forces. One should only exploit variation across time in doing causal analysis. So unit fixed effects and time effects should be included. That you get the "wrong" sign means there's more work to do. But you should also add the year variable to all models if you have not.