Fixed effects model with high rho and negative corr(u_i, Xb): a problem?

Bruno Hoepers

Join Date: May 2016

Posts: 125
#1

25 Jan 2020, 17:47

I'm running some regression models with panel data from Brazilian municipalities (N = 64,708, Number of groups = 5,561, T = 3).

The results seem ok at first, but:

1. The overall R2 is almost zero: 0.0008

2. rho = .9917

3. corr(u_i, Xb) = -0.9928

What do these statistics suggest about whether an FE model is appropriate here, as opposed to pooled OLS?
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#2

25 Jan 2020, 18:15

What these statistics say to me is that your regression model provides almost no information about your data: you could do just as well by predicting the municipality mean for every observation in that municipality. All of the outcome variation is between municipalities, with almost none within municipalities. Running pooled OLS would not make much sense here, in my view, because all the "action" is at the municipality level and nothing is going on within municipalities over time. So if I were going to try a different model, I would think of -xtreg, be-.

Finally, just in general, -xtreg- models are really meant to be used with large T. For T = 3, -xtgls- might be more appropriate.
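For readers who want to put a number on the rho reported in #1: Stata labels rho as the fraction of variance due to u_i, that is, sigma_u^2 / (sigma_u^2 + sigma_e^2). The following is only a minimal sketch in Python (the function name is made up for illustration) that backs out the implied ratio sigma_u / sigma_e from the reported value:

```python
import math

def sd_ratio_from_rho(rho):
    """Given rho = sigma_u^2 / (sigma_u^2 + sigma_e^2), return sigma_u / sigma_e."""
    if not 0 <= rho < 1:
        raise ValueError("rho must lie in [0, 1)")
    return math.sqrt(rho / (1 - rho))

# rho = .9917 is the value reported in #1
print(round(sd_ratio_from_rho(0.9917), 2))  # → 10.93
```

So the estimated municipality-level component has a standard deviation roughly eleven times that of the idiosyncratic error.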
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#3

26 Jan 2020, 01:41

Bruno:
- you should report the within R-squared;
- you do not report the value of the F-test that appears as a footnote under the -xtreg, fe- output table;
- unless I am missing something, for once in many years I respectfully disagree with Clyde: -xtgls- is designed for small-N, large-T panel datasets, which does not seem to be the case here.

Kind regards,
Carlo
(Stata 19.0)
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#4

26 Jan 2020, 09:43

Sorry, Carlo is right. My error.
Bruno Hoepers

Join Date: May 2016

Posts: 125
#5

26 Jan 2020, 10:35

Carlo: the information you mentioned follows below:

R2 within = 0.0044
R2 between = 0.0010
R2 overall = 0.0008

F-test: F(5560, 59125) = 16.94, Prob > F = 0.0000
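The degrees of freedom of this F-test can be cross-checked by hand. For the test that all u_i = 0 after -xtreg, fe-, the numerator is the number of groups minus 1, and the denominator is N minus the number of groups minus the number of estimated slope coefficients. A small Python sketch (hypothetical function names; N and the group count are taken from #1, and no other degrees-of-freedom adjustment is assumed):

```python
def fe_f_test_df(n_obs, n_groups, n_slopes):
    """Degrees of freedom of the F-test that all u_i = 0 after a fixed-effects fit."""
    return n_groups - 1, n_obs - n_groups - n_slopes

def implied_slopes(n_obs, n_groups, df_denominator):
    """Number of slope coefficients implied by the reported denominator df."""
    return n_obs - n_groups - df_denominator

k = implied_slopes(64708, 5561, 59125)
print(k)                              # → 22
print(fe_f_test_df(64708, 5561, k))   # → (5560, 59125)
```

Under those assumptions the reported F(5560, 59125) is consistent with 5,561 groups and 22 estimated slope coefficients.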
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#6

26 Jan 2020, 11:15

Bruno:
- the within R2 is really low: I would carefully scrutinize your model for possible misspecification, although the most likely reason is the one that Clyde highlighted: scant variation within each panel over time. In sum: do a thorough check of your model;
- the F-test result seems to point you toward -xtreg, fe- rather than pooled OLS;
- you do not report any detail concerning your regression code and your -xtreg, fe- output table;
- you do not specify whether you compared the -fe- with the -re- specification (via -hausman-, assuming that you used default standard errors).

As an aside, please take a look at the FAQ on how to post more effectively: going back and forth on the same post, adding one piece of information at a time, wastes everybody's time and, in all likelihood, will not get you informative suggestions on how to tackle the issue.

Kind regards,
Carlo
(Stata 19.0)
Bruno Hoepers

Join Date: May 2016

Posts: 125
#7

26 Jan 2020, 12:15

Hi Carlo. I'm sorry for not providing all the information. I really thought that what I provided was enough. My fault.

Three things:

- The result of the Hausman test: chi2(10) = 168.96, Prob > chi2 = 0.0000 [FE preferred over RE]

- The results table for the FE model follows below.

- One important correction: I made a mistake about T in my data. It is 12 rather than 3. I have another dataset of Brazilian municipalities for which T = 3, and I mixed the two up because I go back and forth between them.
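The corrected T can be sanity-checked against the counts given in #1 (N = 64,708 and 5,561 groups). A rough sketch in Python, with hypothetical helper names, compares the average number of observations per municipality with the maximum N that each candidate T would allow:

```python
def average_periods(n_obs, n_groups):
    """Average number of observations per panel."""
    return n_obs / n_groups

def max_observations(n_groups, n_periods):
    """Largest possible N if every panel were observed in every period."""
    return n_groups * n_periods

print(round(average_periods(64708, 5561), 2))  # → 11.64
print(max_observations(5561, 3))               # → 16683
print(max_observations(5561, 12))              # → 66732
```

With T = 3 the sample could not exceed 16,683 observations, so N = 64,708 only fits T = 12, with the panel slightly unbalanced.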
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#8

26 Jan 2020, 14:12

I would just add one more remark to Carlo's good advice. When there is such extremely low within-panel variation over time, it makes sense to re-check your data. It is easy to make errors in data management that lead to this situation.
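One way to see how little a variable moves within panels is to split its total sum of squares into a within-panel part and a between-panel part. The sketch below is only an illustration in Python: the function name is invented and the toy data are made up, not taken from the poster's dataset.

```python
from collections import defaultdict
from statistics import fmean

def sum_of_squares_split(panel_ids, values):
    """Return (within_ss, between_ss, total_ss) for one variable in a panel."""
    groups = defaultdict(list)
    for pid, value in zip(panel_ids, values):
        groups[pid].append(value)

    grand_mean = fmean(values)
    within_ss = 0.0
    between_ss = 0.0
    for obs in groups.values():
        group_mean = fmean(obs)
        # Deviations of each observation from its own panel mean
        within_ss += sum((v - group_mean) ** 2 for v in obs)
        # Deviation of the panel mean from the grand mean, weighted by panel size
        between_ss += len(obs) * (group_mean - grand_mean) ** 2
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    return within_ss, between_ss, total_ss

# Hypothetical toy panel: two units, three periods each, almost no change over time
ids = [1, 1, 1, 2, 2, 2]
y = [10.0, 10.0, 11.0, 50.0, 51.0, 50.0]

within_ss, between_ss, total_ss = sum_of_squares_split(ids, y)
print(f"within share of total variation: {within_ss / total_ss:.4f}")
```

In Stata itself, -xtsum- reports the between and within standard deviations of each variable, which serves the same purpose. A within share near zero for a variable that should change over time is a hint that something went wrong in data management.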
Bruno Hoepers

Join Date: May 2016

Posts: 125
#9

26 Jan 2020, 14:41

Thanks Clyde. I will take that into consideration and re-check the data.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#10

26 Jan 2020, 23:40

Bruno:
Have you already checked for possible squared (quadratic) relationships between the predictors and the regressand in your regression model?
Given the substantial time dimension of your dataset (12 years), have you already ruled out autocorrelation in the errors?

Kind regards,
Carlo
(Stata 19.0)
Bruno Hoepers

Join Date: May 2016

Posts: 125
#11

27 Jan 2020, 11:07

Carlo:

I did not check for squared relationships.

Concerning autocorrelation, I ran a portmanteau test for panel serial correlation (-xtistest-) and got the following:
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#12

27 Jan 2020, 14:24

Bruno:
use cluster-robust standard errors, then.

Kind regards,
Carlo
(Stata 19.0)
Bruno Hoepers

Join Date: May 2016

Posts: 125
#13

27 Jan 2020, 15:07

I will. Thanks a lot.