Panel analysis over 4-year average data (fe, re, hausman)

Jae Kim

Join Date: Jul 2018

Posts: 6
#1


01 Jul 2018, 08:16

Dear Statalist

I am new to Stata and am looking for your help with panel analysis.

I have balanced panel data from 2004 to 2015 (12 years, no missing values), and my model is as follows:

Q = a*GDP_int + b*X1 + c*X2 + d*X3 + e

Where
Q: quality of life (Human Development Index) of country i in year t
GDP_int: GDP per capita in year 2004
X1, X2, X3: explanatory variables that affect Q

My first question is how to conduct a panel analysis on data grouped into 3 periods.
I want to run the same model above twice:
1) using the annual data
2) using 4-year averages over 3 periods (2004-2007, 2008-2011, and 2012-2015)

For 1), I wrote the commands below.

Code:

xtset id year
       panel variable:  id (strongly balanced)
        time variable:  year, 2004 to 2015
                delta:  1 unit

xtreg Q GDP_int X1 X2 X3, fe
xtreg Q GDP_int X1 X2 X3, re

For 2),
I created a "period" variable and period averages of Q and the explanatory variables.
I then tried to redefine the data set as panel data with "period" as the time variable, as below.

Code:

gen period=1 // year 2004-2007
replace period=2 if year >=2008 & year <2012
replace period=3 if year >=2012
bysort code period : egen aQ = mean(Q)
bysort code period : egen aX1 = mean(X1)
bysort code period : egen aX2 = mean(X2)
bysort code period : egen aX3 = mean(X3)
xtset id period

But then I got the error message "repeated time values within panel",
so I just ran the panel analysis as below, without redefining the data set.

Code:

xtreg aQ GDP_int aX1 aX2 aX3, fe
xtreg aQ GDP_int aX1 aX2 aX3, re

I want to know whether it is okay to run the analysis like this, without redefining the data set with "period".
The output using the averaged variables showed stronger significance, and I am worried that this is because of duplication.
(For example, for id 1 in period 1, the value of aX1 is the same in every year from 2004 to 2007.)
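
The duplication described here can be checked directly. The following is a minimal sketch on simulated data (the 5 countries and the random X1 values are made up for illustration, not taken from the poster's data set): after the -egen- step, every id-period cell still holds 4 rows carrying the same average.

```stata
clear
set seed 12345
set obs 5                        // 5 hypothetical countries
gen id = _n
expand 12                        // 12 annual rows per country
bysort id: gen year = 2003 + _n  // years 2004 to 2015
gen X1 = rnormal()               // simulated explanatory variable

gen period = 1                   // 2004-2007
replace period = 2 if year >= 2008 & year < 2012
replace period = 3 if year >= 2012
bysort id period: egen aX1 = mean(X1)

* Each id-period pair appears 4 times, which is why -xtset id period- fails
duplicates report id period

* Within a cell, the average is just repeated on every row
list id year period aX1 if id == 1 & period == 1
```

If the regression is run on data in this shape, each period average enters the estimation four times rather than once.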

My second problem is that when I run analyses 1) and 2) above with the fixed-effects option,
Stata returns a message saying "note: GDP_int omitted because of collinearity".
This does not happen with the random-effects model.

I understand this is because "GDP_int" is a time-invariant variable.
My question is whether it is okay to run the Hausman test in this situation,
where the coefficient on "GDP_int" is omitted in the fixed-effects model but estimated in the random-effects model.
If it is not okay, what would be the solution?
Do I have to drop this term from my equation if I want to use the fixed-effects model?

It will be a great help if someone can answer any of these questions.
Thank you very much.
Clyde Schechter

Join Date: Apr 2014

Posts: 30066
#2

01 Jul 2018, 10:22

I am confused about the panel structure in your data. Your -xtset- command suggests that the panels are identified by the variable id. But when you compute your period-average values you use a different variable called code. What's this about?

To do your analysis based on period-average data, you need to reduce the data set to a single observation per panel per period. So, after you create the period variable, instead of the block of -egen- statements you used, do this:

Code:

collapse (mean) Q X1 X2 X3, by(code period)
xtset code period

Stata will not complain now and your data will be properly structured for

Code:

xtreg Q X1 X2 X3

with either fe or re.

Regarding your second question, the issue is whether or not you need an estimate of the effect of GDP_int in order to accomplish your research goals. If you do, then it is simply not possible to do so in a fixed-effects model and no Hausman test can prevail over linear algebra. If you do not need an estimate of the effect of GDP_int, then you should simply drop it from your model altogether. In that situation, the Hausman test on the model without GDP_int might be used to guide your choice between -fe- and -re-.
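
As a rough illustration of the second point, here is a sketch on simulated data. The 50 countries, the coefficient values, and the stored-estimate names fixed and random are all assumptions chosen for the example. It first shows -fe- omitting the time-invariant GDP_int, then runs -hausman- on the model without it.

```stata
clear
set seed 42
set obs 50                       // 50 hypothetical countries
gen id = _n
gen u = rnormal()                // simulated country effect
gen GDP_int = rnormal()          // time-invariant by construction
expand 12
bysort id: gen year = 2003 + _n  // years 2004 to 2015
gen X1 = rnormal() + 0.5*u       // correlated with the country effect
gen X2 = rnormal()
gen X3 = rnormal()
gen Q = 0.5*X1 - 0.3*X2 + 0.2*X3 + 0.4*GDP_int + u + rnormal()
xtset id year

* With GDP_int in the model, -fe- reports it as omitted
xtreg Q GDP_int X1 X2 X3, fe

* Hausman comparison on the model without GDP_int
xtreg Q X1 X2 X3, fe
estimates store fixed
xtreg Q X1 X2 X3, re
estimates store random
hausman fixed random
```

Both stored models contain the same three time-varying regressors, so the test compares like with like.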
Jae Kim

Join Date: Jul 2018

Posts: 6
#3

01 Jul 2018, 22:55

Dear Clyde Schechter

Thank you for your clear answer.
Regarding my code, the variable "id" is the same as the variable "code". Because "code" is a string variable, I created "id", which is numeric.
Sorry for confusing you and thank you for your help again! :-)
Clyde Schechter

Join Date: Apr 2014

Posts: 30066
#4

01 Jul 2018, 23:13

OK. In that case, the -xtset- command will have to use id, not code, as the panel variable, because -xtset- does not allow string variables.
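
Putting the replies in this thread together, the period-average workflow might look like the sketch below. The data are simulated (30 hypothetical countries with made-up coefficients), and the one-line period formula is an assumed shortcut that gives the same three groups as the -gen-/-replace- statements in #1.

```stata
clear
set seed 2018
set obs 30                       // 30 hypothetical countries
gen id = _n                      // numeric panel identifier
gen u = rnormal()                // simulated country effect
expand 12
bysort id: gen year = 2003 + _n  // years 2004 to 2015
gen X1 = rnormal()
gen X2 = rnormal()
gen X3 = rnormal()
gen Q = 0.5*X1 - 0.3*X2 + 0.2*X3 + u + rnormal()

* 1: 2004-2007, 2: 2008-2011, 3: 2012-2015
gen period = 1 + floor((year - 2004)/4)

* One observation per id and period, then declare the panel
collapse (mean) Q X1 X2 X3, by(id period)
xtset id period

xtreg Q X1 X2 X3, fe
xtreg Q X1 X2 X3, re
```

Note that -collapse- keeps only the variables named in it and in by(), so any other variable needed later has to be listed explicitly.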