Dear Statalist
I am new to Stata and am looking for your help with panel analysis.
I have balanced panel data from 2004 to 2015 (12 years, no missing values), and my model is as below:
Q_it = a*GDP_int_i + b*X1_it + c*X2_it + d*X3_it + e_it
Where
Q: quality of life (Human Development Index) of country i in year t
GDP_int: GDP per capita in year 2004
X1, X2, X3: explanatory variables that affect Q
My first question is how to conduct a panel analysis after grouping the years into 3 periods.
I want to estimate the same model above twice:
1) using the annual data
2) using 4-year averages over 3 periods (2004-2007, 2008-2011, and 2012-2015)
For 1), I wrote the commands below.
Code:
xtset id year
       panel variable:  id (strongly balanced)
        time variable:  year, 2004 to 2015
                delta:  1 unit
xtreg Q GDP_int X1 X2 X3, fe
xtreg Q GDP_int X1 X2 X3, re
For 2), I created a "period" variable and period averages of all the variables.
I then tried to redefine the data set as panel data with "period" as the time variable, as below.
Code:
gen period=1 // year 2004-2007
replace period=2 if year >=2008 & year <2012
replace period=3 if year >=2012
bysort code period : egen aQ = mean(Q)
bysort code period : egen aX1 = mean(X1)
bysort code period : egen aX2 = mean(X2)
bysort code period : egen aX3 = mean(X3)
xtset id period
But then I got the error message "repeated time values within panel",
so I just ran the panel analysis as below, without redefining the data set.
Code:
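xtreg aQ GDP_int aX1 aX2 aX3, fe
xtreg aQ GDP_int aX1 aX2 aX3, re
I want to know if it is okay to run the analysis twice like this, without redefining the data set with "period".
The output using the averaged variables showed stronger significance, and I am worried that this is because of duplicated observations.
(For example, for id 1 in period 1, the value of aX1 is the same in every year from 2004 to 2007.)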
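One alternative I have considered, but am not sure about, is to keep only one observation per id and period, so that "period" is unique within each id and xtset id period can be used. A minimal sketch, assuming the variable names above and that "period" has already been created as described:

```stata
* Sketch only: reduce the data to one row per id and period
preserve
* Period means of the time-varying variables; GDP_int is constant within id,
* so its first value within each id-period cell is kept
collapse (mean) Q X1 X2 X3 (first) GDP_int, by(id period)
xtset id period
xtreg Q GDP_int X1 X2 X3, fe
xtreg Q GDP_int X1 X2 X3, re
restore
```

After collapse, Q, X1, X2 and X3 hold the period averages, and each id contributes 3 observations instead of 12, so the repeated rows would no longer enter the estimation.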
My second problem is that when I run analyses 1) and 2) above with the fixed-effects option,
Stata returns a message saying "note: GDP_int omitted because of collinearity".
This does not happen with the random-effects model.
I understand this is because "GDP_int" is a time-invariant variable.
My question is whether it is okay to run a Hausman test in this situation,
where the coefficient on "GDP_int" is omitted in the fixed-effects model but estimated in the random-effects model.
If it is not okay, what would be the solution?
Do I have to drop this term from my equation if I want to use the fixed-effects model?
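For reference, the sequence I have in mind for the annual data is roughly the following. This is only a sketch: the stored-estimate names fe_annual and re_annual are placeholders I made up, and the xtsum line is just there to confirm that GDP_int has no within-country variation.

```stata
xtset id year
* The within standard deviation of GDP_int should be 0 if it is time-invariant
xtsum GDP_int
* Fixed-effects fit (GDP_int is omitted here) and random-effects fit
xtreg Q GDP_int X1 X2 X3, fe
estimates store fe_annual
xtreg Q GDP_int X1 X2 X3, re
estimates store re_annual
* Hausman test: consistent estimator (fe) first, efficient estimator (re) second
hausman fe_annual re_annual
```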
It would be a great help if someone could answer any of these questions.
Thank you very much.