Dear Statalist,
I am building a model for my undergraduate thesis, which I have picked up again after a long break. I am now stuck choosing between the FE and RE estimators for analysing my panel data.
The data set covers 8 variables over 17 years for 47 countries. I have transformed the dependent variable to a logarithmic scale to address heteroskedasticity, and judging by the BP test (hettest) this seems to have solved the problem:
. hettest
Breusch–Pagan/Cook–Weisberg test for heteroskedasticity
Assumption: Normal error terms
Variable: Fitted values of log_turistbesök
H0: Constant variance
chi2(1) = 0.11
Prob > chi2 = 0.7355
Further, I have run an LM test using xttest0. As I understand it, rejecting the null hypothesis indicates that POLS is not appropriate. My test results show that we reject H0:
. xttest0
Breusch and Pagan Lagrangian multiplier test for random effects
log_turistbesök[land_id,t] = Xb + u[land_id] + e[land_id,t]
Estimated results:
          |       Var     SD = sqrt(Var)
 ---------+-----------------------------
log_tur~k |   2.393221       1.547004
        e |   .2057564       .4536038
        u |   1.609991       1.268854
Test: Var(u) = 0
chibar2(01) = 842.29
Prob > chibar2 = 0.0000
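As a side calculation, the variance components above imply how much of the total variance is attributable to the country-level effect. This is only a sketch, assuming the usual definition of the share that xtreg reports as rho; the symbols \sigma_u^2 and \sigma_e^2 simply stand for Var(u) and Var(e) from the table:

```latex
\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
     = \frac{1.609991}{1.609991 + 0.2057564}
     = \frac{1.609991}{1.8157474}
     \approx 0.887
```

So roughly 89% of the variance in the composite error would sit in the country effect, which is consistent with the strong rejection of Var(u) = 0.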
From there I moved on to a Hausman test using hausman fe re. From the results, I conclude that we do not reject H0, that is, the difference between the FE and RE coefficients does not seem to be systematic:
. hausman fe re
Test of H0: Difference in coefficients not systematic
chi2(7) = (b-B)'[(V_b-V_B)^(-1)](b-B)
= 9.07
Prob > chi2 = 0.2479
(V_b-V_B is not positive definite)
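For reference, the output above comes from a command sequence roughly like the following. This is a minimal sketch rather than my exact do-file: x1-x7 are placeholder names for the regressors (not the real variable names), and the stored estimate names fe and re match those used in the hausman call.

```stata
* Declare the panel structure (land_id and t as shown in the xttest0 output)
xtset land_id t

* Pooled OLS, followed by the Breusch-Pagan/Cook-Weisberg test
* x1-x7 are placeholders for the actual regressors
regress log_turistbesök x1 x2 x3 x4 x5 x6 x7
estat hettest

* Random effects, followed by the Breusch-Pagan LM test for Var(u) = 0
xtreg log_turistbesök x1 x2 x3 x4 x5 x6 x7, re
xttest0
estimates store re

* Fixed effects
xtreg log_turistbesök x1 x2 x3 x4 x5 x6 x7, fe
estimates store fe

* Hausman test: FE (consistent under both hypotheses) against RE (efficient under H0)
hausman fe re
```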
With this information, I have started leaning towards using RE estimation on my data. As I understand it, this assumes that the sample is randomly selected from the population. The data in its original state is not randomly selected, as it includes all available individuals and years. I have therefore tried to create a random sample using sample 50. After creating this data set, I ran the same tests again, and while the test statistics are different, the conclusions about rejecting each H0 remain the same for all tests.
During my statistics courses, teachers were clear about the strict conditions for RE estimation, and that the method is often inappropriate. This makes me uncertain and uncomfortable with my conclusions. Have I done enough to show that RE is the best-fitting model for my data, or am I missing some assumption or test that could be crucial for the choice of model? What more can I do to determine the most appropriate method for my analysis?
Thank you to anyone willing to assist with this issue,
WG