I am deciding between a fixed-effects and a random-effects panel regression. The outcome variable is the labor force participation rate "lfp_rate", and the one independent variable is the poverty rate "pov_". The data cover 2011-2019, but completeness varies by both country and year.
dataex countrycode lfp_rate pov_ year

----------------------- copy starting from the next line -----------------------
[CODE]
* Example generated by -dataex-. To install: ssc install dataex
clear
input str3 countrycode double lfp_rate float pov_ int year
"ARG" 59.96 1.4 2013
"ARG"  59.6 1.1 2011
"ARG" 58.81 1.1 2019
"ARG" 58.81  .9 2017
"ARG" 58.81 1.5 2018
"ARG" 60.29   1 2012
"ARG" 60.36 1.2 2015
"ARG" 60.65  .9 2016
"ARG" 59.44   . 2014
"ARM" 59.39 1.1 2013
"ARM" 63.02 1.4 2017
"ARM" 60.82  .9 2019
"ARM" 59.42  .9 2014
"ARM" 62.29 1.5 2015
"ARM" 59.67 1.3 2018
"ARM" 54.11 1.8 2012
"ARM" 60.72 1.2 2016
"ARM" 58.65 1.2 2011
To decide between the two types of panel regression, I followed the instructions here, where I first ran a fixed-effects regression:
```
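xtreg lfp_rate pov_, fe
estimates store fixed

Fixed-effects (within) regression               Number of obs     =        571
Group variable: numeric_co~e                    Number of groups  =        124

R-sq:                                           Obs per group:
     within  = 0.0136                                         min =          1
     between = 0.0766                                         avg =        4.6
     overall = 0.0138                                         max =          9

                                                F(1,446)          =       6.14
corr(u_i, Xb)  = -0.4866                        Prob > F          =     0.0136

------------------------------------------------------------------------------
    lfp_rate |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        pov_ |  -.3172859   .1280625    -2.48   0.014    -.5689667   -.0656051
       _cons |   62.67351   .6149928   101.91   0.000     61.46486    63.88215
-------------+----------------------------------------------------------------
     sigma_u |  13.733666
     sigma_e |  3.1857261
         rho |  .94893979   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(123, 446) = 31.33                   Prob > F = 0.0000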
```
I then ran a random-effects model:
```
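// Random effects
xtreg lfp_rate pov_, re
estimates store random

Random-effects GLS regression                   Number of obs     =        571
Group variable: numeric_co~e                    Number of groups  =        124

R-sq:                                           Obs per group:
     within  = 0.0136                                         min =          1
     between = 0.0766                                         avg =        4.6
     overall = 0.0138                                         max =          9

                                                Wald chi2(1)      =       3.70
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0544

------------------------------------------------------------------------------
    lfp_rate |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        pov_ |   .0917747   .0477002     1.92   0.054    -.0017161    .1852654
       _cons |   59.40576   1.089927    54.50   0.000     57.26954    61.54198
-------------+----------------------------------------------------------------
     sigma_u |  10.228442
     sigma_e |  3.1857261
         rho |  .91157217   (fraction of variance due to u_i)
------------------------------------------------------------------------------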
```
Finally, I ran a Hausman test to decide between the two models:
```
hausman fixed random

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     fixed        random       Difference          S.E.
-------------+----------------------------------------------------------------
        pov_ |   -.3172859     .0917747       -.4090606        .1188473
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(1) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =       11.85
                Prob>chi2 =      0.0006
```
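As a sanity check on the arithmetic: with a single regressor, the Hausman statistic reduces to the squared ratio of the coefficient difference to its standard error. A minimal sketch that uses only the numbers reported in the table above (nothing is re-estimated); it should reproduce, up to rounding, the reported chi2(1) and Prob>chi2:

```
* Hausman statistic with one coefficient: (b - B)^2 / [SE(b - B)]^2
display (-.4090606 / .1188473)^2

* Upper-tail p-value of a chi-squared(1) at the reported statistic
display chi2tail(1, 11.85)
```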
As I understand it, I should use a fixed-effects model if Prob>chi2 is statistically significant at the .05 level. Given that the result is Prob>chi2 = 0.0006, should I therefore use an FE model?
Are there other checks I can run to assess whether to proceed with an FE or RE model?
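One alternative I have seen suggested is a Mundlak-style check: add the country-level mean of the regressor to the RE model and test whether its coefficient is zero, which also allows cluster-robust standard errors. Below is a rough sketch of what I think that would look like, not something I have run on the full data. The names `id` and `pov_mean` are hypothetical (my actual group variable appears abbreviated in the output above), and I am assuming the means should be taken over the estimation sample because the panel is unbalanced.

```
* Hypothetical numeric panel identifier built from the string country code
encode countrycode, generate(id)
xtset id year

* Country-level mean of the regressor, over observations used in estimation
egen double pov_mean = mean(pov_) if !missing(lfp_rate, pov_), by(id)

* RE model augmented with the group mean, with cluster-robust standard errors
xtreg lfp_rate pov_ pov_mean, re vce(cluster id)

* H0: the coefficient on the group mean is zero (RE assumption not rejected)
test pov_mean
```

If this is a sensible approach, my understanding is that rejecting the null in the last step would point toward FE, in the same way as the Hausman test.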