Hi. I am currently working on my dissertation project. I have panel data on 29 provinces over 11 years (2010 to 2020), with 6 independent variables and no categorical variables.
Initially, I found that my fixed-effects model has heteroskedasticity and autocorrelation problems, so I tried the robust option. The VIFs I get after xtreg with robust are really high. However, after reading some discussions in the forum, I understood that with only 29 provinces I should not use the robust option, and that I should not use the Hausman test if I do use it. I also read some advice to perform linktest by hand, but I still don't understand how it works.
I used ln for several variables because their histograms showed that the distributions are not normal. I am also not sure whether I should add i.year to my FE syntax. I would appreciate any advice on these problems.
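On the i.year question, here is a minimal sketch of what that specification could look like, assuming the data have already been declared with xtset prov_id year. It only illustrates the syntax; whether year effects belong in this model is still the open question.

```stata
* Assumes the panel has been declared: xtset prov_id year
* Fixed-effects model with year dummies added (two-way fixed effects)
xtreg pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed i.year, fe

* Joint test that all year dummies are zero; a small p-value
* would suggest keeping them in the model
testparm i.year
```

With 11 years there are 10 year dummies (one year is the base), so testparm reports a joint test with 10 numerator degrees of freedom.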
This is what I have done so far:
1. I used the corr command to check for any pairwise correlation above 0.75. There was none.
2. I ran reg followed by vif to check for multicollinearity. All VIFs are below 10.
3. I estimated the fixed-effects model without the robust option. (Before I checked for heteroskedasticity, I was quite happy with the result, since two variables were significant.)
4. I estimated the random-effects model without the robust option.
5. I ran the Hausman test. The result indicated that fixed effects is more appropriate.
6. I tested for heteroskedasticity with xttest3.
7. I tested for autocorrelation with xtserial.
8. I used estat vce, corr to diagnose multicollinearity.
9. I also checked the VIFs after xtreg with vif, uncentered.
10. I also ran xtoverid.
Additionally, I performed the joint F-test, and the result indicated that fixed effects are indeed needed. The LM test indicated that random effects are needed as well. Since the Hausman test favours fixed effects, I figured I should go with the fixed-effects model.
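The LM test is not among the outputs pasted in this post. A minimal sketch of how such a test is usually run, assuming it was the Breusch-Pagan LM test from xttest0 after the random-effects model (the post does not say which command was used):

```stata
* Assumes the panel has been declared: xtset prov_id year
* Random-effects model, output suppressed
quietly xtreg pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed, re

* Breusch-Pagan Lagrange multiplier test of H0: Var(u_i) = 0
* (rejecting H0 favours random effects over pooled OLS)
xttest0
```

The joint F-test, by contrast, is the "F test that all u_i=0" line printed at the bottom of the xtreg, fe output.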
I apologize for the long question. In summary, my problems are heteroskedasticity, autocorrelation, signs of multicollinearity (although I am not sure, since the initial corr result shows no high correlation between the independent variables), and insignificant variables. I have tried many combinations of variables and the results are the same: insignificant variables, heteroskedasticity, and so on.
Thank you so much for your time and assistance.
Code:
. egen prov_id = group (province)
. xtset prov_id year

Panel variable: prov_id (strongly balanced)
 Time variable: year, 2010 to 2020
         Delta: 1 unit
Code:
. * data summary *
. xtsum pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed

Variable         |      Mean  Std. dev.        Min        Max |    Observations
-----------------+--------------------------------------------+----------------
pop65    overall |  5.286301   1.833573       1.54       10.5 |     N =     319
         between |              1.80598   1.931818   10.42727 |     n =      29
         within  |             .4505427   3.736301   7.326301 |     T =      11
                 |                                            |
lntfr    overall |  .9595275   .1336687   .6678294   1.302913 |     N =     319
         between |             .1239804   .7653753   1.233511 |     n =      29
         within  |             .0545853   .7779375   1.109783 |     T =      11
                 |                                            |
lngdpcap overall |  10.30609   .5308984   9.139573   12.07147 |     N =     319
         between |             .5233785   9.312996   11.86483 |     n =      29
         within  |             .1286158   9.908724   10.74314 |     T =      11
                 |                                            |
lifeexp  overall |  69.42608   2.436952      63.82      74.99 |     N =     319
         between |             2.425443   65.12727       74.6 |     n =      29
         within  |             .4908727   68.05699    70.8479 |     T =      11
                 |                                            |
lnpopu~n overall |  3.662959   .3840131   2.827314    4.60517 |     N =     319
         between |             .3872141   2.923856    4.60517 |     n =      29
         within  |              .047395   3.434371   3.832794 |     T =      11
                 |                                            |
meanye~u overall |  8.412665   .8615742       6.53       10.7 |     N =     319
         between |             .7618745   7.222727   10.56364 |     n =      29
         within  |             .4243944   7.262665   9.802665 |     T =      11
                 |                                            |
fpaide~d overall |  55.74514   6.781141       40.2       80.3 |     N =     319
         between |             6.316996   42.76364   76.41818 |     n =      29
         within  |             2.708189    47.3815    65.3815 |     T =      11
1. I used corr command to see if there's any correlation higher than 0.75. The result suggested no high correlation
Code:
corr pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed
Code:
. corr pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed
(obs=319)

             |    pop65    lntfr lngdpcap  lifeexp lnpopu~n meanye~u fpaide~d
-------------+---------------------------------------------------------------
       pop65 |   1.0000
       lntfr |  -0.5376   1.0000
    lngdpcap |  -0.3408  -0.2165   1.0000
     lifeexp |   0.4290  -0.6272   0.4329   1.0000
  lnpopurban |   0.2166  -0.5700   0.5091   0.6036   1.0000
meanyearsedu |   0.0143  -0.2484   0.5286   0.3607   0.4317   1.0000
fpaidemplo~d |   0.2707  -0.1676  -0.0185   0.1182   0.0852  -0.0450   1.0000
Code:
reg pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed
vif
Code:
. reg pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed

      Source |       SS           df       MS      Number of obs   =       319
-------------+----------------------------------   F(6, 312)       =     96.22
       Model |  694.035273         6  115.672546   Prob > F        =    0.0000
    Residual |  375.077162       312  1.20217039   R-squared       =    0.6492
-------------+----------------------------------   Adj R-squared   =    0.6424
       Total |  1069.11244       318  3.36198879   Root MSE        =    1.0964

-------------------------------------------------------------------------------
        pop65 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
--------------+----------------------------------------------------------------
        lntfr |  -4.596609   .6382869    -7.20   0.000      -5.8525   -3.340718
     lngdpcap |  -2.370571   .1511881   -15.68   0.000    -2.668048   -2.073094
      lifeexp |   .3157796   .0363818     8.68   0.000      .244195    .3873642
   lnpopurban |   .2589153   .2316014     1.12   0.264    -.1967828    .7146134
 meanyearsedu |    .268064   .0867419     3.09   0.002     .0973909    .4387371
fpaidemployed |   .0414489    .009249     4.48   0.000     .0232507    .0596471
        _cons |   6.690743   2.726136     2.45   0.015     1.326807    12.05468
-------------------------------------------------------------------------------

. vif

    Variable |       VIF       1/VIF
-------------+----------------------
  lnpopurban |      2.09    0.477930
     lifeexp |      2.08    0.480925
       lntfr |      1.93    0.519335
    lngdpcap |      1.70    0.586788
meanyearsedu |      1.48    0.676855
fpaidemplo~d |      1.04    0.961055
-------------+----------------------
    Mean VIF |      1.72
Code:
xtreg pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed, fe
Code:
. xtreg pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed, fe

Fixed-effects (within) regression               Number of obs     =        319
Group variable: prov_id                         Number of groups  =         29

R-squared:                                      Obs per group:
     Within  = 0.3317                                         min =         11
     Between = 0.1525                                         avg =       11.0
     Overall = 0.1620                                         max =         11

                                                F(6, 284)         =      23.49
corr(u_i, Xb) = -0.1498                         Prob > F          =     0.0000

-------------------------------------------------------------------------------
        pop65 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
--------------+----------------------------------------------------------------
        lntfr |  -2.311204   .6861332    -3.37   0.001    -3.661756    -.960652
     lngdpcap |   .3397021   .3815607     0.89   0.374    -.4113438    1.090748
      lifeexp |   .3442395   .1246646     2.76   0.006     .0988558    .5896233
   lnpopurban |  -.3996937   .5352642    -0.75   0.456    -1.453282    .6538948
 meanyearsedu |  -.1877593   .1341237    -1.40   0.163    -.4517621    .0762434
fpaidemployed |  -.0175182   .0089955    -1.95   0.052    -.0352244    .0001881
        _cons |  -15.87607   7.224003    -2.20   0.029    -30.09545   -1.656684
--------------+----------------------------------------------------------------
      sigma_u |  1.6829894
      sigma_e |  .38975102
          rho |  .94909934   (fraction of variance due to u_i)
-------------------------------------------------------------------------------
F test that all u_i=0: F(28, 284) = 78.04                    Prob > F = 0.0000
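On the linktest-by-hand advice mentioned at the top of this post: as far as I understand it, the idea is to mimic what linktest does after regress, since linktest itself is not available after xtreg. A minimal sketch under that assumption; the variable names xb_hat and xb_hat_sq are hypothetical, chosen only for illustration.

```stata
* Assumes the panel has been declared: xtset prov_id year
quietly xtreg pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed, fe

* Linear prediction (xb) and its square; both names are hypothetical
predict double xb_hat, xb
generate double xb_hat_sq = xb_hat^2

* Re-estimate the model on the prediction and its square
xtreg pop65 xb_hat xb_hat_sq, fe

* A significant squared term hints at a misspecified functional form
test xb_hat_sq
```

If the squared term is not significant, the test gives no evidence against the functional form of the regressors.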
Code:
xtreg pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed, re
Code:
xtreg pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed, re

Random-effects GLS regression                   Number of obs     =        319
Group variable: prov_id                         Number of groups  =         29

R-squared:                                      Obs per group:
     Within  = 0.3127                                         min =         11
     Between = 0.4259                                         avg =       11.0
     Overall = 0.4189                                         max =         11

                                                Wald chi2(6)      =     152.56
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

-------------------------------------------------------------------------------
        pop65 | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
--------------+----------------------------------------------------------------
        lntfr |  -3.135798   .6510992    -4.82   0.000    -4.411929   -1.859667
     lngdpcap |   -.649408    .302663    -2.15   0.032    -1.242617   -.0561994
      lifeexp |   .3743336   .0849561     4.41   0.000     .2078228    .5408444
   lnpopurban |  -.0605336   .4189387    -0.14   0.885    -.8816384    .7605712
 meanyearsedu |  -.0495519   .1127146    -0.44   0.660    -.2704684    .1713646
fpaidemployed |  -.0099397   .0086585    -1.15   0.251    -.0269101    .0070307
        _cons |  -9.807787   5.293423    -1.85   0.064    -20.18271    .5671316
--------------+----------------------------------------------------------------
      sigma_u |  1.1319697
      sigma_e |  .38975102
          rho |  .89401384   (fraction of variance due to u_i)
-------------------------------------------------------------------------------
Code:
quietly xtreg pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed, fe
estimates store fixed
quietly xtreg pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed, re
estimates store random
hausman fixed random, sigmamore
Code:
. hausman fixed random, sigmamore

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     fixed        random       Difference       Std. err.
-------------+----------------------------------------------------------------
       lntfr |   -2.311204    -3.135798        .8245942        .2795573
    lngdpcap |    .3397021     -.649408        .9891101         .252318
     lifeexp |    .3442395     .3743336        -.030094        .0967322
  lnpopurban |   -.3996937    -.0605336       -.3391601         .360623
meanyearsedu |   -.1877593    -.0495519       -.1382075        .0805032
fpaidemplo~d |   -.0175182    -.0099397       -.0075785        .0033659
------------------------------------------------------------------------------
                          b = Consistent under H0 and Ha; obtained from xtreg.
           B = Inconsistent under Ha, efficient under H0; obtained from xtreg.

Test of H0: Difference in coefficients not systematic

    chi2(6) = (b-B)'[(V_b-V_B)^(-1)](b-B)
            =  25.08
Prob > chi2 = 0.0003
Code:
xtreg pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed, fe
xttest3
Code:
. xttest3

Modified Wald test for groupwise heteroskedasticity
in fixed effect regression model

H0: sigma(i)^2 = sigma^2 for all i

chi2 (29)  =   73302.94
Prob>chi2 =      0.0000
Code:
xtserial pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed
Code:
. xtserial pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed

Wooldridge test for autocorrelation in panel data
H0: no first-order autocorrelation
    F(  1,      28) =    638.418
           Prob > F =      0.0000
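Both tests reject, which is why the robust option came up in the first place. For reference, a minimal sketch of the cluster-robust syntax, assuming clustering on prov_id. It shows the commands only; whether 29 clusters are enough for these standard errors to be reliable is exactly what I am unsure about. Since hausman does not accept robust or clustered estimates, the sketch uses xtoverid (community-contributed, already used elsewhere in this post) after the clustered random-effects model instead.

```stata
* Assumes the panel has been declared: xtset prov_id year

* Random effects with cluster-robust standard errors, then a
* robust fixed-vs-random test in place of hausman
xtreg pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed, re vce(cluster prov_id)
xtoverid

* Fixed effects with standard errors clustered on province;
* for xtreg, fe this is what vce(robust) amounts to as well
xtreg pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed, fe vce(cluster prov_id)
```

The coefficients are the same as in the non-robust runs; only the standard errors, and therefore the significance levels, change.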
Code:
xtreg pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed, fe
estat vce, corr
Code:
. estat vce, corr

Correlation matrix of coefficients of xtreg model

        e(V) |    lntfr lngdpcap  lifeexp lnpopu~n meanye~u fpaide~d    _cons
-------------+---------------------------------------------------------------
       lntfr |   1.0000
    lngdpcap |   0.2507   1.0000
     lifeexp |   0.1683  -0.3903   1.0000
  lnpopurban |  -0.2512  -0.3562  -0.0661   1.0000
meanyearsedu |   0.1599  -0.2257  -0.6059   0.2122   1.0000
fpaidemplo~d |  -0.0733  -0.2164   0.0498   0.3418   0.1803   1.0000
       _cons |  -0.3810   0.0473  -0.8918  -0.0323   0.6079  -0.1255   1.0000
Code:
xtreg pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed, fe
vif, uncentered
Code:
. vif, uncentered

    Variable |       VIF       1/VIF
-------------+----------------------
    lngdpcap |    633.03    0.001580
     lifeexp |    586.08    0.001706
  lnpopurban |    192.76    0.005188
meanyearsedu |    142.67    0.007009
fpaidemplo~d |     68.55    0.014588
       lntfr |     56.15    0.017811
-------------+----------------------
    Mean VIF |    279.87
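These uncentered VIFs may not be comparable to the ones from reg: as far as I understand, uncentered VIFs tend to be large whenever a variable's mean is far from zero relative to its spread, which is the case for most of my variables. One way to look at collinearity in the within dimension is to demean every variable by province and compute the usual VIFs on the demeaned data. A minimal sketch, assuming that approach is appropriate here; the m_ and w_ prefixes are hypothetical names.

```stata
* Assumes the panel identifier prov_id exists (see the egen/xtset step)
* Demean each variable by province; m_ and w_ prefixes are hypothetical
foreach v of varlist pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed {
    bysort prov_id: egen double m_`v' = mean(`v')
    generate double w_`v' = `v' - m_`v'
    drop m_`v'
}

* OLS on the demeaned data reproduces the fixed-effects slope coefficients
* (the standard errors differ because the degrees of freedom are not adjusted)
regress w_pop65 w_lntfr w_lngdpcap w_lifeexp w_lnpopurban w_meanyearsedu w_fpaidemployed

* Centered VIFs for the within variation only
estat vif
```

Only the VIF table from this regression is of interest; its standard errors should not be used for inference.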
Code:
xtreg pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed, fe
xtreg pop65 lntfr lngdpcap lifeexp lnpopurban meanyearsedu fpaidemployed, re
xtoverid
Code:
xtoverid

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re
Sargan-Hansen statistic  26.746  Chi-sq(6)    P-value = 0.0002