Dear all,

I am working on my dissertation, which estimates an econometric model of the determinants of CO2 emissions for Brazilian states over 2005-2015, excluding 2008 (N=27, T=10). According to the literature, my regressors are likely to be endogenous, so I chose the System-GMM approach because of its good small-sample properties relative to other panel data estimators.

First, I estimated a dynamic model with the lagged dependent variable on the right-hand side of the equation. The results differed from what I expected in terms of the signs and statistical significance of the coefficients. I then ran additional tests on my variables, which showed strong multicollinearity between the lagged dependent variable and some of the regressors, which can bias my estimates.

Because of this, I dropped the lagged dependent variable and ran the model without the dynamic term. The results were more reasonable in light of the relevant literature. However, I am not confident that my equation is well specified or that the specification tests are acceptable.

The estimated equation follows:

Code:

xtabond2 I P Arpc V QR year2 year3 year5-year11, gmm(P Arpc V QR, laglimits(1 2) collapse equation(diff)) gmm(P Arpc V QR, laglimits(1 1) collapse eq(level)) ivstyle(year2 year3 year5-year11, eq(level)) twostep small robust orthog
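
Written out as an equation, the command above corresponds roughly to the static specification sketched below. The symbols \beta, \tau, \mu_i and \varepsilon_{it} are notation introduced here for illustration only, and the sketch assumes a state-specific effect \mu_i and an idiosyncratic error \varepsilon_{it}. Whether the variables enter in logs or in levels is not stated in the command, so the equation uses the variable names as they appear there.

```latex
% Static panel specification implied by the xtabond2 variable list.
% i indexes states (Ufs), t indexes years (AnoStata).
% year_{s,t} is the dummy for period s; nine dummies are included
% (year2, year3, year5-year11), matching the command.
\[
I_{it} = \beta_0 + \beta_1 P_{it} + \beta_2 Arpc_{it} + \beta_3 V_{it}
       + \beta_4 QR_{it}
       + \sum_{s \in \{2,3,5,\dots,11\}} \tau_s \, year_{s,t}
       + \mu_i + \varepsilon_{it}
\]
```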

My results:

Code:

Dynamic panel-data estimation, two-step system GMM
------------------------------------------------------------------------------
Group variable: Ufs                             Number of obs      =       270
Time variable : AnoStata                        Number of groups   =        27
Number of instruments = 22                      Obs per group: min =        10
F(13, 26)     =     84.30                                      avg =     10.00
Prob > F      =     0.000                                      max =        10
------------------------------------------------------------------------------
             |              Corrected
           I |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           P |    .810655   .1350605     6.00   0.000     .5330342    1.088276
        Arpc |   .8631221   .1800172     4.79   0.000     .4930914    1.233153
           V |   .1794541   .2069648     0.87   0.394     -.245968    .6048763
          QR |   .2443307   .1923399     1.27   0.215    -.1510296     .639691
       year2 |  -.1213536    .050725    -2.39   0.024    -.2256202   -.0170869
       year3 |  -.0953036    .103279    -0.92   0.365    -.3075966    .1169895
       year5 |   -.216924     .20192    -1.07   0.293    -.6319765    .1981284
       year6 |  -.2111071    .247421    -0.85   0.401    -.7196882    .2974741
       year7 |  -.2329856   .2552947    -0.91   0.370    -.7577515    .2917802
       year8 |   -.178021    .262647    -0.68   0.504    -.7178996    .3618576
       year9 |  -.1452306   .2614932    -0.56   0.583    -.6827376    .3922764
      year10 |  -.1175916   .2612318    -0.45   0.656    -.6545612     .419378
      year11 |  -.1408464   .2415388    -0.58   0.565    -.6373365    .3556437
       _cons |  -2.782019   1.229473    -2.26   0.032    -5.309237   -.2548011
------------------------------------------------------------------------------
Instruments for orthogonal deviations equation
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(1/2).(P Arpc V QR) collapsed
Instruments for levels equation
  Standard
    year2 year3 year5 year6 year7 year8 year9 year10 year11
    _cons
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    DL.(P Arpc V QR) collapsed
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z =  -0.41  Pr > z =  0.685
Arellano-Bond test for AR(2) in first differences: z =   1.31  Pr > z =  0.190
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(8)    =  14.20  Prob > chi2 =  0.077
  (Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(8)    =   4.75  Prob > chi2 =  0.784
  (Robust, but weakened by many instruments.)

Difference-in-Hansen tests of exogeneity of instrument subsets:
  GMM instruments for levels
    Hansen test excluding group:     chi2(4)    =   2.15  Prob > chi2 =  0.709
    Difference (null H = exogenous): chi2(4)    =   2.60  Prob > chi2 =  0.626
  gmm(P Arpc V QualidadeRodovias, collapse eq(diff) lag(1 2))
    Hansen test excluding group:     chi2(0)    =   0.18  Prob > chi2 =      .
    Difference (null H = exogenous): chi2(8)    =   4.57  Prob > chi2 =  0.803
  gmm(P Arpc V QualidadeRodovias, collapse eq(level) lag(1 1))
    Hansen test excluding group:     chi2(4)    =   2.15  Prob > chi2 =  0.709
    Difference (null H = exogenous): chi2(4)    =   2.60  Prob > chi2 =  0.626
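
As a rough cross-check, the instrument count and the degrees of freedom of the Hansen test in the output can be reproduced by hand. The sketch below assumes that each collapsed gmm() lag contributes one instrument column per variable and that the constant is counted as an instrument. The symbols q (number of instruments) and k (number of estimated coefficients) are notation introduced here.

```latex
% Instrument count, using the option lists in the xtabond2 command:
%   transformed equation: 4 variables x 2 lags (collapsed)  = 8
%   levels equation:      4 variables x 1 lag  (collapsed)  = 4
%   ivstyle year dummies (year2, year3, year5-year11)       = 9
%   constant                                                = 1
\[
q = 4 \times 2 + 4 \times 1 + 9 + 1 = 22
\]
% Estimated coefficients: 4 regressors + 9 year dummies + constant.
\[
k = 4 + 9 + 1 = 14, \qquad q - k = 22 - 14 = 8
\]
```

Both figures agree with the reported "Number of instruments = 22" and the chi2(8) of the Sargan and Hansen tests.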

My questions are:

1) Is my equation correctly specified?

2) Setting aside the fact that the interpretation of my results is now reasonable, and considering only the specification tests of the model (AR(1), AR(2), Hansen and Diff-Hansen), is it possible to assume that my estimates are consistent?

3) Given that my dataset has 270 observations, does the non-dynamic System-GMM estimation really use all of them?

Any help would be welcome.

Regards,
