Dear Joao Santos Silva, Tom Zylkin and Jeff Wooldridge,
I am studying the effects of technical non-tariff measures on Peruvian exporters in 2016, using a gravity model estimated by PPML. This is my main command:
Code:
ppmlhdfe exports lndistancia frontera lenguaje lnTCRB mediterraneo ALC ln_arancel indice_de_prevalencia_TOT indice_de_prevalencia_TOT_PTA, a(importadores seccionhs) vce(cluster distancia)
Where:
exports is my dependent variable, the value of Peruvian exports in millions at the HS 4-digit level (60% of the observations are zero).
lndistancia, frontera, lenguaje, lnTCRB, mediterraneo and ALC are standard gravity-model controls (i.e. geographical, cultural, economic and institutional variables).
ln_arancel is the log of the MFN tariff.
indice_de_prevalencia_TOT and indice_de_prevalencia_TOT_PTA are my main variables of interest. They capture 1) the number of technical non-tariff measures at the HS 4-digit level and 2) the inclusion of provisions on technical non-tariff measures in the free trade agreements that Peru had signed up to 2016.
a(importadores seccionhs) absorbs fixed effects for each of the 50 importers and the 98 HS 2-digit categories in the sample, and vce(cluster distancia) clusters the standard errors by bilateral distance.
After the estimation, my thesis advisor asked about possible reverse causality (a potential endogeneity issue) between my main independent variable (indice_de_prevalencia_TOT) and my dependent variable (exports). That is, some of Peru's trading partners may have imposed more non-tariff measures on the products with the highest Peruvian export values for protectionist reasons. So I looked for endogeneity tests that could be applied, such as the Durbin-Wu-Hausman (DWH) test, which consists of three steps:
1) Run an OLS regression of the potentially endogenous variable on the instrument (the non-tariff measures that Peru's partners apply to their Latin American neighbors) and some additional independent variables:
Code:
reg indice_de_prevalencia_TOT indice_de_prevalencia_otros ln_arancel indice_de_prevalencia_TOT_PTA indice_de_prevalencia_otros_PTA imp_dum_* seccionhs2_dum_*, vce(cluster distancia)
2) Predict the residuals from the first regression:
Code:
predict respre, residuals
3) Include the residuals as an additional regressor in the original equation and run OLS:
Code:
reg ln_exportaciones respre ln_arancel indice_de_prevalencia_TOT indice_de_prevalencia_TOT_PTA imp_dum_* seccionhs2_dum_*, vce(cluster distancia)
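In equation form, the three steps can be sketched as follows. The notation is introduced here only for illustration and does not come from the code: $j$ indexes importers, $k$ HS 4-digit products, $T_{jk}$ is indice_de_prevalencia_TOT, $Z_{jk}$ collects the excluded instruments, $X_{jk}$ the remaining controls, and $\alpha_j$ and $\gamma_{s(k)}$ are the importer and HS 2-digit fixed effects. Under these assumptions, exogeneity of $T_{jk}$ corresponds to a zero coefficient on the first-stage residual in the augmented regression.

```latex
\begin{aligned}
\text{Step 1:}\quad & T_{jk} = \pi_0 + Z_{jk}'\pi_1 + X_{jk}'\pi_2 + \alpha_j + \gamma_{s(k)} + v_{jk} \\
\text{Step 2:}\quad & \hat{v}_{jk} = T_{jk} - \hat{T}_{jk} \\
\text{Step 3:}\quad & \ln(\mathit{exports}_{jk}) = \beta_0 + \beta_1 T_{jk} + X_{jk}'\beta_2 + \rho\,\hat{v}_{jk} + \alpha_j + \gamma_{s(k)} + u_{jk} \\
\text{Test:}\quad & H_0\colon \rho = 0 \quad \text{(} T_{jk} \text{ exogenous)} \qquad \text{against} \qquad H_1\colon \rho \neq 0
\end{aligned}
```

In this form the test is the cluster-robust t-test on the coefficient of respre in the step 3 regression.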
However, I have some questions about this procedure:
1) Is it correct to include fixed effects and clustering in the DWH test?
2) Since my original dependent variable is exports in levels rather than in logs (as is common in the recent gravity literature), can I replace OLS with the PPML estimator? Is it possible to make this change in both stages?
3) If the DWH test cannot be used with PPML at any stage, what are the possible consequences of running the DWH test with OLS (i.e. with my dependent variable in logs)? Can I conclude that my main independent variable (indice_de_prevalencia_TOT) is exogenous regardless of the estimator?
Regards,
Juan
