  • Durbin-Wu-Hausman test in gravity model

    Dear Joao Santos Silva, Tom Zylkin and Jeff Wooldridge,

    I am studying the effects of technical non-tariff measures on Peruvian exporters in 2016. For this purpose, I am using the gravity model with the PPML estimator. This is my main command:

    Code:
    ppmlhdfe exports lndistancia frontera lenguaje lnTCRB mediterraneo ALC ln_arancel indice_de_prevalencia_TOT indice_de_prevalencia_TOT_PTA, a(importadores seccionhs) vce(cluster distancia)
    Where:
    exports is my dependent variable: the value of Peruvian exports, in millions, at the HS 4-digit level (60% of the observations are zero).
    lndistancia, frontera, lenguaje, lnTCRB, mediterraneo, ALC are common variables that are included in gravity models (i.e. geographical, cultural, economic and institutional variables)
    ln_arancel represents the log of MFN tariff
    indice_de_prevalencia_TOT and indice_de_prevalencia_TOT_PTA are my main variables of interest. They capture 1) the number of technical non-tariff measures at the HS 4-digit level and 2) the inclusion of provisions on technical non-tariff measures in the Free Trade Agreements signed by Peru up to 2016.
    a(importadores seccionhs) absorbs fixed effects for each of the 50 importers and the 98 HS 2-digit categories in the sample, and vce(cluster distancia) clusters the standard errors by bilateral distance.

    After the estimation, my thesis advisor asked me about possible reverse causality (a potential endogeneity issue) between my main independent variable (indice_de_prevalencia_TOT) and my dependent variable (exports). The concern is that some of Peru's trading partners may have imposed more non-tariff measures on the products with the highest Peruvian export values, for protectionist purposes. So I looked for endogeneity tests that could be applied, such as the Durbin-Wu-Hausman (DWH) test, which consists of three steps:

    1) run an OLS regression of the potentially endogenous variable on the instrument (the non-tariff measures that Peru's partners apply to its Latin American neighbors) and some additional independent variables:

    Code:
    reg indice_de_prevalencia_TOT indice_de_prevalencia_otros ln_arancel indice_de_prevalencia_TOT_PTA indice_de_prevalencia_otros_PTA imp_dum_* seccionhs2_dum_*, vce(cluster distancia)
    2) predict the residuals of the first estimation:

    Code:
    predict respre, residuals
    3) include the residuals as an extra term in the original equation and run OLS:

    Code:
    reg ln_exportaciones respre ln_arancel indice_de_prevalencia_TOT indice_de_prevalencia_TOT_PTA imp_dum_* seccionhs2_dum_*, vce(cluster distancia)
    However, I have some questions about this procedure:

    1) Is it correct to include fixed effects and clustering in the DWH test?
    2) Since my original dependent variable is exports in levels and not in logs (as is common in the new gravity literature), can I replace OLS with the PPML estimator? Can this be done in both stages?
    3) If it is not possible to use the DWH test with PPML at any stage, what are the possible consequences of running the DWH test with OLS (i.e. with my dependent variable in logs)? Can I conclude that my main independent variable (indice_de_prevalencia_TOT) is exogenous regardless of the estimator?

    Regards,
    Juan


  • #2
    Dear Juan Quicana,

    Others will know more about this than I do, but the problem with the DWH test is that its results depend on the quality of the instrument, and it is very difficult to find credible instruments in this context (the one you propose is likely to be invalid). The standard approach to addressing this problem is to also include pair fixed effects, following

    Baier, S.L., and Bergstrand, J.H. (2007). "Do Free Trade Agreements Actually Increase Members' International Trade?," Journal of International Economics 71, 72-95.

    However, you should keep in mind that this is certainly not a perfect solution.

    Best wishes,

    Joao



    • #3
      Dear Juan,

      Regarding the Hausman test, I think you should check out this post by Jeff: https://www.statalist.org/forums/for...-fixed-effects.

      Note that his approach is similar to yours, except that PPML is used in step 3. Yes, use the same fixed effects in step 1. More simply, perform your original estimation, adding only one extra covariate: the fitted residual from step 1. Keep everything else the same.

      Regards,
      Tom
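
The procedure described in this post can be sketched in Stata roughly as follows. This is only an illustration on simulated data, not the thread authors' code: the variable names (importer, section, z1, v, x1, y, res1) and the data-generating process are assumptions made for the example, and it assumes the user-written ppmlhdfe package is installed.

```stata
* Illustrative sketch on simulated data; all names below are hypothetical.
* Assumes ppmlhdfe is installed (ssc install ppmlhdfe; it also needs reghdfe and ftools).
clear
set seed 12345
set obs 5000

gen importer = ceil(50 * runiform())    // 50 hypothetical importers
gen section  = ceil(20 * runiform())    // 20 hypothetical product groups
gen z1 = rnormal()                      // hypothetical instrument
gen v  = rnormal()                      // first-stage error
gen x1 = 0.8 * z1 + v                   // potentially endogenous regressor
* v also shifts the outcome, so x1 is endogenous by construction
gen y  = rpoisson(exp(0.5 + 0.3 * x1 + 0.5 * v))

* Step 1: OLS first stage with the same fixed effects as the main equation
regress x1 z1 i.importer i.section, vce(cluster importer)
predict double res1, residuals

* Steps 2-3: the original PPML specification plus the first-stage residual
ppmlhdfe y x1 res1, absorb(importer section) vce(cluster importer)

* H0: x1 is exogenous, i.e. the coefficient on res1 is zero
test res1
display "p-value for H0 of exogeneity: " r(p)
```

In this simulated setup the null of exogeneity should be rejected, because v enters both x1 and y. With real data, the result is only as credible as the instrument.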



      • #4
        Hi Tom Zylkin and Jeff Wooldridge,

        I am reviving this thread because I have some doubts about the correct application of the Hausman test. I reviewed the link you posted in #3, and it resolved many of my earlier doubts. However, I am not sure whether I am applying the test correctly, since my model has an interaction term (the potentially endogenous variable is interacted with another independent variable). How should I deal with this?

        My basic model is as follows:

        Code:
        ppmlhdfe Y X1 X1#X2 controls, a(importers HSsection) vce(cluster distance)
        Where Y is a continuous variable, X1 is continuous and potentially endogenous, and X2 is a dummy. My instrument for X1 is Z1, so I use Z1#X2 as the instrument for the interaction term X1#X2.

        I tried the following code:

        Code:
        reg X1 Z1 controls i.section i.importer, cluster(distance)
        predict res1, residuals
        
        reg X1#X2 Z1#X2 controls i.section i.importer, cluster(distance)
        predict res2, residuals
        
        ppmlhdfe Y X1 X1#X2 res1 res2 controls, a(section importer) vce(cluster distance)
        After this, is it correct to use the p-values of res1 and res2 to decide whether to reject the H0 of exogeneity?

        Thanks in advance!



        • #5
          Juan: You should not estimate a control function for nonlinear functions of X1. You should only obtain res1, and that accounts for endogeneity of X1 everywhere. Of course, it relies on the functional form being correct. You might try interacting res1 with X2 and maybe even other control variables. (I assume X2 is endogenous here.) The important thing is res2 doesn't belong in the ppmlhdfe estimation.
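
A minimal sketch of this suggestion, on simulated data, might look like the following. Everything here is an assumption made for illustration: the variable names, the data-generating process, the choice to treat the dummy x2 as exogenous, and the use of a joint Wald test on res1 and its interaction with x2 as the exogeneity check. The interactions are built by hand. With factor-variable notation, a continuous variable would need the c. prefix (for example c.x1#i.x2). It also assumes the user-written ppmlhdfe package is installed.

```stata
* Illustrative sketch on simulated data; all names below are hypothetical.
* Assumes ppmlhdfe is installed (ssc install ppmlhdfe; it also needs reghdfe and ftools).
clear
set seed 2024
set obs 5000

gen importer = ceil(50 * runiform())    // 50 hypothetical importers
gen section  = ceil(20 * runiform())    // 20 hypothetical product groups
gen x2 = runiform() < 0.5               // dummy, treated as exogenous here
gen z1 = rnormal()                      // hypothetical instrument for x1
gen v  = rnormal()                      // first-stage error
gen x1 = 0.8 * z1 + v                   // potentially endogenous regressor
gen x1_x2 = x1 * x2                     // interaction built by hand
* v also shifts the outcome, so x1 is endogenous by construction
gen y = rpoisson(exp(0.5 + 0.3 * x1 + 0.2 * x1_x2 + 0.1 * x2 + 0.5 * v))

* One first stage only, for x1 itself (none for the interaction term)
regress x1 z1 x2 i.importer i.section, vce(cluster importer)
predict double res1, residuals
gen double res1_x2 = res1 * x2          // control function interacted with x2

* Main PPML equation with res1 and res1*x2 as extra covariates
ppmlhdfe y x1 x1_x2 x2 res1 res1_x2, absorb(importer section) vce(cluster importer)

* Joint H0: both control-function terms are zero (x1 exogenous)
test res1 res1_x2
display "joint p-value: " r(p)
```

In this simulated setup the joint test should reject, since v enters both x1 and y by construction.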



          • #6
            Hi Jeff Wooldridge,

            Thanks for your response. So I should use the following, right?

            Code:
             
             reg X1 Z1 controls i.section i.importer, cluster(distance) predict res1, residuals  ppmlhdfe Y X1 X1#X2 res1 res1#X2 controls, a(section importer) vce(cluster distance)
            But I am not sure: is there any problem if X2 is an exogenous variable (and also a dummy)? And what are the benefits of interacting 'res1' with other controls?

            Thanks in advance!



            • #7
              Sorry, I made a mistake in the code. This is the correct version:

              Code:
              reg X1 Z1 controls i.section i.importer, cluster (distance)
              
              predict res1, residuals
              
              ppmlhdfe Y X1 X1#X2 res1 res1#X2 controls, a(section importer) vce(cluster distance)
