  • Gravity model ppmlhdfe last estimates not found

    Hi all,

    I have been working with this dataset for a long time but still cannot get final results.

    When I run the iterative procedure below, the only output I get is the error "last estimates not found", and I cannot see why.

    Code:
    local s = 3
    local sd_dif_change_pi = 1
    local max_dif_change_pi = 1
    while (`sd_dif_change_pi' > 0.01) | (`max_dif_change_pi' > 0.01) {
        local s_1 = `s' - 1
        local s_2 = `s' - 2
        local s_3 = `s' - 3

        gen trade_`s_1' = tradehat_`s_2' * change_pricei_`s_2' * change_pricej_`s_2' / (change_OMR_FULL_`s_2'*change_IMR_FULL_`s_2')

        drop EXPORTER_FE* IMPORTER_FE*
        quietly tabulate exporter, generate (EXPORTER_FE)
        quietly tabulate importer, generate (IMPORTER_FE)
        capture ppmlhdfe trade_`s_1' EXPORTER_FE* IMPORTER_FE*, offset(ln_tij_CFL) noconst iter(30)
        predict tradehat_`s_1', mu

        bysort exporter: egen Y_`s_1' = total(tradehat_`s_1')
        quietly generate tempE_`s_1' = phi * Y_`s_1' if exporter == importer
        bysort importer: egen E_`s_1' = mean(tempE_`s_1')
        quietly generate tempE_R_`s_1' = E_`s_1' if importer == "ZZZ"
        egen E_R_`s_1' = mean(tempE_R_`s_1')

        forvalues i = 1(1)$N_1 {
            quietly replace EXPORTER_FE`i' = EXPORTER_FE`i' * exp(_b[EXPORTER_FE`i'])
            quietly replace IMPORTER_FE`i' = IMPORTER_FE`i' * exp(_b[IMPORTER_FE`i'])
        }
        quietly replace EXPORTER_FE$N = EXPORTER_FE$N * exp(_b[EXPORTER_FE$N ])
        egen exp_pi_`s_1' = rowtotal(EXPORTER_FE1-EXPORTER_FE$N )
        quietly generate tempvar1 = exp_pi_`s_1' if exporter == importer
        bysort importer: egen exp_pi_j_`s_1' = mean(tempvar1)

        gen change_pricei_`s_1' = ((exp_pi_`s_1' / exp_pi_`s_2') / (E_R_`s_1' / E_R_`s_2'))^(1/(1-sigma))
        gen change_pricej_`s_1' = ((exp_pi_j_`s_1' / exp_pi_j_`s_2') / (E_R_`s_1' / E_R_`s_2'))^(1/(1-sigma))
        gen OMR_FULL_`s_1' = (Y_`s_1' * E_R_`s_1') / exp_pi_`s_1'
        gen change_OMR_FULL_`s_1' = OMR_FULL_`s_1' / OMR_FULL_`s_2'
        egen exp_chi_`s_1' = rowtotal(IMPORTER_FE1-IMPORTER_FE$N )
        gen IMR_FULL_`s_1' = E_`s_1' / (exp_chi_`s_1' * E_R_`s_1')
        gen change_IMR_FULL_`s_1' = IMR_FULL_`s_1' / IMR_FULL_`s_2'

        gen dif_change_pi_`s_1' = change_pricei_`s_2' - change_pricei_`s_3'
        display "************************* iteration number " `s_2' " *************************"
        summarize dif_change_pi_`s_1', format
        display "**********************************************************************"
        display " "
        local sd_dif_change_pi = r(sd)
        local max_dif_change_pi = abs(r(max))

        local s = `s' + 1
        drop temp*
    }
    Thank you very much for your help!

  • #2
    Dear Alice Johns,

    Try replacing capture with qui before the ppmlhdfe command.
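
    A minimal sketch of why this helps, assuming the failing line is the ppmlhdfe call from post #1: capture hides the error message and lets the do-file continue, so predict most likely runs with no stored estimates and reports "last estimates not found". The ppmlhdfe line below is copied unchanged from the original post and its options have not been checked; only the capture noisily prefix and the _rc check are added for illustration.

    ```stata
    * capture noisily still traps the error but displays the message
    capture noisily ppmlhdfe trade_`s_1' EXPORTER_FE* IMPORTER_FE*, offset(ln_tij_CFL) noconst iter(30)

    * _rc holds the return code of the captured command (0 means success)
    if _rc {
        display as error "ppmlhdfe stopped with return code " _rc
        exit _rc
    }

    * Only reached when estimation succeeded, so estimates are available
    predict tradehat_`s_1', mu
    ```

    With qui in place of capture, the loop simply stops at the first error and shows it, which is usually enough to find the cause.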

    Best wishes,

    Joao


    • #3
      Dear Joao Santos Silva,

      My apologies for bothering you. I have encountered the same problem with the code from the Advanced Guide book. I followed your suggestion to replace capture with qui, but then received the error "matrix not positive definite". I thought this might be due to the large number of exporter and importer fixed effects, since I am considering many years. I shortened the analysis period, but then got "convergence not achieved". The code works well until these errors occur within the loop. I then tried ppmlhdfe instead of ppml, as described in this post. Again, everything runs smoothly until the loop, where I now get "insufficient observations" in all cases.

      I would greatly appreciate your assistance.

      Kind regards,

      Omar


      • #4
        Dear Omar Perez,

        First of all, I reiterate my advice to use ppmlhdfe instead of ppml (please check the help file to make sure you use ppmlhdfe correctly and absorb the fixed effects). Then I would suggest running the code with the full data. If this still gives errors, please post the code and the error and we'll try to help.

        Best wishes,

        Joao


        • #5
          Dear Joao Santos Silva,

          Thank you for your response. I rewrote the code, but I still encountered the same error. I am trying to replicate the first application (Trade without Borders) from the second chapter of the book. Therefore, I adapted the code accordingly. I am only considering a four-year interval since including all years could be computationally demanding for testing purposes. The three databases I am using are from CEPII: Gravity as the main database, TradeProd (TPc) for intranational trade, and GeoDist for distance and contiguity. One detail you suggested taking into account was that some databases, as in this case, do not consider zero trade flows. Hence, I am unsure if a given missing value is due to missing data or a zero trade flow.

          One detail I noticed is that ppmlhdfe, unlike ppml, omits the dummy variables for contiguity and international trade (0 if intranational) due to collinearity. This was the first issue I experienced. The other issue arises within the loop with both ppmlhdfe and ppml, so I suspect I made an error when adapting the code, although it runs well up to that point.

          The code is as follows:

          Code:
          use Gravity_V202211.dta, clear
          rename country_id_o exporter
          rename country_id_d importer
          
          keep year exporter importer distw_arithmetic dist gdp_o gdp_d tradeflow_comtrade_o tradeflow_comtrade_d tradeflow_imf_o tradeflow_imf_d iso3num_o iso3num_d
          
          rename tradeflow_comtrade_o trade
          rename dist DIST
          
          keep if year >= 1962 & year <= 2020
          
          // Adding intranational trade //
          * In the other file I did this
          * collapse (sum) trade_sq_yr, by(year iso3_tp_o iso3_tp_d)
          * gen trade_sq_yr_1000 = trade_sq_yr * 1000 (to convert millions into thousands)
          
          merge 1:1 year exporter importer using TPc_V202401.dta
          
          keep if _merge == 1 | _merge == 3
          drop _merge
          
          gen trade_combined = trade_sq_yr_1000
          
          * Generating the variable that merges the data
          replace trade_combined = trade if !missing(trade)
          
          // Generating the dummies//
          
          merge m:1 exporter importer using dist_cepii.dta
          keep if _merge == 1 | _merge == 3
          
          drop comlang_off comlang_ethno colony comcol curcol col45 smctry dist distcap distw distwces _merge
          
          rename contig CNTG
          
          * Creating INTL variable (1 for international trade and 0 otherwise)
          gen INTL = 0
          
          replace INTL = 1 if exporter != importer
          
          // Organizing data //
          
          keep year exporter importer DIST gdp_o gdp_d trade_combined CNTG INTL iso3num_o iso3num_d
          rename trade_combined trade
          
          keep if mod(year - 1980, 4) == 0 & year >= 1980 & year <= 2020
          
          sort exporter year importer
          
          rename gdp_o Y
          rename gdp_d E
          generate ln_DIST = ln(DIST)
          
          // STEP 1: Solve the baseline gravity model //
          
          * Define the country of reference
          
          * Great Britain has more observations, so it is used as the reference country
          
          generate E_gbrBLN = E if importer == "GBR"
          replace exporter = "ZZZ" if exporter == "GBR"
          replace importer = "ZZZ" if importer == "GBR"
          egen E_deu = mean(E_gbrBLN)
          
          * Estimate the gravity model with the PPML estimator
          
          cap egen imp = group(iso3num_d)
          cap egen exp = group(iso3num_o)
          
          ppmlhdfe trade ln_DIST CNTG INTL, a(imp#year exp#year imp#exp, savefe) cluster(imp#exp) d(sum_fe) nolog
          predict tradehat_BLN, mu
          
          rename __hdfe2__ expfe
          rename __hdfe1__ impfe
          
          * Construct the variables for export- and import-fixed effects
          
          gen EXPORTER_FE = exp(expfe)
          gen IMPORTER_FE = exp(impfe)
          
          egen exp_pi_BLN = total(EXPORTER_FE)
          egen exp_chi_BLN = total(IMPORTER_FE)
          
          * Compute the variables of bilateral trade costs and multilateral resistances
          generate tij_BLN = exp(_b[ln_DIST] * ln_DIST + _b[CNTG] * CNTG + _b[INTL] * INTL)
          generate OMR_BLN = Y * E_deu / exp_pi_BLN
          generate IMR_BLN = E / (exp_chi_BLN * E_deu)
          
          * Compute the estimated international trade for given output and expenditures
          generate tempXi_BLN = tradehat_BLN if exporter != importer
          bysort exporter: egen Xi_BLN = sum(tempXi_BLN)
          
          // STEP 2: Define a counterfactual scenario //
          
          * define a new counterfactual border variable
          generate INTL_CFL = 0
          generate tij_CFL = exp(_b[ln_DIST]*ln_DIST + _b[CNTG]*CNTG + _b[INTL]*INTL_CFL)
          
          * Generate the logged trade costs used in the constraint
          generate ln_tij_CFL = log(tij_CFL)
          
          // STEP 3: Solve the counterfactual model //
          
          // Conditional general equilibrium effects
          
          * Re-create a new set of exporter and importer fixed effects
          drop sum_fe expfe impfe EXPORTER_FE IMPORTER_FE
          
          * Estimate the constrained gravity model with the PPML estimator
          ppmlhdfe trade, a(imp#year exp#year imp#exp, savefe) cluster(imp#exp) d(sum_fe) offset(ln_tij_CFL) nolog
          predict tradehat_CD, mu
          
          rename __hdfe2__ expfe
          rename __hdfe1__ impfe
          
          * Construct the variables for export- and import-fixed effects
          gen EXPORTER_FE = exp(expfe)
          gen IMPORTER_FE = exp(impfe)
          
          egen exp_pi_CD = total(EXPORTER_FE)
          egen exp_chi_CD = total(IMPORTER_FE)
          
          * Compute the conditional general equilibrium effects of multilateral resistances
          generate OMR_CD = Y * E_deu / exp_pi_CD
          generate IMR_CD = E / (exp_chi_CD * E_deu)
          
          * Compute the conditional general equilibrium effects of trade
          generate tempXi_CD = tradehat_CD if exporter != importer
          bysort exporter: egen Xi_CD = sum(tempXi_CD)
          
          // Full endowment general equilibrium effects
          
          scalar phi = 1
          scalar sigma = 5
          
          * Initiate the first iteration (this is something I had to add)
          generate change_tij = tij_CFL / tij_BLN
          generate trade_1 = change_tij * tradehat_BLN * (exp_pi_BLN / exp_pi_CD) * (exp_chi_BLN / exp_chi_CD) / (OMR_CD * IMR_CD)
          
          drop sum_fe expfe impfe EXPORTER_FE IMPORTER_FE
          
          ppmlhdfe trade_1, a(imp#year exp#year imp#exp, savefe) cluster(imp#exp) d(sum_fe) offset(ln_tij_CFL) nolog
          predict tradehat_1, mu
          
          bysort exporter: egen Y_1 = total(tradehat_1)
          generate tempE_1 = phi * Y_1 if exporter == importer
          bysort importer: egen E_1 = mean(tempE_1)
          generate tempE_deu_1 = E_1 if importer == "ZZZ"
          egen double E_deu_1 = mean(tempE_deu_1)
          
          rename __hdfe2__ expfe
          rename __hdfe1__ impfe
          
          gen EXPORTER_FE = exp(expfe)
          gen IMPORTER_FE = exp(impfe)
          
          egen exp_pi_1 = total(EXPORTER_FE)
          egen exp_chi_1 = total(IMPORTER_FE)
          
          generate tempvar1_BLN = exp_pi_BLN if exporter == importer
          bysort importer: egen exp_pi_j_BLN = mean(tempvar1_BLN)
          
          generate tempvar1 = exp_pi_1 if exporter == importer
          bysort importer: egen exp_pi_j_1 = mean(tempvar1)
          
          generate tempvar1_CD = exp_pi_CD if exporter == importer
          bysort importer: egen exp_pi_j_CD = mean(tempvar1_CD)
          
          * Generate change_pricei_BLN and change_pricei_CD for comparison
          generate change_pricei_BLN = (exp_pi_BLN / exp_pi_BLN)^(1/(1-sigma))
          generate change_pricei_CD = (exp_pi_CD / exp_pi_BLN)^(1/(1-sigma))
          
          * Generate change_pricej_BLN and change_pricej_CD for comparison
          generate change_pricej_BLN = (exp_pi_j_BLN / exp_pi_j_BLN)^(1/(1-sigma))
          generate change_pricej_CD = (exp_pi_j_CD / exp_pi_j_BLN)^(1/(1-sigma))
          
          generate change_pricei_1 = ((exp_pi_1 / exp_pi_BLN) / (E_deu_1 / E_deu))^(1/(1-sigma))
          generate change_pricej_1 = ((exp_pi_j_1 / exp_pi_j_BLN) / (E_deu_1 / E_deu))^(1/(1-sigma))
          generate OMR_FULL_1 = (Y_1 * E_deu_1) / exp_pi_1
          generate change_OMR_FULL_1 = OMR_FULL_1 / OMR_CD
          generate IMR_FULL_1 = E_1 / (exp_chi_1 * E_deu_1)
          generate change_IMR_FULL_1 = IMR_FULL_1 / IMR_CD
          generate dif_change_p_1 = change_pricei_BLN - change_pricei_CD
          
          * This is also something I had to add because of the line in the loop: generate dif_change_p_`s_1' = change_pricei_`s_2' - change_pricei_`s_3'
          generate change_pricei_0 = 1
          
          * Set the criteria of convergence
          local s = 3
          local sd_dif_change_p = 1
          local max_dif_change_p = 1
          while (`sd_dif_change_p' > 0.001) | (`max_dif_change_p' > 0.001) {
              local s_1 = `s' - 1
              local s_2 = `s' - 2
              local s_3 = `s' - 3

              * i. Create the new dependent variable and estimate the gravity model with PPML
              generate trade_`s_1' = change_tij * tradehat_`s_2' * change_pricei_`s_2' *change_pricej_`s_2' / (change_OMR_FULL_`s_2' * change_IMR_FULL_`s_2')

              drop sum_fe expfe impfe EXPORTER_FE IMPORTER_FE
              qui ppmlhdfe trade_`s_1', a(imp#year exp#year imp#exp, savefe) cluster(imp#exp) d(sum_fe) offset(ln_tij_CFL) nolog
              predict tradehat_`s_1', mu

              * ii. Update output and expenditures
              bysort exporter: egen Y_`s_1' = total(tradehat_`s_1')
              generate tempE_`s_1' = phi * Y_`s_1' if exporter == importer
              bysort importer: egen E_`s_1' = mean(tempE_`s_1')
              generate tempE_deu_`s_1' = E_`s_1' if importer == "ZZZ"
              egen double E_deu_`s_1' = mean(tempE_deu_`s_1')

              * iii. Update factory-gate prices and multilateral resistances (I did modifications here replacing tempvar1 with tempvar_`s_1')
              rename __hdfe2__ expfe
              rename __hdfe1__ impfe

              gen EXPORTER_FE = exp(expfe)
              gen IMPORTER_FE = exp(impfe)

              egen exp_pi_`s_1' = total(EXPORTER_FE)
              egen exp_chi_`s_1' = total(IMPORTER_FE)

              generate tempvar_`s_1' = exp_pi_`s_1' if exporter == importer
              bysort importer: egen exp_pi_j_`s_1' = mean(tempvar_`s_1')
              generate change_pricei_`s_1' = ((exp_pi_`s_1' / exp_pi_`s_2') / (E_deu_`s_1' / E_deu_`s_2'))^(1/(1-sigma))
              generate change_pricej_`s_1' = ((exp_pi_j_`s_1' / exp_pi_j_`s_2') / (E_deu_`s_1' / E_deu_`s_2'))^(1/(1-sigma))
              generate OMR_FULL_`s_1' = (Y_`s_1' * E_deu_`s_1') / exp_pi_`s_1'
              generate change_OMR_FULL_`s_1' = OMR_FULL_`s_1' / OMR_FULL_`s_2'
              generate IMR_FULL_`s_1' = E_`s_1' / (exp_chi_`s_1' * E_deu_`s_1')
              generate change_IMR_FULL_`s_1' = IMR_FULL_`s_1' / IMR_FULL_`s_2'

              * iv. Iterate until the change in factory-gate prices has converged to zero
              generate dif_change_p_`s_1' = change_pricei_`s_2' - change_pricei_`s_3'
              summarize dif_change_p_`s_1'
              local sd_dif_change_p = r(sd)
              local max_dif_change_p = abs(r(max))
              local s = `s' + 1
              drop temp*
          }
          The error message I then got was:

          Code:
              Variable |        Obs        Mean    Std. dev.       Min        Max
          -------------+---------------------------------------------------------
          dif_change~7 |    698,544    7.571872           0   7.571872   7.571872
          (698,544 missing values generated)
          insufficient observations
          r(2001);
          
          end of do-file
          With ppml, on the other hand, the code also worked well until it reached the loop. I therefore suspect there is an issue in my adaptation that I have not noticed.

          I would greatly appreciate any help or guidance you could provide. I look forward to your response.

          Kind regards,

          Omar


          • #6
            Dear Omar Perez,

            From these results, it looks as if the loop runs up to iteration 7 and then stops because you generate a variable where all observations are missing. For example, in step i. you may be dividing by zero. So, what I suggest is that you add commands summarizing the variables used in that step and see which one causes the problem. If the problem is not there, the next step is to repeat the process at other stages of the loop until you find it.
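
            One way to follow this suggestion, sketched with the variable names from step i. of the loop in post #5: the lines below would go inside the loop, just before the generate trade_`s_1' command. The particular checks (zeros and missing values in the two denominator terms) are an assumption about where the problem may lie, not a diagnosis.

            ```stata
            * Summarize every input of step i. for the current iteration
            summarize change_tij tradehat_`s_2' change_pricei_`s_2' change_pricej_`s_2' ///
                change_OMR_FULL_`s_2' change_IMR_FULL_`s_2'

            * Division by zero yields missing values in Stata, so count zeros in the denominator
            count if change_OMR_FULL_`s_2' == 0 | change_IMR_FULL_`s_2' == 0

            * Missing inputs propagate into trade_`s_1', so count those as well
            count if missing(change_OMR_FULL_`s_2', change_IMR_FULL_`s_2')
            ```

            If the counts are zero and the summaries look reasonable, the same pattern can be moved to steps ii. and iii. with the corresponding variables.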

            Best wishes,

            Joao


            • #7
              Dear Joao Santos Silva,

              Thank you for your advice. Indeed, the values had become too large for Stata to store, so the calculations returned missing values. I have solved that error by declaring every variable generated within the loop as double. Nevertheless, another unexpected error occurred.
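
              A small stand-alone illustration of the storage issue, meant to be run from a do-file on an empty dataset. The variable names are hypothetical, and it assumes the default storage type has not been changed with set type. Under that assumption y_float should come out as missing, because the result exceeds the largest value a float can hold (roughly 1.7e38), while y_double should keep it.

              ```stata
              clear
              set obs 1

              * Without a type, generate uses the default storage type (float)
              generate x_float = 1e30
              generate double x_double = 1e30

              * About 1e40: outside the float range, inside the double range
              generate y_float = x_float * 1e10
              generate double y_double = x_double * 1e10

              list
              ```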

              Here are the results:

              Code:
                  Variable |        Obs        Mean    Std. dev.       Min        Max
              -------------+---------------------------------------------------------
              dif_chang~11 |    698,544   -1.35e+15           0  -1.35e+15  -1.35e+15
              (475,587 missing values generated)
              warning: dependent variable takes very low values after standardizing (4.4306e-97)
              Converged in 26 iterations and 56 HDFE sub-iterations (tol = 1.0e-08)
              
              HDFE PPML regression                              No. of obs      =    222,957
              Absorbing 3 HDFE groups                           Residual df     =     24,981
              Statistics robust to heteroskedasticity           Wald chi2(0)    =          .
              Deviance             =  9.90128e-12               Prob > chi2     =          .
              Log pseudolikelihood = -4.95064e-12               Pseudo R2       = -5.945e+68
              
              Number of clusters (imp#exp)=     24,982
                                         (Std. err. adjusted for 24,982 clusters in imp#exp)
              ------------------------------------------------------------------------------
                           |               Robust
                  trade_12 | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
              -------------+----------------------------------------------------------------
                     _cons |  -28.98881          .        .       .            .           .
                ln_tij_CFL |          1  (offset)
              ------------------------------------------------------------------------------
              
              Absorbed degrees of freedom:
              -----------------------------------------------------+
               Absorbed FE | Categories  - Redundant  = Num. Coefs |
              -------------+---------------------------------------|
                  imp#year |      1787           1        1786     |
                  exp#year |      1689          11        1678     |
                   imp#exp |     24982       24982           0    *|
              -----------------------------------------------------+
              * = FE nested within cluster; treated as redundant for DoF computation
              (240,171 missing values generated)
              (695,772 missing values generated)
              (695,772 missing values generated)
              (475,587 missing values generated)
              (475,587 missing values generated)
              (695,772 missing values generated)
              (205,128 missing values generated)
              (205,128 missing values generated)
              (0 real changes made)
              
                  Variable |        Obs        Mean    Std. dev.       Min        Max
              -------------+---------------------------------------------------------
              dif_chang~12 |    698,544    1.94e+34           0   1.94e+34   1.94e+34
              (475,587 missing values generated)
                       _assert_abort():  3498  stdevs are missing; is N==1?
                          assert_msg():     -  function returned error
                 reghdfe_standardize():     -  function returned error
                 GLM::init_variables():     -  function returned error
                               <istmt>:     -  function returned error
              r(3498);
              I would greatly appreciate any suggestions you could provide.

              Kind regards,

              Omar


              • #8
                Dear Joao Santos Silva,

                Fortunately, I managed to solve that problem. Thank you for your help regardless. I have another question that may not be directly related to this matter. If I want to analyze the effect of a Free Trade Agreement between only two countries, should I consider trade between these two countries and the rest of the world, or should I focus exclusively on the trade between the two countries? I just want to make sure I understand the correct approach.

                I hope you can provide an answer. Thank you in advance.

                Kind regards,

                Omar


                • #9
                  Dear Omar,

                  That is up to you; it depends on what you want to do :-)

                  Best wishes,

                  Joao
