  • Help with gravity equation estimated using PPML (Stata 14.2)

    Good evening, everyone!

    The following is a snippet of my dataset (I have data on Brazil's export and import flows with 122 countries, from 2009 to 2014):

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str3(ID_EXP ID_IMP) int(ID_NUM ANO) double(EXP_NRB DIST SCO_EXP SCO_IMP) byte(FRO LIN COL)
    "BRA" "AGO"  1 2009  152903849 6391.615 .6991869918699187   .8211382113821139 0 1 0
    "BRA" "AGO"  1 2010   60699279 6391.615 .4796747967479675   .7317073170731707 0 1 0
    "BRA" "AGO"  1 2011   61883975 6391.615 .5934959349593496   .5528455284552846 0 1 0
    "BRA" "AGO"  1 2012   64203369 6391.615 .3821138211382114   .5528455284552846 0 1 0
    "BRA" "AGO"  1 2013   75264517 6391.615 .3821138211382114   .8373983739837398 0 1 0
    "BRA" "AGO"  1 2014   85859091 6391.615 .2845528455284553   .4959349593495935 0 1 0
    "BRA" "ALB"  2 2009      40668 9313.151 .6991869918699187   .8943089430894309 0 0 0
    "BRA" "ALB"  2 2010      64360 9313.151 .4796747967479675    .959349593495935 0 0 0
    "BRA" "ALB"  2 2011       1467 9313.151 .5934959349593496   .5447154471544715 0 0 0
    "BRA" "ALB"  2 2012        451 9313.151 .3821138211382114                   1 0 0 0
    "BRA" "ALB"  2 2013         54 9313.151 .3821138211382114   .3902439024390244 0 0 0
    "BRA" "ALB"  2 2014       1387 9313.151 .2845528455284553   .7154471544715447 0 0 0
    "BRA" "ARE"  3 2009   83546921 11809.98 .6991869918699187   .4065040650406504 0 0 0
    "BRA" "ARE"  3 2010   23600914 11809.98 .4796747967479675   .3170731707317073 0 0 0
    "BRA" "ARE"  3 2011   68742508 11809.98 .5934959349593496  .42276422764227645 0 0 0
    "BRA" "ARE"  3 2012   45312701 11809.98 .3821138211382114  .13821138211382114 0 0 0
    "BRA" "ARE"  3 2013   58465966 11809.98 .3821138211382114  .45528455284552843 0 0 0
    "BRA" "ARE"  3 2014  176933670 11809.98 .2845528455284553   .5691056910569106 0 0 0
    "BRA" "ARG"  4 2009 1199662929 2089.281 .6991869918699187   .5691056910569106 1 0 0
    "BRA" "ARG"  4 2010 1575475673 2089.281 .4796747967479675   .8292682926829268 1 0 0
    "BRA" "ARG"  4 2011 1850370799 2089.281 .5934959349593496   .6747967479674797 1 0 0
    "BRA" "ARG"  4 2012 1641368735 2089.281 .3821138211382114  .35772357723577236 1 0 0
    "BRA" "ARG"  4 2013 1328340316 2089.281 .3821138211382114   .3170731707317073 1 0 0
    "BRA" "ARG"  4 2014 1133068766 2089.281 .2845528455284553  .17886178861788618 1 0 0
    "BRA" "ARM"  5 2009         49  11214.9 .6991869918699187   .3252032520325203 0 0 0
    "BRA" "ARM"  5 2010      35856  11214.9 .4796747967479675   .7398373983739838 0 0 0
    "BRA" "ARM"  5 2011         24  11214.9 .5934959349593496  .12195121951219512 0 0 0
    "BRA" "ARM"  5 2012       6590  11214.9 .3821138211382114   .0975609756097561 0 0 0
    "BRA" "ARM"  5 2013         49  11214.9 .3821138211382114   .6016260162601627 0 0 0
    "BRA" "ARM"  5 2014        244  11214.9 .2845528455284553   .3252032520325203 0 0 0
    "BRA" "AUS"  6 2009   22302246 13983.36 .6991869918699187   .3008130081300813 0 0 0
    "BRA" "AUS"  6 2010   20209779 13983.36 .4796747967479675   .4959349593495935 0 0 0
    "BRA" "AUS"  6 2011   22328151 13983.36 .5934959349593496   .3170731707317073 0 0 0
    "BRA" "AUS"  6 2012   23223800 13983.36 .3821138211382114   .6991869918699187 0 0 0
    "BRA" "AUS"  6 2013   17347535 13983.36 .3821138211382114   .4878048780487805 0 0 0
    "BRA" "AUS"  6 2014   30864179 13983.36 .2845528455284553   .5528455284552846 0 0 0
    "BRA" "AUT"  7 2009    5313013 9395.406 .6991869918699187   .7235772357723578 0 0 0
    "BRA" "AUT"  7 2010    6330983 9395.406 .4796747967479675  .35772357723577236 0 0 0
    "BRA" "AUT"  7 2011    5698958 9395.406 .5934959349593496   .8455284552845529 0 0 0
    "BRA" "AUT"  7 2012    1664144 9395.406 .3821138211382114   .7073170731707317 0 0 0
    "BRA" "AUT"  7 2013    1395217 9395.406 .3821138211382114   .4634146341463415 0 0 0
    "BRA" "AUT"  7 2014   16199355 9395.406 .2845528455284553   .8699186991869918 0 0 0
    "BRA" "AZE"  8 2009      63514 11569.65 .6991869918699187   .7804878048780488 0 0 0
    "BRA" "AZE"  8 2010          0 11569.65 .4796747967479675    .967479674796748 0 0 0
    "BRA" "AZE"  8 2011        602 11569.65 .5934959349593496  .34959349593495936 0 0 0
    "BRA" "AZE"  8 2012       2265 11569.65 .3821138211382114   .3008130081300813 0 0 0
    "BRA" "AZE"  8 2013     219612 11569.65 .3821138211382114   .7804878048780488 0 0 0
    "BRA" "AZE"  8 2014      51436 11569.65 .2845528455284553   .4146341463414634 0 0 0
    "BRA" "BEL"  9 2009  117516351 8977.309 .6991869918699187   .3983739837398374 0 0 0
    "BRA" "BEL"  9 2010  111740378 8977.309 .4796747967479675  .14634146341463414 0 0 0
    "BRA" "BEL"  9 2011  188609683 8977.309 .5934959349593496   .7317073170731707 0 0 0
    "BRA" "BEL"  9 2012  164059354 8977.309 .3821138211382114   .7479674796747967 0 0 0
    "BRA" "BEL"  9 2013  114426142 8977.309 .3821138211382114  .15447154471544716 0 0 0
    "BRA" "BEL"  9 2014  125515938 8977.309 .2845528455284553   .7560975609756098 0 0 0
    "BRA" "BEN" 10 2009   16094862 5893.825 .6991869918699187   .0975609756097561 0 0 0
    "BRA" "BEN" 10 2010    4151754 5893.825 .4796747967479675  .07317073170731707 0 0 0
    "BRA" "BEN" 10 2011    9704038 5893.825 .5934959349593496   .0975609756097561 0 0 0
    "BRA" "BEN" 10 2012    1780809 5893.825 .3821138211382114  .14634146341463414 0 0 0
    "BRA" "BEN" 10 2013    4600869 5893.825 .3821138211382114   .2032520325203252 0 0 0
    "BRA" "BEN" 10 2014     280249 5893.825 .2845528455284553  .13008130081300814 0 0 0
    "BRA" "BGD" 11 2009    6610295 15292.22 .6991869918699187   .8617886178861789 0 0 0
    "BRA" "BGD" 11 2010    3745866 15292.22 .4796747967479675   .5853658536585366 0 0 0
    "BRA" "BGD" 11 2011    2797106 15292.22 .5934959349593496   .7398373983739838 0 0 0
    "BRA" "BGD" 11 2012     423424 15292.22 .3821138211382114   .8780487804878049 0 0 0
    "BRA" "BGD" 11 2013     887481 15292.22 .3821138211382114   .7967479674796748 0 0 0
    "BRA" "BGD" 11 2014    2398781 15292.22 .2845528455284553   .7642276422764228 0 0 0
    "BRA" "BGR" 12 2009     618688 9768.308 .6991869918699187  .25203252032520324 0 0 0
    "BRA" "BGR" 12 2010     673776 9768.308 .4796747967479675  .25203252032520324 0 0 0
    "BRA" "BGR" 12 2011     192542 9768.308 .5934959349593496  .06504065040650407 0 0 0
    "BRA" "BGR" 12 2012     270882 9768.308 .3821138211382114   .5609756097560976 0 0 0
    "BRA" "BGR" 12 2013     206898 9768.308 .3821138211382114   .6422764227642277 0 0 0
    "BRA" "BGR" 12 2014     255623 9768.308 .2845528455284553  .08130081300813008 0 0 0
    "BRA" "BHR" 13 2009     449527 11384.75 .6991869918699187  .06504065040650407 0 0 0
    "BRA" "BHR" 13 2010     406267 11384.75 .4796747967479675  .21138211382113822 0 0 0
    "BRA" "BHR" 13 2011     856167 11384.75 .5934959349593496  .17886178861788618 0 0 0
    "BRA" "BHR" 13 2012   10449143 11384.75 .3821138211382114   .3983739837398374 0 0 0
    "BRA" "BHR" 13 2013     209478 11384.75 .3821138211382114 .008130081300813009 0 0 0
    "BRA" "BHR" 13 2014     194060 11384.75 .2845528455284553  .12195121951219512 0 0 0
    "BRA" "BIH" 14 2009     323008 9351.396 .6991869918699187   .2032520325203252 0 0 0
    "BRA" "BIH" 14 2010     559483 9351.396 .4796747967479675  .11382113821138211 0 0 0
    "BRA" "BIH" 14 2011     416766 9351.396 .5934959349593496  .04065040650406504 0 0 0
    "BRA" "BIH" 14 2012        929 9351.396 .3821138211382114   .5772357723577236 0 0 0
    "BRA" "BIH" 14 2013      58609 9351.396 .3821138211382114   .5121951219512195 0 0 0
    "BRA" "BIH" 14 2014      98829 9351.396 .2845528455284553 .008130081300813009 0 0 0
    "BRA" "BLR" 15 2009     196491 10486.81 .6991869918699187  .18699186991869918 0 0 0
    "BRA" "BLR" 15 2010     200535 10486.81 .4796747967479675   .5121951219512195 0 0 0
    "BRA" "BLR" 15 2011     341227 10486.81 .5934959349593496  .10569105691056911 0 0 0
    "BRA" "BLR" 15 2012     477479 10486.81 .3821138211382114  .04065040650406504 0 0 0
    "BRA" "BLR" 15 2013     391021 10486.81 .3821138211382114   .6504065040650406 0 0 0
    "BRA" "BLR" 15 2014     171054 10486.81 .2845528455284553  .14634146341463414 0 0 0
    "BRA" "BOL" 16 2009  212279880 2266.878 .6991869918699187   .4959349593495935 1 0 0
    "BRA" "BOL" 16 2010  275644036 2266.878 .4796747967479675   .2032520325203252 1 0 0
    "BRA" "BOL" 16 2011  318939530 2266.878 .5934959349593496   .2926829268292683 1 0 0
    "BRA" "BOL" 16 2012  319023235 2266.878 .3821138211382114  .06504065040650407 1 0 0
    "BRA" "BOL" 16 2013  347332761 2266.878 .3821138211382114   .6747967479674797 1 0 0
    "BRA" "BOL" 16 2014  359426904 2266.878 .2845528455284553  .21138211382113822 1 0 0
    "BRA" "BRN" 17 2009      51951 17221.85 .6991869918699187   .6341463414634146 0 0 0
    "BRA" "BRN" 17 2010      39346 17221.85 .4796747967479675   .3902439024390244 0 0 0
    "BRA" "BRN" 17 2011      49940 17221.85 .5934959349593496  .16260162601626016 0 0 0
    "BRA" "BRN" 17 2012       1876 17221.85 .3821138211382114   .5853658536585366 0 0 0
    end
    format %ty ANO
    where:

    ID_EXP: ISO code for exporter
    ID_IMP: ISO code for importer
    ID_NUM: numerical id for ID_IMP
    EXP_NRB: the value of non-resource based exports
    DIST: the distance between ID_EXP and ID_IMP
    SCO_EXP, SCO_IMP: score of the strictness of environmental regulation for ID_EXP and ID_IMP respectively
    FRO, LIN, COL: equal to 1 if ID_EXP and ID_IMP share a border, share a language, or have colonial ties, respectively; 0 otherwise

    The commands I am running are the following:

    Code:
    * Declare the panel: importer id by year
    tsset ID_NUM ANO, yearly
    * One dummy per importer (importer fixed effects)
    qui tab ID_NUM, gen(fe_ID_NUM)
    ppml EXP_NRB DIST SCO_EXP SCO_IMP FRO LIN COL fe_ID_NUM*, cluster(DIST)
    The result of my regression is (I've omitted the dummies for clarity):

    Code:
                    Exports “NRB”
                         (1)
    ------------------------------
    DIST              -0.00***
                      (0.00)
    SCO_EXP           -0.59
                      (0.38)
    SCO_IMP            0.52**
                      (0.24)
    FRO               -0.13**
                      (0.06)
    LIN               -0.39***
                      (0.03)
    Constant          20.33***
                      (0.10)
    ------------------------------
    Observations        714
    R2                0.894
    ------------------------------
    Standard errors in parentheses
    * p < 0.10, ** p < 0.05, *** p < 0.01
    I have a couple of questions:

    1) Is the way I estimated my model correct? Because I have a 1xN model, I only estimated the importer fixed effects. Also, I clustered my errors based on the distance (should I use ln(distance)?). Is this correct?
    2) Regarding the creation of ID_NUM, I simply assigned a sequence {1,2,3,...,n} of values to the column of importers, in such a way that AGO = 1, ALB = 2, etc. Is this correct?
    3) Assuming the approach is correct, why would FRO and LIN be negative? In theory, countries that share a common border and/or language should trade more with each other, not less.
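
    On question 2, a hand-assigned numeric id of this kind can be checked, or generated automatically. A minimal sketch, assuming the -dataex- example above is in memory; imp_id is a hypothetical variable name, not one from the dataset:

    Code:
    ```stata
    * Each importer-year combination should appear exactly once
    isid ID_NUM ANO

    * Each numeric id should correspond to a single ISO code
    bysort ID_NUM (ID_IMP): assert ID_IMP[1] == ID_IMP[_N]

    * Alternative to numbering by hand: ids 1, 2, 3, ... in the sort order of ID_IMP
    egen long imp_id = group(ID_IMP)
    ```

    The numbering itself carries no meaning; it only has to identify each importer uniquely so that it can be used in tsset or as a fixed effect.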

    I appreciate immensely any help.

    Kind regards,
    Pedro

    *ppml is a third-party command. You can install it using the following command:
    Code:
     ssc install ppml

  • #2
    Dear Pedro,

    1) looks fine
    2) looks fine, but consider using ppmlhdfe rather than ppml
    3) with these fixed effects, the coefficient of any variable that does not vary over time is not identified, so those estimates have no meaning
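
    Point 3 can be checked directly. A minimal sketch, assuming the -dataex- example from #1 is in memory and that ppmlhdfe is installed (ssc install ppmlhdfe, which also needs ftools and reghdfe). sd_DIST is a hypothetical helper variable, and clustering on ID_NUM is an assumption made for the illustration, not a recommendation:

    Code:
    ```stata
    * Within-importer standard deviation of distance: zero for every
    * importer means DIST never varies over time
    bysort ID_NUM: egen double sd_DIST = sd(DIST)
    summarize sd_DIST

    * Importer fixed effects absorbed directly from the id variable.
    * DIST, FRO, LIN and COL are constant within importer, so they are
    * collinear with the fixed effects and are left out here.
    ppmlhdfe EXP_NRB SCO_EXP SCO_IMP, absorb(ID_NUM) cluster(ID_NUM)
    ```

    With absorb(ID_NUM) there is no need to create the fe_ID_NUM* dummies by hand.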

    Best wishes,

    Joao



    • #3
      Hello, Joao! I really appreciate your help!

      Per your suggestion, I did the following:

      Code:
      
      . ppmlhdfe EXP_NRB SCO_IMP, absorb(fe_ID_NUM*#ANO FRO LIN COL SCO_EXP) cluster(DIST)
      (warning: absorbing 126 dimensions of fixed effects; check that you really want that)
      (dropped 24 observations that are either singletons or separated by a fixed effect)
      warning: dependent variable takes very low values after standardizing (3.2937e-08)
      Iteration 1:   deviance = 3.321e+10                  itol = 1.0e-04  subiters = 4   min(eta) =  -3.44  [p  ]
      Iteration 2:   deviance = 1.413e+10  eps = 1.35e+00  itol = 1.0e-04  subiters = 3   min(eta) =  -4.65  [   ]
      Iteration 3:   deviance = 1.056e+10  eps = 3.38e-01  itol = 1.0e-04  subiters = 3   min(eta) =  -5.73  [   ]
      Iteration 4:   deviance = 9.733e+09  eps = 8.48e-02  itol = 1.0e-04  subiters = 3   min(eta) =  -6.70  [   ]
      Iteration 5:   deviance = 9.531e+09  eps = 2.11e-02  itol = 1.0e-04  subiters = 3   min(eta) =  -7.70  [   ]
      Iteration 6:   deviance = 9.478e+09  eps = 5.59e-03  itol = 1.0e-04  subiters = 4   min(eta) =  -8.70  [p  ]
      Iteration 7:   deviance = 9.464e+09  eps = 1.45e-03  itol = 1.0e-04  subiters = 2   min(eta) =  -9.70  [   ]
      Iteration 8:   deviance = 9.461e+09  eps = 3.69e-04  itol = 1.0e-04  subiters = 2   min(eta) = -10.68  [   ]
      Iteration 9:   deviance = 9.460e+09  eps = 8.76e-05  itol = 1.0e-04  subiters = 2   min(eta) = -11.63  [   ]
      Iteration 10:  deviance = 9.460e+09  eps = 1.73e-05  itol = 1.0e-06  subiters = 2   min(eta) = -12.52  [   ]
      Iteration 11:  deviance = 9.460e+09  eps = 2.31e-06  itol = 1.0e-06  subiters = 4   min(eta) = -13.24  [p  ]
      Iteration 12:  deviance = 9.460e+09  eps = 2.45e-07  itol = 1.0e-06  subiters = 2   min(eta) = -13.67  [   ]
      Iteration 13:  deviance = 9.460e+09  eps = 1.16e-08  itol = 1.0e-08  subiters = 2   min(eta) = -13.79  [ s ]
      Iteration 14:  deviance = 9.460e+09  eps = 4.32e-11  itol = 1.0e-08  subiters = 5   min(eta) = -13.80  [ps ]
      Iteration 15:  deviance = 9.460e+09  eps = 7.30e-15  itol = 1.0e-10  subiters = 5   min(eta) = -13.80  [pso]
      Iteration 16:  deviance = 9.460e+09  eps = 3.53e-15  itol = 1.0e-10  subiters = 5   min(eta) = -13.80  [pso]
      ------------------------------------------------------------------------------------------------------------
      (legend: p: exact partial-out   s: exact solver   o: epsilon below tolerance)
      Converged in 16 iterations and 51 HDFE sub-iterations (tol = 1.0e-08)
      
      HDFE PPML regression                              No. of obs      =        708
      Absorbing 126 HDFE groups                         Residual df     =        117
      Statistics robust to heteroskedasticity           Wald chi2(1)    =       7.68
      Deviance             =   9459944363               Prob > chi2     =     0.0056
      Log pseudolikelihood =  -4729978122               Pseudo R2       =     0.9765
      
      Number of clusters (DIST)   =        118
                                       (Std. Err. adjusted for 118 clusters in DIST)
      ------------------------------------------------------------------------------
                   |               Robust
           EXP_NRB |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
           SCO_IMP |   .3647208   .1315784     2.77   0.006     .1068318    .6226097
             _cons |   20.59135   .0660034   311.97   0.000     20.46199    20.72072
      ------------------------------------------------------------------------------
      Is this correct? (My regression looks so "empty".) Is there any way for me to retrieve the coefficient of SCO_EXP? It would be really nice if I could estimate the regression in a way that lets me analyze both SCO_EXP and SCO_IMP. Any ideas?

      Thanks a bunch!



      • #4
        Dear Pedro,

        If you only have one exporter and include importer fixed effects, you cannot identify the effect of most traditional gravity variables. You need to think carefully about what you are doing and see whether the data you are using are suitable for your purpose.

        Best wishes,

        Joao



        • #5
          Hello, Joao!

          Sorry for taking so long to answer, I was expanding my database. I'm currently working with the bilateral flows of 50 countries from 2010-2014. I've controlled for exporter and importer fixed effects and the results now make sense!

          I appreciate the help you gave. If I may ask another question, how do we interpret the pseudo R²? Also, when using ppml I should use the exponential form of the equation, right? Should the continuous independent variables be linearized?

          Thanks!


          • #6
            Dear Pedro Cunha,

            The R2 is the square of the correlation between y and the fitted values; that interpretation is also valid in the standard linear model.

            I am not sure I understand your other questions. Indeed, ppml estimates an exponential model, and the continuous independent variables are often in logs.
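
            The squared-correlation definition can be reproduced by hand. A minimal sketch, assuming the -dataex- variables from #1 and an illustrative importer fixed-effects specification; yhat is a hypothetical variable name. The statistic that a given command prints under the label "pseudo R2" may be defined differently, so the two numbers need not match:

            Code:
            ```stata
            * The d option stores the sum of the fixed effects, which is
            * assumed here to be needed for predicting the conditional mean
            ppmlhdfe EXP_NRB SCO_EXP SCO_IMP, absorb(ID_NUM) cluster(ID_NUM) d

            * Fitted values in levels: exp() of the linear index
            predict double yhat, mu

            * Squared correlation between the outcome and the fitted values
            correlate EXP_NRB yhat if e(sample)
            display "R2 = " r(rho)^2
            ```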

            Best wishes,

            Joao



            • #7
              Hello, Joao Santos Silva!

              I'm sorry to bother you again, but how can we analyze the equation for the gravity model? Considering we have as variables contig, comlang_off, colony, distwces, SCOi, SCOj (the first three are dummies), my model would be:
              [Image: Modelo.png]


              is that correct?

              If so, how do I analyze the impact of, for instance, SCOi and SCOj?

              Thanks!

              ¹If values make it easier to explain, here are my results:

              [Image: image_16070.png]
              Last edited by Pedro Cunha; 29 Oct 2019, 15:52.



              • #8
                Dear Pedro Cunha,

                I am afraid that is not correct. Please check the literature on the gravity equation for trade to learn about the specification of the model and the interpretation of the parameters.

                Best wishes,

                Joao



                • #9
                  Dear Joao Santos Silva, I'm sorry for my last question: I should've done the research first before asking! Looking over your paper and many others, I believe my model would be:

                  [Image: Modelo'.png]


                  Is it correct now?

                  Is it mandatory for all my continuous independent variables to be linearized? I am asking that because I'd rather not apply the natural log to my variables of interest (the scores) if I don't have to.

                  Thanks again, Joao! You've been a lifesaver.



                  • #10
                    Dear Pedro Cunha,

                    The model looks good now and you do not have to log the scores.

                    Best wishes,

                    Joao



                    • #11
                      Thanks for all your help, Joao Santos Silva! My issues have all been solved for now. I'll return if anything new arises.

                      Have a great day!



                      • #12
                        Dear Joao Santos Silva, I am sorry for reopening this thread, but I am revisiting my work in order to improve it and I have two questions:

                        i) It seems unanimous that we should use the logarithm of the distance and not the distance itself. However, wasn't the point of PPML to avoid applying the log transformation to the model? Or is the transformation only problematic for the dependent variable and the error term?

                        ii) regarding the marginal effects of the model: they are non-linear, correct? In order to find the effect of setting a dummy to 1, for example, I also have to calculate exp(x'B)? Is there a function that gives me the average partial effects of a model estimated using PPML?

                        Thank you!
                        Last edited by Pedro Cunha; 08 Feb 2021, 08:31.



                        • #13
                          Dear Pedro Cunha,

                          i) yes, taking logs is only a problem with the dependent variable; you should use log of distance as regressor.

                          ii) The coefficients have the usual interpretation as elasticities, or semi-elasticities in the case of dummies and other regressors that are not logged.
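
                          The reading in ii) can be made concrete: for a regressor entered in logs the coefficient is an elasticity, while for a dummy with coefficient b the implied percentage difference is 100*(exp(b) - 1). A minimal sketch, assuming the -dataex- variables from #1; ln_DIST is a hypothetical name, and the specification omits fixed effects purely so that the dummies are identified in the illustration, not as a recommendation:

                          Code:
                          ```stata
                          * Distance enters in logs; the dependent variable stays in levels
                          gen double ln_DIST = ln(DIST)
                          ppml EXP_NRB ln_DIST SCO_EXP SCO_IMP FRO LIN, cluster(ID_NUM)

                          * _b[ln_DIST] is read directly as the distance elasticity.
                          * Exact percentage effect of switching the border dummy from 0 to 1,
                          * with a delta-method standard error
                          nlcom (pct_FRO: 100*(exp(_b[FRO]) - 1))
                          ```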

                          Best wishes,

                          Joao



                          • #14
                            Dear Joao Santos Silva, thanks for the swift response! I have two last questions, if you don't mind.

                            i) Is it a problem to include no fixed effects in my estimation, or to include only destination fixed effects? I'm interested in the effects of two variables in particular: score_origin and score_destiny, two score variables ranging from 0 to 1. Since they are fixed for every origin-destination pair in each year, I can't use pair-wise fixed effects, right? I'm also controlling for the logged GDPs of both origin and destination.

                            ii) How would the interpretation of the semi-elasticity of these scores go? Should I consider an increase of 1 standard deviation? The usual "an increase of 1 unit in x is related to a beta% increase/decrease in y" doesn't seem to apply here.

                            Thank you!

                            Kind regards,
                            Pedro
                            Last edited by Pedro Cunha; 08 Feb 2021, 21:13.



                            • #15
                              Dear Pedro Cunha,

                              i) Only you can answer that; that is a key part of your research.
                              ii) It is the usual interpretation, at least for reasonably small coefficients (say, smaller than 0.1).
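
                              The caveat about small coefficients in ii) comes from the approximation exp(b) - 1 ≈ b, which deteriorates as b grows. A minimal sketch comparing the two readings; the values of b and the 0.1 change in the score are arbitrary illustrations, not estimates from this thread:

                              Code:
                              ```stata
                              * Percentage change in y for a one-unit change in x: approximate vs exact
                              foreach b in 0.05 0.10 0.50 {
                                  display "b = `b'   approx: " %6.2f 100*`b' "   exact: " %6.2f 100*(exp(`b')-1)
                              }

                              * For a score bounded between 0 and 1 a smaller change may be more
                              * natural, e.g. a 0.1 increase with b = 0.50
                              display %6.2f 100*(exp(0.50*0.1)-1)
                              ```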

                              Best wishes,

                              Joao
