  • Gravity model with ppml command

    Hi all,

    I am a master's student trying to estimate a gravity model, but I think I am running into problems. I have a dataset with 1,663,200 observations on exports and imports, broken down by 120 products, for 22 Italian regions trading with 35 countries over the period 1995-2012. I want to include origin, product, time and destination fixed effects.

    For this I have created dummy variables for each of these using:

    xi i.region*i.year, prefix(O*)
    xi i.product*i.year, prefix(P*)
    xi i.country*i.year, prefix(C*)

    At the end of this process I have nearly 4,000 variables. Now I am trying to use ppml, first including separate fixed effects for region, product, year and country, and then the combined fixed effects.

    ppml [dependent variable] [14 independent variables] [dummy variables for region, product, country and year], cluster([identifier of each region-country-product triple])

    The problem is that I started the first step yesterday and Stata is still iterating (it is currently at iteration 245). Is this normal for a dataset of this size, or is there a problem with how I wrote the command or specified the model?

    Thanks everybody
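
The clustering variable described above, one identifier per region-country-product triple, can be built with egen's group() function. The following is only a minimal sketch on a tiny made-up panel: the names region, country, product and pairid follow the posts in this thread, while the toy data and panel dimensions are assumptions for illustration.

```stata
* Toy panel (made-up data): 2 regions x 3 countries x 2 products x 2 years
clear
set obs 24
gen region  = mod(_n - 1, 2) + 1
gen country = mod(floor((_n - 1) / 2), 3) + 1
gen product = mod(floor((_n - 1) / 6), 2) + 1
gen year    = floor((_n - 1) / 12) + 1995

* One id per region-country-product triple, constant across years
egen pairid = group(region country product)

* Sanity check: every triple is observed once per year (2 years here)
bysort pairid: assert _N == 2
```

The resulting pairid can then be passed to the cluster() option, as in the command shown in #2.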

  • #2
    This is what I ran and what Stata is doing:

    ppml EXP2 lnyy lnpilprocapite lndistcity lnsupsup BORDER UE Euro Lang Landlo_dest Landlo_orig lnmount Autonomy Presence Riconoscimento cris
    > is _OPR* _DDES* _PPRO*, cluster(pairid)
    note: checking the existence of the estimates
    WARNING: EXP2 has very large values, consider rescaling
    WARNING: lnyy has very large values, consider rescaling or recentering
    WARNING: lnpilprocapite has very large values, consider rescaling or recentering
    WARNING: lnsupsup has very large values, consider rescaling or recentering
    note: starting ppml estimation

    Iteration 1: deviance = 3.74e+12
    Iteration 2: deviance = 2.69e+12
    Iteration 3: deviance = 2.52e+12
    Iteration 4: deviance = 2.49e+12
    Iteration 5: deviance = 2.49e+12
    Iteration 6: deviance = 2.49e+12
    Iteration 7: deviance = 2.49e+12
    Iteration 8: deviance = 2.49e+12
    Iteration 9: deviance = 2.49e+12
    Iteration 10: deviance = 2.49e+12
    Iteration 11: deviance = 2.49e+12
    Iteration 12: deviance = 2.49e+12
    Iteration 13: deviance = 2.49e+12
    Iteration 14: deviance = 2.49e+12
    Iteration 15: deviance = 2.49e+12

    From that point on, the deviance stays at the same value.



    • #3
      Dear Giancarlo,

      The first warning that you get is that your dependent variable has very large values. If you rescale it (say, divide it by 1e3 or 1e6), the problem may go away. On a side note, you need to think about whether you are asking too much from your data.

      All the best,

      Joao
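
To see why rescaling the dependent variable is harmless in a Poisson pseudo-ML regression, here is a small sketch on simulated data. Everything in it is an assumption for illustration (the data, the variable names y, y_k and x, and the use of the built-in poisson command with robust standard errors in place of the user-written ppml, so that it runs without extra installs). Dividing y by 1e3 should leave the slope unchanged and shift the constant by -ln(1e3), up to convergence tolerance.

```stata
* Simulated data (assumption: this is not the poster's dataset)
clear
set seed 2014
set obs 2000
gen x = rnormal()
* Large, non-integer outcome with conditional mean exp(12 + 0.5*x)
gen double y = exp(12 + 0.5*x) * rgamma(2, 0.5)

* Poisson pseudo-ML on the original scale
poisson y x, vce(robust)
scalar b_raw    = _b[x]
scalar cons_raw = _b[_cons]

* Same model after dividing the dependent variable by 1e3
gen double y_k = y / 1e3
poisson y_k x, vce(robust)

* Compare: slope difference should be about 0,
* constant difference should be about -ln(1e3)
scalar d_slope = _b[x] - b_raw
scalar d_cons  = _b[_cons] - cons_raw
display "slope difference:    " d_slope
display "constant difference: " d_cons
display "-ln(1e3):            " -ln(1e3)
```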



      • #4
        Dear Joao,

        Thank you for your reply. I will try rescaling the dependent variable (and the independent ones too, I suppose?) and see what happens. I have tried different types of regression to find the best way to estimate the model. Using ppml with fixed effects in this way was my professor's suggestion.

        I am also fairly new to Stata, so I have no idea how long such an estimation should take. If you can suggest a more suitable command or procedure, that would be more than welcome.

        Thank you



        • #5
          Dear Giancarlo,

          Rescaling the variables should help (notice that the need to rescale is specific to Stata; most other software does not require it). With such a large model, estimation will always take some time.

          Good luck with your work,

          Joao



          • #6
            Thank you very much, I will let you know once I have tried again.

            Giancarlo



            • #7
              Thank you, Joao.

              It works now. However, Stata tells me that my dependent variable has non-integer values (unsurprisingly, after dividing it by 1e3). Is this a problem?

              Thank you



              • #8
                No, not a problem at all; glad it worked!

                Joao



                • #9
                  Missing reference here:

                  SJ-11-2 st0225 . . . . . . . . . . . . . . . poisson: Some convergence issues
                  (help ppml if installed) . . . J. M. C. Santos Silva and S. Tenreyro
                  Q2/11 SJ 11(2):207--212
                  provides improved Poisson regression by checking for the
                  existence of the estimates and providing two methods for
                  dropping regressors that cause nonexistence of estimates



                  • #10
                    Hi all,

                    I have a question. I estimated a gravity model with both the PPML and OLS estimators. The RESET test p-value is 0.287 for OLS and 0.002 for PPML.

                    My result is different from the one in Santos Silva & Tenreyro's study. Why?

                    OLS Estimation:

                    test fit2=0

                    ( 1) fit2 = 0

                    F( 1, 126) = 1.14
                    Prob > F = 0.2873


                    PPML Estimation:

                    test fit2=0

                    ( 1) fit2 = 0

                    chi2( 1) = 9.31
                    Prob > chi2 = 0.0023



                    Do you think my result is wrong?



                    • #11
                      Dear Milad,

                      Without knowing which models you are estimating and what kind of data you have, it is impossible to comment on this result. Maybe you should start a new thread for your question.

                      Best wishes,

                      Joao



                      • #12
                        Dear Joao,

                        Thank you for your reply.

                        My data is:

                        Dependent Variable: Export of Dates to EU countries in 2013
                        Exporters: 12 countries (Top Exporters such as Tunisia, Saudi Arabia, …)
                        Importers: 28 countries (European Union)
                        Year: 2013


                        ppml value lgdpx lgdpi lgdppx lgdppi ldis landli landlx, cluster(ldis)

                        note: checking the existence of the estimates

                        Number of regressors excluded to ensure that the estimates exist: 0
                        Number of observations excluded: 0

                        note: starting ppml estimation

                        Iteration 1: deviance = 430207
                        Iteration 2: deviance = 329435.5
                        Iteration 3: deviance = 314276.6
                        Iteration 4: deviance = 312864.6
                        Iteration 5: deviance = 312608.8
                        Iteration 6: deviance = 312569.2
                        Iteration 7: deviance = 312567.4
                        Iteration 8: deviance = 312567.4
                        Iteration 9: deviance = 312567.4


                        Number of parameters: 8
                        Number of observations: 334
                        Pseudo log-likelihood: -156873.22
                        R-squared: .5871964
                        Option strict is: off
                        (Std. Err. adjusted for 172 clusters in ldis)
                        ------------------------------------------------------------------------------
                                     |               Robust
                               value |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                        -------------+----------------------------------------------------------------
                               lgdpx |  -.8473654   .1630437    -5.20   0.000    -1.166925   -.5278056
                               lgdpi |   1.025005   .1124659     9.11   0.000     .8045761    1.245434
                              lgdppx |   .3310993   .1892837     1.75   0.080    -.0398899    .7020884
                              lgdppi |   .2646335   .2307802     1.15   0.252    -.1876874    .7169544
                                ldis |  -.6218436   .1669692    -3.72   0.000    -.9490972     -.29459
                              landli |   .2435175   .3134024     0.78   0.437    -.3707399    .8577748
                              landlx |   4.922072   .6066905     8.11   0.000      3.73298    6.111163
                               _cons |   2.654003   1.778816     1.49   0.136    -.8324121    6.140418
                        ------------------------------------------------------------------------------



                        • #13
                          This is a very atypical dataset because it surely does not have the zeros and the heteroskedasticity that characterize trade data and motivate the use of PPML. This, however, may explain why PPML has no advantage over OLS, but does not explain the superiority of OLS. Can you please show us the commands you used to perform the RESET tests and the OLS results?

                          Joao



                          • #14
                            Dear Joao,

                            Yes, here they are.

                            reg lvalue lgdpx lgdpi lgdppx lgdppi ldis landli landlx, cluster( ldis)

                            Linear regression                               Number of obs   =        174
                                                                            F(7, 126)       =      25.61
                                                                            Prob > F        =     0.0000
                                                                            R-squared       =     0.3298
                                                                            Root MSE        =      1.988

                            (Std. Err. adjusted for 127 clusters in ldis)
                            ------------------------------------------------------------------------------
                                         |               Robust
                                  lvalue |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                            -------------+----------------------------------------------------------------
                                   lgdpx |  -.2589429   .1604681    -1.61   0.109    -.5765046    .0586188
                                   lgdpi |   .8058141   .1025243     7.86   0.000     .6029215    1.008707
                                  lgdppx |   .0675892   .2106754     0.32   0.749    -.3493312    .4845096
                                  lgdppi |  -.0855848   .2723315    -0.31   0.754    -.6245209    .4533512
                                    ldis |   -.966287   .1909489    -5.06   0.000    -1.344169   -.5884047
                                  landli |   .4627078    .472598     0.98   0.329    -.4725497    1.397965
                                  landlx |   2.135198   .2848611     7.50   0.000     1.571466     2.69893
                                   _cons |   6.874997   1.926631     3.57   0.001     3.062251    10.68774
                            ------------------------------------------------------------------------------


                            RESET test:


                            . predict fit, xb

                            . gen fit2=fit^2

                            . reg lvalue lgdpx lgdpi lgdppx lgdppi ldis landli landlx fit2, cluster( ldis)

                            . test fit2=0

                            ( 1) fit2 = 0

                            F( 1, 126) = 1.14
                            Prob > F = 0.2873



                            • #15
                              Thanks. You have more zeros than I expected, so OLS is clearly not a good choice. Can you show us the code used for the RESET in PPML?

                              Cheers,

                              Joao
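
For readers following along, a RESET test after a Poisson pseudo-ML fit usually mirrors the OLS version in #14: fit the model, predict the linear index, add its square as a regressor, refit, and test the added term. The sketch below is not the code Milad ran. It uses simulated data and the built-in poisson command with robust standard errors so that it runs without installing anything, and the names y, x1, x2, fit and fit2 are assumptions. With ppml, the two poisson lines would be replaced by the corresponding ppml calls with the same cluster() option as in the original regression.

```stata
* Simulated data (assumption: stands in for the trade data in this thread)
clear
set seed 2013
set obs 500
gen x1 = rnormal()
gen x2 = rnormal()
gen double y = exp(1 + 0.5*x1 - 0.3*x2) * rgamma(2, 0.5)
* Add some zeros at random, since trade data usually contain them
replace y = 0 if runiform() < 0.2

* Step 1: fit the model and save the linear index x'b
poisson y x1 x2, vce(robust)
predict double fit, xb

* Step 2: add the squared index as an extra regressor and refit
gen double fit2 = fit^2
poisson y x1 x2 fit2, vce(robust)

* Step 3: Wald test on the added regressor; a small p-value
* suggests the functional form is misspecified
test fit2 = 0
```

Note that predict with the xb option is used on purpose: the test is based on the square of the linear index, not of the fitted mean exp(x'b).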

