PPML Gravity Model help requested

majid lateef

Join Date: Apr 2016

Posts: 9
#1

27 Apr 2016, 08:19

I am trying to estimate the impact of a bilateral trade agreement on agricultural trade. I am inspired by the work of Santos Silva & Tenreyro (2006) on the PPML estimator and would like to use it in my research. My data cover Pakistan's agricultural exports to 50 countries over 14 years. I have added two dummy variables of interest and want to include time fixed effects.
I have the following questions:
My RESET results do not look consistent. Is there a possible reason for this, and do you have any suggestions for correcting it?

My data contain no zero trade values. In this case, should I still prefer PPML over OLS?

Any other suggestions and recommendations are welcome.
Thank you in anticipation.

BY PPML:

predict fit, xb
(3 missing values generated)

. gen fit2=fit^2
(3 missing values generated)

. ppml agriexppk pc fta lgdpimpr lpopimpr ldistcap ler lagriland Comcol contig colony fit2 F_*

note: checking the existence of the estimates
WARNING: agriexppk has very large values, consider rescaling
WARNING: lgdpimpr has very large values, consider rescaling or recentering
WARNING: lpopimpr has very large values, consider rescaling or recentering
WARNING: lagriland has very large values, consider rescaling or recentering
WARNING: fit2 has very large values, consider rescaling or recentering

Number of regressors excluded to ensure that the estimates exist: 0
Number of observations excluded: 0

note: starting ppml estimation
note: agriexppk has noninteger values

Iteration 1: deviance = 2.23e+07
Iteration 2: deviance = 2.02e+07
Iteration 3: deviance = 2.01e+07
Iteration 4: deviance = 2.01e+07
Iteration 5: deviance = 2.01e+07
Iteration 6: deviance = 2.01e+07

Number of parameters: 25
Number of observations: 697
Pseudo log-likelihood: -10071968
R-squared: .77763997
Option strict is: off
------------------------------------------------------------------------------
             |             Semirobust
   agriexppk |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          pc |  -4.023749   .7586098    -5.30   0.000    -5.510597   -2.536901
         fta |  -2.207587   .4853811    -4.55   0.000    -3.158916   -1.256257
    lgdpimpr |  -.4123445   .0944844    -4.36   0.000    -.5975306   -.2271584
    lpopimpr |   1.148529   .2470537     4.65   0.000     .6643125    1.632745
    ldistcap |    3.48049   .6613658     5.26   0.000     2.184237    4.776743
         ler |  -.2522311   .0602458    -4.19   0.000    -.3703108   -.1341514
   lagriland |  -.6025691   .1250249    -4.82   0.000    -.8476133   -.3575249
      Comcol |  -1.056128   .2360891    -4.47   0.000    -1.518854   -.5934016
      contig |   1.961385   .3515048     5.58   0.000     1.272448    2.650321
      colony |  -2.435843   .5272419    -4.62   0.000    -3.469218   -1.402468
        fit2 |   .1519963   .0219224     6.93   0.000     .1090292    .1949633
F_1Year_2002 |  -.1669563   .1991266    -0.84   0.402    -.5572373    .2233247
F_1Year_2003 |  -.6412399   .2368302    -2.71   0.007    -1.105419   -.1770611
F_1Year_2004 |   -.677247   .2397869    -2.82   0.005    -1.147221   -.2072734
F_1Year_2005 |  -1.242436   .3274378    -3.79   0.000    -1.884203     -.60067
F_1Year_2006 |  -1.243469   .3439022    -3.62   0.000    -1.917505   -.5694329
F_1Year_2007 |  -1.479224   .3537169    -4.18   0.000    -2.172496   -.7859518
F_1Year_2008 |   -2.61203   .5906815    -4.42   0.000    -3.769745   -1.454316
F_1Year_2009 |  -2.129372   .4768287    -4.47   0.000    -3.063939   -1.194805
F_1Year_2010 |  -2.591113    .552161    -4.69   0.000    -3.673329   -1.508897
F_1Year_2011 |  -3.527925   .7377195    -4.78   0.000    -4.973829   -2.082021
F_1Year_2012 |  -3.285037   .6943818    -4.73   0.000       -4.646   -1.924074
F_1Year_2013 |  -3.496063   .7346149    -4.76   0.000    -4.935881   -2.056244
F_1Year_2014 |  -3.385662   .7185074    -4.71   0.000    -4.793911   -1.977414
       _cons |  -34.60186   7.886442    -4.39   0.000      -50.059   -19.14472
------------------------------------------------------------------------------

. test fit2=0

( 1) fit2 = 0

chi2( 1) = 48.07
Prob > chi2 = 0.0000

BY OLS:

predict fit, xb
(3 missing values generated)

. gen fit2=fit^2
(3 missing values generated)

. regres lagriexppk pc fta lgdpimpr lpopimpr ldistcap ler lagriland Comcol contig colony fit2 F_*, robust

Linear regression                               Number of obs     =        697
                                                F( 24, 672)       =      33.25
                                                Prob > F          =     0.0000
                                                R-squared         =     0.4673
                                                Root MSE          =     1.3866

------------------------------------------------------------------------------
             |               Robust
  lagriexppk |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          pc |   9.969214   2.239035     4.45   0.000     5.572868    14.36556
         fta |   4.915393   1.075384     4.57   0.000     2.803876    7.026909
    lgdpimpr |   1.288349   .3207756     4.02   0.000      .658506    1.918192
    lpopimpr |  -.7609452   .1952006    -3.90   0.000    -1.144222   -.3776688
    ldistcap |  -10.49277   2.335135    -4.49   0.000    -15.07781   -5.907733
         ler |    .438851   .1066996     4.11   0.000     .2293463    .6483556
   lagriland |   .9132004   .2022078     4.52   0.000     .5161653    1.310236
      Comcol |   5.458224    1.22512     4.46   0.000       3.0527    7.863749
      contig |    -7.6576   1.751437    -4.37   0.000    -11.09655   -4.218653
      colony |   6.331565    1.34494     4.71   0.000     3.690775    8.972355
        fit2 |  -.1536268   .0397374    -3.87   0.000    -.2316511   -.0756025
F_1Year_2002 |   3.049975   .9281477     3.29   0.001     1.227557    4.872394
F_1Year_2003 |   4.174452    1.14771     3.64   0.000     1.920923    6.427982
F_1Year_2004 |   4.756569   1.262242     3.77   0.000     2.278156    7.234981
F_1Year_2005 |   6.016777   1.519713     3.96   0.000     3.032819    9.000735
F_1Year_2006 |    6.08904   1.534441     3.97   0.000     3.076164    9.101916
F_1Year_2007 |   6.755335   1.665978     4.05   0.000     3.484186    10.02648
F_1Year_2008 |   8.696563   2.076598     4.19   0.000     4.619163    12.77396
F_1Year_2009 |   8.503707   2.044425     4.16   0.000     4.489477    12.51794
F_1Year_2010 |    9.50537   2.256293     4.21   0.000     5.075138     13.9356
F_1Year_2011 |   11.17181   2.611969     4.28   0.000     6.043205    16.30041
F_1Year_2012 |   10.79001   2.529944     4.26   0.000     5.822469    15.75756
F_1Year_2013 |   11.88694   2.770768     4.29   0.000     6.446531    17.32734
F_1Year_2014 |   11.30623   2.645622     4.27   0.000      6.11155    16.50091
       _cons |   107.5101   21.09863     5.10   0.000     66.08291    148.9373
------------------------------------------------------------------------------

. test fit2=0

( 1) fit2 = 0

F( 1, 672) = 14.95
Prob > F = 0.0001
Tags: gravity model, PPML, reset
Joao Santos Silva

Join Date: Apr 2014

Posts: 3011
#2

27 Apr 2016, 12:31

Dear Majid,

I am glad you liked the "Log of Gravity". About your questions:

1. Indeed your models appear to be failing the RESET. First of all, please check that you are performing the test correctly; our webpage has code illustrating how to do it. If the model really fails the test, you need to think about ways of improving your model. Maybe you can add other important regressors? Or maybe you just need to include interactions (or cross-products) of the regressors that you already have.

2. The main reason to prefer PPML is not the zeros but the heteroskedasticity of trade data. So, even without zeros, PPML is generally preferable.
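The heteroskedasticity argument in point 2 can be written out in a few lines. This is only a sketch of the reasoning in Santos Silva and Tenreyro (2006); the symbols $y_i$ (trade flow), $x_i$ (regressors), $\beta$ and $\eta_i$ (multiplicative error) are notation assumed here and do not appear in the thread.

```latex
% Multiplicative gravity model with a mean-one error term
y_i = \exp(x_i \beta)\,\eta_i , \qquad E[\eta_i \mid x_i] = 1 .

% Taking logs gives the equation that OLS estimates
\ln y_i = x_i \beta + \ln \eta_i .

% OLS on the log equation is consistent for the slopes only if
% E[\ln \eta_i \mid x_i] does not depend on x_i.
% By Jensen's inequality E[\ln \eta_i \mid x_i] \neq \ln E[\eta_i \mid x_i] = 0,
% and its value depends on higher moments of \eta_i, such as its variance.
% If Var(\eta_i \mid x_i) changes with x_i (heteroskedasticity), then
% E[\ln \eta_i \mid x_i] generally changes with x_i as well,
% so the log-linear OLS estimates are inconsistent.

% PPML only requires the conditional mean to be correctly specified:
E[y_i \mid x_i] = \exp(x_i \beta) .
```

None of this involves zero trade flows, which is why the preference for PPML does not hinge on them.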

Finally, note that your sample is quite small, so try to keep your model as parsimonious as possible.
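
The RESET check mentioned in point 1 can be sketched as follows. This is a minimal sketch, not the code from the authors' webpage: it assumes the original poster's dataset is in memory, reuses the variable names from the output in post #1 (with F_* as the year dummies), and is meant to be run from a do-file because of the `///` line continuations. The key point is that fit and fit2 must come from a baseline model that does not itself contain fit2.

```stata
* RESET for a PPML gravity model (ppml is a user-written command from SSC).
* Assumption: the poster's data are loaded; variable names come from post #1.

* Step 1: estimate the baseline model WITHOUT fit2.
ppml agriexppk pc fta lgdpimpr lpopimpr ldistcap ler lagriland ///
    Comcol contig colony F_*

* Step 2: save the fitted linear index x'b and square it.
capture drop fit fit2
predict double fit, xb
generate double fit2 = fit^2

* Step 3: re-estimate with fit2 added and test its coefficient.
* A small p-value means the RESET rejects the specification.
ppml agriexppk pc fta lgdpimpr lpopimpr ldistcap ler lagriland ///
    Comcol contig colony F_* fit2
test fit2 = 0
```

The same three steps apply to the OLS version, with `regress lagriexppk ..., robust` in place of `ppml agriexppk ...`.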

Best regards,

Joao
majid lateef

Join Date: Apr 2016

Posts: 9
#3

28 Apr 2016, 00:34

Respected Joao,
I am very thankful to you for your kindness, and I appreciate that you are active on this forum helping people around the globe.
Joao Santos Silva

Join Date: Apr 2014

Posts: 3011
#4

28 Apr 2016, 11:20

My pleasure!

Joao
majid lateef

Join Date: Apr 2016

Posts: 9
#5

29 Apr 2017, 03:37

Hi,
Please have a look at the following Stata results.
My question is: why does Stata exclude some variables when running PPML? What does this mean, and how can it be fixed if one cannot afford to exclude those variables? Can we say that these variables are not significant?
Thank you so much.
Majid Lateef

ppml agriexp000 lgdpimp lpopimp ldistcap ler lagriland Comcol comlang_off pc fta

note: checking the existence of the estimates
WARNING: agriexp000 has very large values, consider rescaling
WARNING: lgdpimp has very large values, consider rescaling or recentering
WARNING: lpopimp has very large values, consider rescaling or recentering
WARNING: lagriland has very large values, consider rescaling or recentering

Number of regressors excluded to ensure that the estimates exist: 2
Excluded regressors: pc fta
Number of observations excluded: 0

note: starting ppml estimation
note: agriexp000 has noninteger values

Iteration 1: deviance = 4.69e+08
Iteration 2: deviance = 3.25e+08
Iteration 3: deviance = 3.11e+08
Iteration 4: deviance = 3.11e+08
Iteration 5: deviance = 3.11e+08
Iteration 6: deviance = 3.11e+08
Iteration 7: deviance = 3.11e+08

Number of parameters: 8
Number of observations: 1473
Pseudo log-likelihood: -1.554e+08
R-squared: .82517599
Option strict is: off
------------------------------------------------------------------------------
             |               Robust
  agriexp000 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lgdpimp |   .8985808   .0316378    28.40   0.000     .8365718    .9605899
     lpopimp |   .2775649   .0521369     5.32   0.000     .1753785    .3797514
    ldistcap |   .2061797   .0689225     2.99   0.003      .071094    .3412654
         ler |   .1871443   .0128657    14.55   0.000      .161928    .2123605
   lagriland |  -.3276343   .0263171   -12.45   0.000    -.3792148   -.2760538
      Comcol |    .071979   .2206157     0.33   0.744    -.3604198    .5043779
 comlang_off |   .3279726   .1464221     2.24   0.025     .0409905    .6149548
       _cons |  -14.57945   .8334455   -17.49   0.000    -16.21298   -12.94593
------------------------------------------------------------------------------

Last edited by majid lateef; 29 Apr 2017, 03:47.
Joao Santos Silva

Join Date: Apr 2014

Posts: 3011
#7

29 Apr 2017, 04:09

Dear Majid,

Can you please let us know whether these results were obtained with the latest version of ppml (available from SSC)? If they were not, please update ppml and post the new results.
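
One way to check the installed version and refresh it, as a short sketch that assumes an internet connection and that ppml was originally installed from SSC:

```stata
* Show which ppml.ado file Stata is using, together with its version header.
which ppml

* Reinstall from SSC, overwriting any older copy already on disk.
ssc install ppml, replace
```

The version line printed by `which ppml` is the most direct thing to quote when reporting which release produced a set of results.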

Best wishes,

Joao
majid lateef

Join Date: Apr 2016

Posts: 9
#8

29 Apr 2017, 04:23

Dear Joao,
These results were obtained after running the following command:
ssc install ppml
checking ppml consistency and verifying not already installed...
all files already exist and are up to date.

Thanks for your quick response.
Majid
Joao Santos Silva

Join Date: Apr 2014

Posts: 3011
#9

29 Apr 2017, 05:15

Then I believe that both variables that are dropped are perfect predictors and have to be dropped. What surprises me is that no observations are dropped. Are pc and fta dummies?

Joao
majid lateef

Join Date: Apr 2016

Posts: 9
#10

29 Apr 2017, 07:39

Yes, both are dummy variables.
Majid
Joao Santos Silva

Join Date: Apr 2014

Posts: 3011
#11

29 Apr 2017, 07:58

If you can post your data or send it to me by email I'll have a look at it.

Best wishes,

Joao
majid lateef

Join Date: Apr 2016

Posts: 9
#12

29 Apr 2017, 09:01

Yes, sure. Please tell me your email address so that I can send you the data file.
Thanking you in anticipation,
Majid Lateef
Joao Santos Silva

Join Date: Apr 2014

Posts: 3011
#13

29 Apr 2017, 09:30

[email protected]
majid lateef

Join Date: Apr 2016

Posts: 9
#14

29 Apr 2017, 22:47

I have sent an email to you. Thank you so much.
Majid Lateef
Joao Santos Silva

Join Date: Apr 2014

Posts: 3011
#15

01 May 2017, 02:02

Dear Majid,

Thank you for sending the data. Those variables are dropped because other variables have missing values when they are equal to 1. So, after dropping the missings, those dummies have no variation. In a next update of -ppml- I'll try to find a way of providing more helpful warnings in these cases.
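
This situation can be checked before estimation. The sketch below assumes the dataset from post #5 is in memory and reuses its variable names; `insample` is a hypothetical helper variable introduced only for this illustration. It should be run from a do-file because of the `///` line continuations.

```stata
* Flag observations that survive listwise deletion, i.e. those with no
* missing value in the dependent variable or in any regressor.
generate byte insample = !missing(agriexp000, lgdpimp, lpopimp, ldistcap, ///
    ler, lagriland, Comcol, comlang_off, pc, fta)

* If pc or fta takes a single value in the insample == 1 column, the dummy
* has no variation in the estimation sample and cannot stay in the model.
tabulate pc insample, missing
tabulate fta insample, missing

* See which regressors are missing when either agreement dummy equals 1.
misstable summarize lgdpimp lpopimp ldistcap ler lagriland Comcol ///
    comlang_off if pc == 1 | fta == 1
```

If the tabulations confirm the problem, the dummies can only be kept by filling the missing values of the other regressors from a reliable source, or by leaving out the regressor that is missing for those observations.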

Best wishes,

Joao