Gravity model with ppml command

Giancarlo Carta

Join Date: Nov 2014

Posts: 13
#1


18 Nov 2014, 07:16

Hi all,

I am a master's student trying to estimate a gravity model, but I think I am running into problems. I have a dataset with 1,663,200 observations on exports and imports, broken down by 120 products, for 22 regions in Italy trading with 35 countries over the period 1995-2012. I want to include origin, product, time and destination fixed effects.

For this I have created dummy variables for each of these using:

xi i.region*i.year, prefix(O*)
xi i.product*i.year, prefix(P*)
xi i.country*i.year, prefix(C*)

At the end of this process I have nearly 4,000 variables. I am now trying to use ppml, first including separate fixed effects for each region, product, year and country, and then the combined fixed effects.

ppml <dependent variable> <14 independent variables> <dummy variables for region, product, country and year>, cluster(<identifier of each region-country-product triple>)
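A minimal sketch of how the cluster identifier for each region-country-product triple could be built. It assumes the identifier variables are named region, country and product (the names used in the xi commands above) and that the cluster variable is called pairid, as in the command shown in post #2. This is an illustration, not necessarily how the poster created the variable.

```stata
* Assumes the dataset described in post #1 is in memory.
* One integer id per region-country-product combination.
egen long pairid = group(region country product)

* Sanity check: the largest id equals the number of distinct triples.
summarize pairid, meanonly
display "distinct region-country-product triples: " r(max)
```

If the panel is balanced, this should report 22 x 35 x 120 = 92,400 triples, which over the 18 years from 1995 to 2012 gives the 1,663,200 observations mentioned above.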

The problem is that I started the estimation yesterday with the first step and Stata is still iterating (currently at iteration 245). Is this normal given the size of the dataset, or is there a problem with how I wrote the command or specified the model?

Thanks everybody
Giancarlo Carta

Join Date: Nov 2014

Posts: 13
#2

18 Nov 2014, 09:56

This is what I did and what Stata is doing:

ppml EXP2 lnyy lnpilprocapite lndistcity lnsupsup BORDER UE Euro Lang Landlo_dest Landlo_orig lnmount Autonomy Presence Riconoscimento cris
> is _OPR* _DDES* _PPRO*, cluster(pairid)
note: checking the existence of the estimates
WARNING: EXP2 has very large values, consider rescaling
WARNING: lnyy has very large values, consider rescaling or recentering
WARNING: lnpilprocapite has very large values, consider rescaling or recentering
WARNING: lnsupsup has very large values, consider rescaling or recentering
note: starting ppml estimation

Iteration 1: deviance = 3.74e+12
Iteration 2: deviance = 2.69e+12
Iteration 3: deviance = 2.52e+12
Iteration 4: deviance = 2.49e+12
Iteration 5: deviance = 2.49e+12
Iteration 6: deviance = 2.49e+12
Iteration 7: deviance = 2.49e+12
Iteration 8: deviance = 2.49e+12
Iteration 9: deviance = 2.49e+12
Iteration 10: deviance = 2.49e+12
Iteration 11: deviance = 2.49e+12
Iteration 12: deviance = 2.49e+12
Iteration 13: deviance = 2.49e+12
Iteration 14: deviance = 2.49e+12
Iteration 15: deviance = 2.49e+12

From that point on, the deviance keeps the same value.
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 3028
#3

18 Nov 2014, 12:29

Dear Giancarlo,

The first warning that you get is that your dependent variable has very large values. If you rescale it (say, divide it by 1e3 or 1e6), the problem may go away. On a side note, you need to think about whether you are asking too much from your data.

All the best,

Joao
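A minimal sketch of the rescaling suggested above, assuming EXP2 holds export values in levels as in post #2. The name EXP2_m is hypothetical, and "crisis" is read from the last regressor, which is wrapped across two lines in the log above.

```stata
* Assumes the dataset from post #2 is in memory.
* Express the dependent variable in millions instead of units.
generate double EXP2_m = EXP2/1e6
summarize EXP2 EXP2_m

* Re-estimate with the rescaled dependent variable.
* The /// line continuation works in do-files, not at the command prompt.
ppml EXP2_m lnyy lnpilprocapite lndistcity lnsupsup BORDER UE Euro Lang ///
    Landlo_dest Landlo_orig lnmount Autonomy Presence Riconoscimento crisis ///
    _OPR* _DDES* _PPRO*, cluster(pairid)
```

Because the model is exponential and includes a constant, dividing the dependent variable by 1e6 should leave the slope estimates unchanged and lower the constant by ln(1e6), which is about 13.8.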
Comment
Giancarlo Carta

Join Date: Nov 2014

Posts: 13
#4

18 Nov 2014, 13:00

Dear Joao,

Thank you for your reply. I will try rescaling the dependent variable (and, I suppose, the independent ones too?) and see what happens. I tried different types of regression to find the best way to estimate the model. Using ppml with fixed effects in this way was my professor's suggestion.

Anyway, I'm pretty new to Stata, so I have no idea how long such a process takes. If you can suggest a more suitable command or procedure, that would be more than welcome.

Thank you
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 3028
#5

18 Nov 2014, 14:03

Dear Giancarlo,

Rescaling the variables should help (note that the need to rescale is specific to Stata; with most other software, rescaling is not needed). With such a large model, estimation will always take some time.

Good luck with your work,

Joao
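For the regressors, the ppml warnings in post #2 mention recentering as an alternative to rescaling. A minimal sketch, assuming the three flagged variables are the ones to treat; the _c suffix is a hypothetical naming choice.

```stata
* Assumes the dataset from post #2 is in memory.
* Subtract the sample mean from each regressor flagged by ppml.
foreach v of varlist lnyy lnpilprocapite lnsupsup {
    summarize `v', meanonly
    generate double `v'_c = `v' - r(mean)
}
```

In a model with a constant, recentering a regressor in this way changes only the estimated constant, not the slope coefficients.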
Comment
Giancarlo Carta

Join Date: Nov 2014

Posts: 13
#6

18 Nov 2014, 14:07

Thank you very much. I will let you know when I try again.

Giancarlo
Comment
Giancarlo Carta

Join Date: Nov 2014

Posts: 13
#7

19 Nov 2014, 07:06

Thank you, Joao.

It works now. However, Stata tells me that my dependent variable has non-integer values (which is to be expected after dividing by 1e3). Is this a problem?

Thank you
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 3028
#8

19 Nov 2014, 10:19

No, not a problem at all; glad it worked!

Joao
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35810
#9

19 Nov 2014, 12:16

Missing reference here:

SJ-11-2 st0225 . . . . . . . . . . . . . . . poisson: Some convergence issues
(help ppml if installed) . . . J. M. C. Santos Silva and S. Tenreyro
Q2/11 SJ 11(2):207--212
provides improved Poisson regression by checking for the
existence of the estimates and providing two methods for
dropping regressors that cause nonexistence of estimates
Comment
Milad Aminizadeh

Join Date: Jan 2016

Posts: 10
#10

09 Jan 2016, 17:41

Hi all,

I have a question. I estimated a gravity model by PPML and by OLS. The p-value of the RESET test is 0.287 for OLS and 0.002 for PPML.

My result is different from the one in Santos Silva and Tenreyro's study. Why?

OLS Estimation:

test fit2=0

( 1) fit2 = 0

F( 1, 126) = 1.14
Prob > F = 0.2873

PPML Estimation:

test fit2=0

( 1) fit2 = 0

chi2( 1) = 9.31
Prob > chi2 = 0.0023

Do you think my result is wrong?
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 3028
#11

10 Jan 2016, 02:55

Dear Milad,

Without knowing which models you are estimating and what kind of data you have, it is impossible to comment on this result. Maybe you should start a new thread for your question.

Best wishes,

Joao
Comment
Milad Aminizadeh

Join Date: Jan 2016

Posts: 10
#12

10 Jan 2016, 19:44

Dear Joao,

Thank you for your reply.

My data is:

Dependent Variable: Export of Dates to EU countries in 2013
Exporters: 12 countries (Top Exporters such as Tunisia, Saudi Arabia, …)
Importers: 28 countries (European Union)
Year: 2013

ppml value lgdpx lgdpi lgdppx lgdppi ldis landli landlx, cluster(ldis)

note: checking the existence of the estimates

Number of regressors excluded to ensure that the estimates exist: 0
Number of observations excluded: 0

note: starting ppml estimation

Iteration 1: deviance = 430207
Iteration 2: deviance = 329435.5
Iteration 3: deviance = 314276.6
Iteration 4: deviance = 312864.6
Iteration 5: deviance = 312608.8
Iteration 6: deviance = 312569.2
Iteration 7: deviance = 312567.4
Iteration 8: deviance = 312567.4
Iteration 9: deviance = 312567.4

Number of parameters: 8
Number of observations: 334
Pseudo log-likelihood: -156873.22
R-squared: .5871964
Option strict is: off
                                 (Std. Err. adjusted for 172 clusters in ldis)
------------------------------------------------------------------------------
             |               Robust
       value |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       lgdpx |  -.8473654   .1630437    -5.20   0.000    -1.166925   -.5278056
       lgdpi |   1.025005   .1124659     9.11   0.000     .8045761    1.245434
      lgdppx |   .3310993   .1892837     1.75   0.080    -.0398899    .7020884
      lgdppi |   .2646335   .2307802     1.15   0.252    -.1876874    .7169544
        ldis |  -.6218436   .1669692    -3.72   0.000    -.9490972     -.29459
      landli |   .2435175   .3134024     0.78   0.437    -.3707399    .8577748
      landlx |   4.922072   .6066905     8.11   0.000      3.73298    6.111163
       _cons |   2.654003   1.778816     1.49   0.136    -.8324121    6.140418
------------------------------------------------------------------------------
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 3028
#13

11 Jan 2016, 00:31

This is a very atypical dataset because it surely does not have the zeros and the heteroskedasticity that characterize trade data and motivate the use of PPML. This, however, may explain why PPML has no advantage over OLS, but does not explain the superiority of OLS. Can you please show us the commands you used to perform the RESET tests and the OLS results?

Joao
Comment
Milad Aminizadeh

Join Date: Jan 2016

Posts: 10
#14

11 Jan 2016, 03:06

Dear Joao,

Yes, I can.

reg lvalue lgdpx lgdpi lgdppx lgdppi ldis landli landlx, cluster( ldis)

Linear regression                               Number of obs     =        174
                                                F(7, 126)         =      25.61
                                                Prob > F          =     0.0000
                                                R-squared         =     0.3298
                                                Root MSE          =      1.988

                                 (Std. Err. adjusted for 127 clusters in ldis)
------------------------------------------------------------------------------
             |               Robust
      lvalue |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       lgdpx |  -.2589429   .1604681    -1.61   0.109    -.5765046    .0586188
       lgdpi |   .8058141   .1025243     7.86   0.000     .6029215    1.008707
      lgdppx |   .0675892   .2106754     0.32   0.749    -.3493312    .4845096
      lgdppi |  -.0855848   .2723315    -0.31   0.754    -.6245209    .4533512
        ldis |   -.966287   .1909489    -5.06   0.000    -1.344169   -.5884047
      landli |   .4627078    .472598     0.98   0.329    -.4725497    1.397965
      landlx |   2.135198   .2848611     7.50   0.000     1.571466     2.69893
       _cons |   6.874997   1.926631     3.57   0.001     3.062251    10.68774
------------------------------------------------------------------------------

RESET test:

. predict fit, xb

. gen fit2=fit^2

. reg lvalue lgdpx lgdpi lgdppx lgdppi ldis landli landlx fit2, cluster( ldis)

. test fit2=0

( 1) fit2 = 0

F( 1, 126) = 1.14
Prob > F = 0.2873
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 3028
#15

11 Jan 2016, 03:36

Thanks. You have more zeros than I expected, so OLS is clearly not a good choice. Can you show us the code used for the RESET in PPML?

Cheers,

Joao
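For reference, a sketch of how the two points raised here could be checked, assuming the dataset from post #12 is in memory with the variable names shown there. It first counts the observations lost by taking logs, then runs a RESET after ppml by adding the squared linear index, mirroring the OLS steps in post #14. The names fit_p and fit2_p are hypothetical, and this is not necessarily the code used to obtain the PPML result in post #10.

```stata
* How many observations does the log specification lose?
* (PPML used 334 observations, OLS only 174.)
count if value == 0
count if missing(lvalue)

* RESET after ppml: add the squared fitted linear index and test it.
ppml value lgdpx lgdpi lgdppx lgdppi ldis landli landlx, cluster(ldis)
predict double fit_p, xb
generate double fit2_p = fit_p^2
ppml value lgdpx lgdpi lgdppx lgdppi ldis landli landlx fit2_p, cluster(ldis)
test fit2_p = 0
```

The 160 observations that separate the two samples are presumably those with zero exports, for which the log of the dependent variable is undefined.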
Comment
