Possible misspecification in gravity model (PPML, RESET test)

VitoStefano Bramante

Join Date: Dec 2015

Posts: 5
#1

23 Dec 2015, 14:18

Dear all,
Brief overview: I'm trying to estimate the impact of intrawar presence (1), interwar presence (2) and economic sanctions (3) on exports, using a gravity model.
When it comes to estimating the gravity equation, PPML is the new benchmark. All previous studies on the topic, though, use OLS, so it might be interesting to see whether conventional wisdom holds under this new approach. That is why, before making any inference on my three main variables of interest, I am running a sensitivity analysis to compare different OLS specifications with different PPML specifications.

What's the problem ?
My main concern is about the PPML with time-varying country dummies specification.

To be more specific, I use dummies for every origin country and every destination country on a three-year basis (following a previous paper by Ruiz and Villarubia, which also uses OLS, not PPML). For example, Germany has 14 dummies in total: Germany as exporter for the years 1989-1991, Germany as importer for the years 1989-1991, Germany as exporter for the years 1992-1994, and so on.
I need to use three-year country dummies because my dataset covers 89 countries (92% of world exports) over a 21-year time span, from 1989 to 2009, resulting in a balanced panel of 164,472 observations. Yearly dummies would require 89 x 21 x 2 = 3,738 dummies, far too many for the computational power at my disposal.
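
The counts above can be checked with a few lines of arithmetic. This is only a sketch: the scalar names are hypothetical, and the only inputs are the panel dimensions stated in the post (89 countries, 21 years, three-year periods).

```stata
* Panel dimensions taken from the post; scalar names are illustrative only
scalar n_countries = 89
scalar n_years     = 21
scalar n_periods   = n_years/3        // seven three-year periods

display "Ordered country pairs (dyads):    " n_countries*(n_countries - 1)
display "Observations in balanced panel:   " n_countries*(n_countries - 1)*n_years
display "Yearly exporter/importer dummies: " n_countries*n_years*2
display "Three-year dummies:               " n_countries*n_periods*2
```

The number of ordered pairs, 89 x 88 = 7,832, is also the number of dyad clusters one would expect to see in the estimation output.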

What's my Stata code ?

I create the dummies using

Code:

* where year3 is categorical from 1 to 7 for the years
* origin is the origin country id and destination is the destination country id
xi, prefix(_G) noomit i.origin*i.year3 i.destination*i.year3

I drop the time-invariant country dummies and the time dummies automatically created by the previous code, and I run PPML:

Code:

drop _Gorigin* _Gyear* _Gdestin*
ppml export2 lndistwces contig comlang_off colony _G* if year < 2010, cluster(dyad)
* Where: export2 is exports in billions of 2005 US$ (to allow a quicker computation), from Feenstra/UN Comtrade
* lndistwces is weighted distance from CEPII
* contig is 1 for contiguity, from CEPII
* comlang_off is 1 for a common language, from CEPII
* colony is 1 for previous colonial ties, from CEPII

Then I run a RESET test:

Code:

predict XB, xb
gen XB2 = XB^2
quietly ppml export2 lndistwces contig comlang_off colony XB2 _G* if year < 2010, keep cluster(dyad)
test XB2 = 0

Results are as follows

Code:

Number of parameters: 1243
Number of observations: 164472
Pseudo log-likelihood: -61261.918
R-squared: .91971331
Option strict is: off
                                 (Std. Err. adjusted for 7,832 clusters in dyad)
--------------------------------------------------------------------------------
               |               Robust
       export2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
    lndistwces |  -.7634372   .0258945   -29.48   0.000    -.8141894    -.712685
        contig |   .3082213   .0659453     4.67   0.000     .1789708    .4374718
   comlang_off |   .2199701   .0614091     3.58   0.000     .0996105    .3403298
        colony |  -.0989539   .1018637    -0.97   0.331     -.298603    .1006952

test XB2 = 0

 ( 1)  XB2 = 0

           chi2(  1) =    6.23
         Prob > chi2 =    0.0125

From a qualitative point of view results are in line with previous studies, but the RESET test p-value is a bit too low.

My plan is to run the same model including my variables of interest (intrawar, interwar, economic sanctions), and then to repeat everything on subsets for homogeneous products, reference-priced products and differentiated products, following the Rauch classification, to see which products are more sensitive to unstable conditions.

My questions are:

Can the RESET test alone undermine the reliability of my results?
Can the RESET tests of the other models undermine the reliability of those results too?
Am I overthinking this?

Any comment on the code, on the RESET test in particular and on the project in general, would be much appreciated.

Last edited by VitoStefano Bramante; 23 Dec 2015, 14:57.
Tags: gravity model, PPML, reset test
Joao Santos Silva

Join Date: Apr 2014

Posts: 3014
#2

24 Dec 2015, 10:30

Hi there,

I am glad you say that PPML is the new benchmark.

I do not see many reasons to worry; your model passes the RESET test at 1%. Given your sample size, the number of regressors, and the fact that you are using the 3-year dummies, I think the result is quite reassuring.
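
As a quick check of what passing at 1% means here, the p-value can be recomputed from the reported statistic. This is a minimal sketch: the scalar name is hypothetical, and the input 6.23 is the rounded chi2(1) value from the output in post #1, so the result should be close to, but not necessarily identical with, the reported 0.0125.

```stata
* Upper-tail probability of a chi-squared(1) variable at the reported statistic
scalar p_reset = chi2tail(1, 6.23)

display "RESET p-value:            " %6.4f p_reset
display "Rejected at 5% (1 = yes): " (p_reset < 0.05)
display "Rejected at 1% (1 = yes): " (p_reset < 0.01)
```

The null of correct specification is rejected at the 5% level but not at the 1% level.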

It may well be the case that some of the models you will estimate do not pass the RESET but the gravity equation will have to be an exponential model and so you cannot change that. You may, however, consider changing the set of regressors you are using, for example by including cross products of your regressors. Anyway, that will have to be done on a case-by-case basis.
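
The idea of adding cross products and re-running the RESET can be sketched on simulated data. This is only an illustration under stated assumptions: the variables x1, x2 and y are hypothetical, the data are simulated, and Stata's official poisson command with robust standard errors stands in for ppml so that the block runs without user-written commands.

```stata
clear
set seed 12345
set obs 2000
gen double x1 = rnormal()
gen double x2 = rnormal()
* Simulated exponential model whose true mean includes the cross product x1*x2
gen double mu = exp(0.5 + 0.3*x1 - 0.2*x2 + 0.4*x1*x2)
gen y = rpoisson(mu)

* Model A: cross product omitted, then RESET on the squared linear index
quietly poisson y x1 x2, vce(robust)
predict double xb_a, xb
gen double xb_a2 = xb_a^2
quietly poisson y x1 x2 xb_a2, vce(robust)
test xb_a2 = 0

* Model B: cross product included, then the same RESET
gen double x1x2 = x1*x2
quietly poisson y x1 x2 x1x2, vce(robust)
predict double xb_b, xb
gen double xb_b2 = xb_b^2
quietly poisson y x1 x2 x1x2 xb_b2, vce(robust)
test xb_b2 = 0
```

Comparing the two test results shows how the RESET reacts to a change in the set of regressors while the exponential form of the model stays fixed.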

Best wishes,

Joao
VitoStefano Bramante

Join Date: Dec 2015

Posts: 5
#3

24 Dec 2015, 12:39

Thank you very much, Prof. Santos Silva, reassuring indeed. To make my case, I ran a sensitivity analysis showing results for pooled data (no dummies), year dummies plus time-invariant country dummies, and time-varying country dummies, for both OLS and PPML. OLS always fails the RESET test, while PPML passes it in the first two specifications, with p-values well above 0.1, so the null is not rejected even at the 10% level.

I would like to ask you also if in this case a simple ratio between exports and predicted export (predicted by ppml using "predict variable, mu") is enough to have a measure of trade potential.
It should be something like this:

Code:

* after PPML estimation
predict predicted_export, mu
gen export_potential = export2/predicted_export

regards
vito
Joao Santos Silva

Join Date: Apr 2014

Posts: 3014
#4

24 Dec 2015, 18:22

Dear Vito,

Indeed that gives you the ratio between exports and predicted export but I am not entirely sure what you mean by trade potential.
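
A minimal sketch of such a ratio on simulated data follows. Everything in it is an assumption made for illustration: the variables trade and lndist are hypothetical, the data are simulated, and glm with a Poisson family and log link is used in place of ppml so that the block runs with official commands only.

```stata
clear
set seed 2015
set obs 1000
gen double lndist = rnormal(8, 1)                // hypothetical log distance
gen trade = rpoisson(exp(10 - 0.8*lndist))       // hypothetical trade flow

* Poisson pseudo-maximum likelihood via glm; predict with option mu gives fitted values
quietly glm trade lndist, family(poisson) link(log) vce(robust)
predict double predicted_trade, mu
gen double trade_ratio = trade/predicted_trade

* With a constant in the model, fitted and actual flows have the same total
quietly summarize trade
scalar total_actual = r(sum)
quietly summarize predicted_trade
scalar total_fitted = r(sum)
display "Total actual: " total_actual "   Total fitted: " total_fitted
summarize trade_ratio, detail
```

A ratio above one marks a flow larger than the model predicts, and a ratio below one marks a flow smaller than predicted.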

Joao
VitoStefano Bramante

Join Date: Dec 2015

Posts: 5
#5

25 Dec 2015, 11:00

Dear professor Santos Silva,
by trade potential I mean what De Benedictis and Vicarelli define as (apologies for the size of the picture)

Of course, in my case exports are not in logs, so a simple ratio between actual and predicted exports should do the trick.
As I understand it (I might be very wrong, though), this should be somewhat similar to what you call overtrading/undertrading, but the code you provide here should be applied to xpqml only, not to ppml.
Thanks for your patience,
vito
Joao Santos Silva

Join Date: Apr 2014

Posts: 3014
#6

25 Dec 2015, 11:24

I see what you mean. Yes, the concept is similar to the overtrading/undertrading, but that is more like a residual.

All best wishes,

Joao
Harry Stead

Join Date: Mar 2018

Posts: 7
#7

09 Apr 2018, 06:37

Hi Joao,

I am currently writing my final thesis, in which I am investigating the effect of the European Monetary Union on bilateral trade flows within Europe.
I first use OLS to estimate a very basic specification, as can be seen below:
xtreg lexp1to2 ldist lgdp1 lgdp2 lpop1 lpop2 border comlang colony landl emu dyear*, vce(robust)

I then re-run this regression including importer-year and exporter-year fixed effects.
I then have a final regression which includes importer-year, exporter-year and dyadic fixed effects.
After I have run these regressions, I redo them using PPML, as suggested in your work in 2006.

In order to identify which models are misspecified, I was going to run the Ramsey RESET test on all the regressions. However, I have been advised that this test usually suggests the model is misspecified when pair fixed effects are used. Therefore I have been recommended to look at the Mamu test, as suggested in your work in 2006. I have read your work, as well as that by Head and Mayer in the Gravity equations textbook, but I am still confused as to how I should implement this test. Please, would you be able to shed some light on how I can implement this in Stata?

Thank you in advance for any advice you may be able to provide!

Kind regards, Harry
Joao Santos Silva

Join Date: Apr 2014

Posts: 3014
#8

09 Apr 2018, 13:37

Dear Harry,

The "Mamu" test (actually it is Park's test) is not particularly relevant in this context. If you want to implement it anyway, we describe it in detail in the "Log of Gravity" paper.

The RESET test may also not give you what you want. In a way, the more variables your model has, the more likely the RESET is to reject the null. So, by including more fixed effects you are making the RESET more demanding. That is, if the model without fixed effects passes the RESET and the model with FE doesn't, that does not mean you should prefer the model without FE. In other words, the RESET cannot be used to choose between models with different sets of regressors.

In any case, we know that if the models estimated by OLS are correctly specified their results should be similar to those obtained by Poisson. If the results are different you should prefer Poisson, and I would add that if the results are similar you should still prefer Poisson ;-)
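
A small simulation may help to see the first point. This is a sketch, not a result from the thread: the variables x and y are hypothetical, the log-linear model is correctly specified with a multiplicative error that is independent of x, and the official poisson command with robust standard errors is used in place of ppml.

```stata
clear
set seed 2018
set obs 5000
gen double x = rnormal()
* Multiplicative error independent of x, so the log-linear model is correct
gen double y = exp(1 + 0.5*x + rnormal(0, 0.5))
gen double lny = ln(y)

* OLS on the log of y
quietly regress lny x, vce(robust)
scalar b_ols = _b[x]

* Poisson pseudo-maximum likelihood on y in levels
quietly poisson y x, vce(robust)
scalar b_ppml = _b[x]

display "OLS slope: " %6.4f b_ols "   Poisson slope: " %6.4f b_ppml
```

Under these assumptions both slopes should be close to the true value of 0.5; a large gap between them in real data is a sign that the log-linear OLS model is not correctly specified.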

Best wishes,

Joao
Harry Stead

Join Date: Mar 2018

Posts: 7
#9

10 Apr 2018, 05:06

Dear Joao,

Thank you for your reply, it is very much appreciated!

So, just to clear things up for myself:
Neither the Ramsey RESET test nor the Mamu test is relevant in my context for testing model misspecification.
Therefore, in my context, a diagnostic test for model misspecification is not required?

I have quite different results for PPML and OLS. Should I therefore just prefer the PPML estimates, for the various reasons discussed in your paper in 2006?

Kind regards, Harry Stead
Joao Santos Silva

Join Date: Apr 2014

Posts: 3014
#10

10 Apr 2018, 13:49

It is not that the RESET is irrelevant but that the RESET cannot be used to choose between specifications with different sets of fixed effects. If you are doing an undergraduate thesis, I suggest that you do not worry about the specification tests and focus on making sure that you understand well what you are doing.

Best wishes,

Joao
PS: Yes, just go with PPML, you cannot go wrong ;-)
Harry Stead

Join Date: Mar 2018

Posts: 7
#11

11 Apr 2018, 08:29

Okay, thank you for clearing that up for me!

Kind regards, Harry Stead

Isabel Cour

Join Date: Sep 2017
Posts: 28

#12

06 May 2018, 07:01

Dear Joao,

Using a gravity framework, I am estimating a panel with one importer country and 6 exporter countries over 16 years. The problem is that even after I rescale the GDP variables I still get the WARNING telling me to rescale again. I rescaled using

Code:

* gdp_exp
gen gdp_2 = (GDP_e/1000000)
* gdp_imp
gen gdp_1 = (GDP_m/1000000)

Then I take the log of each one.

Code:

note: checking the existence of the estimates
WARNING: lngdp_1 has very large values, consider rescaling or recentering
WARNING: lngdp_2 has very large values, consider rescaling  or recentering
note: starting ppml estimation
note: lnimpo has noninteger values

Iteration 1:   deviance =  .7475925
Iteration 2:   deviance =  .7456397
Iteration 3:   deviance =  .7456397

Number of parameters: 22
Number of observations: 90
Number of observations dropped: 0
Pseudo log-likelihood: -221.95536
R-squared: .94337307
------------------------------------------------------------------------------
             |               Robust
      lnimpo |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    lnintus2 |   .0245006   .0112253     2.18   0.029     .0024993    .0465019
     lngdp_2 |   .0206733   .0296131     0.70   0.485    -.0373673     .078714
    lntariff |  -.0245235   .0063309    -3.87   0.000    -.0369319   -.0121151
  exporter_1 |   .0417124   .0714348     0.58   0.559    -.0982972     .181722
  exporter_2 |   .0600812   .1112002     0.54   0.589    -.1578672    .2780297
  exporter_3 |   .0512052   .0475159     1.08   0.281    -.0419242    .1443345
  exporter_4 |  -.0769288   .0552426    -1.39   0.164    -.1852023    .0313446
  exporter_5 |    .006909   .0365235     0.19   0.850    -.0646758    .0784937
      year_3 |  -.0235387   .0191046    -1.23   0.218    -.0609831    .0139057
      year_4 |  -.0152107   .0184645    -0.82   0.410    -.0514005    .0209791
      year_5 |  -.0008343   .0136604    -0.06   0.951    -.0276082    .0259396
      year_6 |  -.0048332    .010407    -0.46   0.642    -.0252304    .0155641
      year_7 |   -.007946   .0081831    -0.97   0.332    -.0239845    .0080925
      year_8 |   .0003118   .0090628     0.03   0.973     -.017451    .0180746
      year_9 |  -.0022143   .0074474    -0.30   0.766    -.0168109    .0123823
     year_10 |  -.0101576   .0082125    -1.24   0.216    -.0262539    .0059387
     year_11 |   .0015602   .0106269     0.15   0.883    -.0192683    .0223886
     year_12 |  -.0095228   .0101023    -0.94   0.346    -.0293229    .0102774
     year_13 |   .0038558   .0119044     0.32   0.746    -.0194764     .027188
     year_14 |   .0020576   .0120742     0.17   0.865    -.0216073    .0257225
     year_15 |   .0024177   .0122576     0.20   0.844    -.0216068    .0264422
       _cons |    2.77543   .2830378     9.81   0.000     2.220686    3.330174
------------------------------------------------------------------------------
Number of regressors dropped to ensure that the estimates exist: 7
Dropped variables:  lngdp_1 lndist exporter_6 importer_1 year_1 year_2 year_16
Option strict is off

Any comments are welcome, thank you in advance,

Regards,


Joao Santos Silva

Join Date: Apr 2014

Posts: 3014
#13

06 May 2018, 14:41

Dear Isabel,

The warning is displayed when regressors have values larger than log of one million (in absolute value) so you can get summary statistics of the GDP variables and decide on the appropriate scale. Anyway, if you get convergence (which you do) you can ignore the warning.
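
To see what the rescaling in post #12 does to the logged variables, here is a minimal sketch. The GDP values are hypothetical and only the variable name GDP_e is borrowed from the post. Dividing by one million before taking logs shifts the logged variable by the constant ln(1000000), about 13.8, and leaves its variation unchanged.

```stata
clear
* Hypothetical GDP levels in US$, for illustration only
input double GDP_e
2.5e11
1.2e12
4.0e12
end

gen double lngdp_raw    = ln(GDP_e)
gen double lngdp_scaled = ln(GDP_e/1000000)
gen double shift        = lngdp_raw - lngdp_scaled

list, clean
display "ln(1000000) = " ln(1000000)

* Summary statistics show the magnitudes actually passed to the estimator
summarize lngdp_raw lngdp_scaled
```

Because the shift is the same for every observation, it changes only the constant of the model and not the slope on log GDP.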

Best wishes,

Joao
Isabel Cour

Join Date: Sep 2017

Posts: 28
#14

06 May 2018, 15:17

Thank you very much, Joao, for the answer. I have one more question: I ran pooled OLS and then PPML. How can I check the PPML results? When I compare the two, PPML gives me better results so far.
Isabel Cour

Join Date: Sep 2017

Posts: 28
#15

06 May 2018, 15:21

Does PPML address endogeneity issues?
