Help with gravity equation estimated using PPML (Stata 14.2)

Ridwan Sheikh

Join Date: Apr 2021

Posts: 172
#16

21 Feb 2022, 10:32

Dear Joao Santos Silva

My model is Y = exp(b₁ ln X1 + b₂ ln X2 + b₃ ln X3 + a₁ Z1 + a₂ Z2 + a₃ Z3 + k_i + k_e) e_it

I am using ppmlhdfe because it is a large unbalanced panel with many sector, country (exporter, importer) and time-period fixed effects.

The RHS variables X1, X2, X3 are in logarithms, whereas Z1, Z2, Z3 are in levels (untransformed). Also, k_i and k_e are importer and exporter fixed effects and e_it is the error term.
(1) From the discussion above, I learned that b₁, b₂, b₃ are elasticities and a₁, a₂, a₃ are semi-elasticities. Have I understood that correctly, and is it still valid with the ppmlhdfe estimator?
(2) If so, how do we get an elasticity interpretation for the coefficients of Z1, Z2, Z3? Do we need a transformation such as [exp(a₁) - 1] x 100, and similarly for the coefficients of Z2 and Z3, to obtain the elasticities of Z1, Z2, Z3 and interpret them in percentage terms?

How do I perform the RESET test?
I have learned from your web page that I need to do something like this in my setting with the ppmlhdfe estimator.
* Run the following ppmlhdfe estimator

Code:

ppmlhdfe Y lnX1 lnX2 lnX3 Z1 Z2 Z3 k_i k_e, robust

(I am also including the importer and exporter fixed effects, k_i and k_e.)

* Get fitted values (of the linear index, not of Y)

Code:

predict fit, xb

(3) Does the above code automatically obtain the fitted values using all the regressors, so that we do not need to compute them manually for each one, like predict fit, X1; predict fit, X2; predict fit, X3; etc.?
* Square the fitted values

Code:

gen fit2=fit^2

* Estimate the model with the additional regressor

Code:

ppmlhdfe Y lnX1 lnX2 lnX3 Z1 Z2 Z3 k_i k_e fit2, robust

* Test the significance of the additional regressor (this is equivalent to a t-test on fit2)

Code:

test fit2=0

Please get back to me; I would be very thankful. Regards
Joao Santos Silva

Join Date: Apr 2014

Posts: 3021
#17

22 Feb 2022, 00:43

Dear Ridwan Sheikh,

1) That is correct; the interpretation of the parameters depends on the model, not on the estimator.
2) You need to do that transformation to have the exact estimated effect in percentage terms, but that is still a semi-elasticity. The elasticities for these variables are not constant, so it is generally better to look just at the semi-elasticities.
3) What you are doing is correct, just make sure you absorb the fixed effects.
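
As an illustration of point 2, here is a minimal sketch in Python (not Stata) of the exact percentage effect implied by a coefficient on a regressor in levels, and of why the corresponding elasticity is not constant. The function names and the coefficient values are hypothetical and chosen only for illustration; they are not estimates from this thread.

```python
import math


def pct_effect(a, delta=1.0):
    """Exact % change in E[Y | X] when a levels regressor Z rises by `delta`.

    With E[Y | X] = exp(... + a*Z + ...), raising Z by delta multiplies the
    conditional mean by exp(a * delta).
    """
    return (math.exp(a * delta) - 1.0) * 100.0


def elasticity_at(a, z):
    """Elasticity of E[Y | X] with respect to Z, evaluated at Z = z.

    d ln E[Y | X] / d ln Z = a * z, so it changes with the level of Z.
    """
    return a * z


# Hypothetical coefficient values, not estimates from this thread.
for a in (0.05, 0.50, -0.30):
    print(f"a = {a:+.2f}: 100*a = {100 * a:+.1f}%, exact = {pct_effect(a):+.1f}%")

# The elasticity for a levels regressor depends on where it is evaluated.
print(elasticity_at(0.50, 1.0), elasticity_at(0.50, 2.0))
```

For small coefficients 100*a and the exact figure are close; the gap grows with the size of the coefficient, which is why the transformation matters mostly for large estimates.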

Best wishes,

Joao
Ridwan Sheikh

Join Date: Apr 2021

Posts: 172
#18

22 Feb 2022, 10:35

Thanks, Joao Santos Silva.
This was helpful.

RAMSEY RESET

Just a few more clarifications (sorry):
1) Am I doing it right if I write the following lines of code?

Code:

ppmlhdfe Y lnX1 lnX2 lnX3 Z1 Z2 Z3 k_i k_e, absorb(ki ke) robust

Code:

predict fit, xb

Code:

gen fit2=fit^2

Code:

ppmlhdfe Y lnX1 lnX2 lnX3 Z1 Z2 Z3 k_i k_e fit2, absorb(ki ke) robust

Code:

test fit2=0

2) What if we use standard errors clustered by distance, cluster(distance), instead of robust? Is the test still valid?

3) I am using the ppmlhdfe estimator on sectoral-level trade data (estimating the coefficients sector by sector). Do I also need to run the Ramsey RESET test sector by sector?

Thanks and regards
(Ridwan)
Joao Santos Silva

Join Date: Apr 2014

Posts: 3021
#19

23 Feb 2022, 03:02

Dear Ridwan Sheikh,

1) The fixed effects should not be included as regressors; they should be absorbed.

2) You should cluster!

3) That is up to you, but if you do, keep in mind that you are performing multiple tests and consider correcting for that.
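
One common way to correct for multiple tests is Holm's step-down adjustment of the p-values; no particular method is named in this thread, so this choice is an assumption. Below is a minimal sketch in Python (not Stata). The sector names and RESET p-values are made up for illustration.

```python
from typing import Dict


def holm_adjust(pvalues: Dict[str, float]) -> Dict[str, float]:
    """Holm step-down adjusted p-values (controls the family-wise error rate)."""
    m = len(pvalues)
    # Sort tests from smallest to largest raw p-value.
    ordered = sorted(pvalues.items(), key=lambda kv: kv[1])
    adjusted = {}
    running_max = 0.0
    for rank, (name, p) in enumerate(ordered):
        # The k-th smallest p-value (rank = k - 1) is multiplied by m - k + 1.
        adj = min(1.0, (m - rank) * p)
        # Adjusted p-values must not decrease along the ordering.
        running_max = max(running_max, adj)
        adjusted[name] = running_max
    return adjusted


# Hypothetical sector-by-sector RESET p-values, not results from this thread.
reset_p = {"textiles": 0.004, "machinery": 0.030, "chemicals": 0.020, "food": 0.600}
for sector, p_adj in holm_adjust(reset_p).items():
    print(f"{sector}: raw p = {reset_p[sector]:.3f}, Holm-adjusted p = {p_adj:.3f}")
```

A sector's specification would then be rejected only if its adjusted p-value falls below the chosen significance level.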

Best wishes,

Joao
Ridwan Sheikh

Join Date: Apr 2021

Posts: 172
#20

23 Feb 2022, 17:22

Thank you very much, Joao Santos Silva.
This was helpful.
Regards,
(Ridwan)
Ridwan Sheikh

Join Date: Apr 2021

Posts: 172
#21

31 Mar 2022, 01:01

Dear Joao Santos Silva
I was reading your discussion paper, "The Log of Gravity at 15". In that paper you discuss the incidental parameter problem that may arise in panel data with many origin and destination fixed effects (with T fixed and N growing without bound). In section 3.2 you further explain that PPML is immune to the incidental parameter problem, but that we cannot account for clustering because of the incidental parameter problem, and that the standard practice is to cluster by country pair (footnote 8, section 3.2).
In my case I use the Stata command ppmlhdfe with the absorb() option, but I am not clustering by pair_id; instead I cluster by distance. My understanding is that distance is a pair identifier (a dyadic variable defined by country pairs), so clustering by it should be the same as clustering by pair_id.

Is that the right approach, or am I doing it wrong?

Thanks and regards,
(Ridwan)
Joao Santos Silva

Join Date: Apr 2014

Posts: 3021
#22

31 Mar 2022, 04:15

Dear Ridwan Sheikh,

Clustering by distance will be equivalent to clustering by pair_id as long as no two pairs have the same distance.
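
That condition can be checked directly in the data. Below is a minimal sketch in Python (not Stata), assuming each observation is available as an (exporter, importer, distance) tuple; the country codes and distances are made up for illustration. Note that if pair_id is directional and the distance measure is symmetric, the two directions of a pair will share a distance.

```python
from collections import defaultdict


def pairs_sharing_distance(rows):
    """Return {distance: set of pairs} for distances used by more than one pair."""
    by_distance = defaultdict(set)
    for exporter, importer, distance in rows:
        by_distance[distance].add((exporter, importer))
    return {d: pairs for d, pairs in by_distance.items() if len(pairs) > 1}


# Hypothetical observations (exporter, importer, distance); values are made up.
rows = [
    ("BGD", "IND", 1425.0),
    ("BGD", "IND", 1425.0),   # same pair observed in another year
    ("BGD", "USA", 12700.0),
    ("IND", "BGD", 1425.0),   # reverse direction with the same distance
]
clashes = pairs_sharing_distance(rows)
print(clashes)
```

An empty result means distance identifies the pairs uniquely, so the two clustering choices coincide; any entry lists pairs that clustering by distance would merge into a single cluster.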

Best wishes,

Joao
Ridwan Sheikh

Join Date: Apr 2021

Posts: 172
#23

01 Apr 2022, 22:40

Thanks Joao Santos Silva
This was greatly helpful.
Ebru Aricioglu

Join Date: Jan 2023

Posts: 2
#24

11 Apr 2023, 09:01

Dear all,
I want to estimate the gravity equation with ppml_fe_bias, but Stata gives this warning:
"note: because of the size of the data, an approximation will be used to compute the adjusted variance. Use the -exact- option if you wish to compute the variance exactly.
analyticalbiascorrection(): 3900 unable to allocate real <tmp>[2062581,59]
<istmt>: - function returned error"
How can I fix it? Thanks in advance for your help
Ebru
Joao Santos Silva

Join Date: Apr 2014

Posts: 3021
#25

11 Apr 2023, 09:14

Dear Ebru Aricioglu,

Please check that your Stata is up-to-date, and that you have the most recent version of the command.

Best wishes,

Joao

Tahmid Labib

Join Date: Feb 2023
Posts: 2

#26

10 Jun 2023, 10:45

Dear Joao Santos Silva,

I am currently conducting a gravity model-based analysis on the trade of Bangladesh. The panel dataset I am using covers the period from 2011 to 2021 and includes export and import data for the top 20 trading partners, which account for approximately 88% of Bangladesh's total trade.

For the export side analysis, I have specified the model as follows:

Code:

ppml export_bd ln_distance ln_tariff_p_applied lngdp_d lngdp_o common_language

Code:

ppml export_bd ln_distance ln_tariff_p_cf lngdp_d lngdp_o common_language

Code:

ppml export_bd ln_distance ln_tariff_p_applied lngdp_d lngdp_o common_language contiguity landlocked_d island_d

Code:

ppml export_bd ln_distance ln_tariff_p_cf lngdp_d lngdp_o common_language contiguity landlocked_d island_d

Code:

ppmlhdfe export_bd ln_distance ln_tariff_p_applied lngdp_d lngdp_o common_language contiguity landlocked_d island_d, abs(iso3_d)

Code:

ppmlhdfe export_bd ln_distance ln_tariff_p_cf lngdp_d lngdp_o common_language contiguity landlocked_d island_d, abs(iso3_d)

However, none of the coefficients in the export analysis are statistically significant, and they exhibit the wrong sign. Additionally, the value of ln_tariff_p_cf should be higher than ln_tariff_p_applied. Interestingly, after controlling for partner fixed effects, the coefficients become statistically significant but still show the wrong sign. The results are:

Code:

 
                                  (1)            (2)            (3)            (4)            (5)            (6)
                            export_bd      export_bd      export_bd      export_bd      export_bd      export_bd

ln_distance                  1.131***       1.127***       1.441***       1.438***
                               (.074)         (.075)         (.082)         (.082)
ln_tariff_p_app~d               -.819                         -.464                         .433*
                               (.736)                        (.538)                        (.233)
lngdp_d                       .507***        .508***        .493***        .496***          1.014          1.012
                               (.043)         (.042)          (.05)         (.049)         (.968)         (.968)
lngdp_o                         .566*          .564*          .584*          .584*            .45           .451
                               (.307)         (.307)         (.308)         (.308)         (.424)         (.424)
common_language              -.646***       -.653***       -.845***       -.848***
                               (.114)         (.116)         (.112)         (.115)
ln_tariff_p_cf                                 -.674                         -.594                        .434**
                                              (.541)                        (.485)                        (.209)
contiguity                                                 1.547***       1.542***
                                                             (.165)         (.165)
landlocked_d                                                  -.069          -.067
                                                             (.175)         (.174)
island_d                                                    .466***        .481***
                                                             (.139)         (.141)
_cons                      -22.629***     -22.571***     -25.506***     -25.543***        -24.139          -24.1
                              (8.513)        (8.516)        (8.556)        (8.557)       (21.719)       (21.717)

Observations                    18589          18589          18589          18589          18589          18589
Pseudo R²                          .z             .z             .z             .z           .172           .172
Scenario                     Baseline Counterfactual       Baseline Counterfactual       Baseline Counterfactual
Partner Fixed Effect               No             No             No             No            Yes            Yes

Standard errors are in parentheses
*** p<.01, ** p<.05, * p<.1

In the baseline scenario, I use the current applied tariff rate (ln_tariff_p_applied), while the counterfactual scenario includes a hypothetical tariff rate (ln_tariff_p_cf) that is higher than the applied tariff rate.

On the other hand, when I model the import side of Bangladesh using the specified specifications, the gravity holds, and the results are statistically significant with the expected sign (in this case, ln_tariff_p_cf should be lower than ln_tariff_p_applied).

Code:

 
                                  (1)            (2)            (3)            (4)            (5)            (6)
                            import_bd      import_bd      import_bd      import_bd      import_bd      import_bd

ln_distance                    -.8***       -.865***      -1.051***      -1.039***
                               (.047)         (.047)         (.059)          (.06)
ln_tariff_bd                -3.832***                     -4.747***                      -4.89***
                               (.512)                        (.517)                        (.527)
lngdp_d                       .695***        .708***         .77***        .773***           .584            .77
                               (.039)         (.038)         (.035)         (.035)          (.51)         (.504)
lngdp_o                       .567***        .652***        .592***        .642***         .722**         .652**
                               (.202)         (.199)         (.198)         (.196)         (.289)         (.291)
ln_tariff_bd_cf                                -.021                       -.07***                      -.078***
                                              (.014)                        (.015)                        (.017)
common_language                                             .199***        .245***
                                                             (.061)         (.062)
contiguity                                                 -.954***       -.775***
                                                              (.16)         (.155)
landlocked_d                                               -.579***        -.56***
                                                             (.156)         (.157)
island_d                                                   -.316***       -.559***
                                                             (.111)         (.124)
_cons                       -11.535**      -14.013**      -12.136**      -13.887**       -18.656*      -22.278**
                              (5.679)        (5.592)        (5.546)        (5.512)        (9.883)        (9.721)

Observations                    15636          15636          15636          15636          15636          15636
Pseudo R²                          .z             .z             .z             .z           .322           .304
Scenario                     Baseline Counterfactual       Baseline Counterfactual       Baseline Counterfactual
Partner Fixed Effect               No             No             No             No            Yes            Yes

Standard errors are in parentheses
*** p<.01, ** p<.05, * p<.1

I have come up with a few explanations for this phenomenon. Firstly, Bangladesh's exports are heavily concentrated in regions where they benefit from GSP privileges and duty-free quota access. As a result, the current gravity model may not adequately capture the influence of distance and tariffs on these specific trade patterns. Moreover, a significant portion of Bangladesh's total exports, approximately 83%, relies on just three products, which could contribute to the observed results.

In contrast, the imports of Bangladesh exhibit higher diversification in terms of both product types and origin countries. Additionally, Bangladesh's market is subject to various types of tariffs, indicating a higher degree of protectionism. Consequently, the gravity model performs better when analyzing imports, as these factors align with the model's assumptions and expectations.

I would appreciate your opinion on my explanation and whether you have any other insights or suggestions regarding this issue.

Thank you for your attention.

Best regards,
Tahmid

Last edited by Tahmid Labib; 10 Jun 2023, 10:49.


Joao Santos Silva

Join Date: Apr 2014

Posts: 3021
#27

12 Jun 2023, 03:06

Dear Tahmid Labib,

You know much more about this particular case than I do, but one thing that I would notice is that your sample, by focusing on the main trading partners, is not representative. Therefore your results are difficult to generalize.

Best wishes,

Joao
Tahmid Labib

Join Date: Feb 2023

Posts: 2
#28

16 Jun 2023, 01:02

Originally posted by Joao Santos Silva:

Dear Tahmid Labib,

You know much more about this particular case than I do, but one thing that I would notice is that your sample, by focusing on the main trading partners, is not representative. Therefore your results are difficult to generalize.

Best wishes,

Joao

Thanks for your observation. I will expand the sample size then.

Regards,
Tahmid