Gravity equation with importer-specific time-varying variable

Selena Zhou

Join Date: Aug 2025

Posts: 2
#1

Yesterday, 08:05

Dear Statalist Community,

I am new to Stata and need some help with my thesis (I am an Economics major). I have to estimate a gravity equation to study how trade flows (exports) between countries vary with a certain risk index. This index changes over time and is defined only for the importing country.

My problem is that if I include importer×year fixed effects, the variable is dropped. From what I have read in guides on gravity equations (e.g. the WTO handbook), it seems standard to include exporter×year and importer×year fixed effects (to capture multilateral resistance), and country-pair fixed effects plus year fixed effects. But if I do that, I can no longer estimate the effect of my risk index (an importer-specific variable).

This is the specification I tried to use:

ppmlhdfe tradeflow_baci ln_risk ln_dista contig comcol, absorb(exp_id#year imp_id#year pair_id)

Which gives me the following result:

note: 1 variable omitted because of collinearity: ln_risk

So my questions are:

1. Is it always necessary to include the full sets of fixed effects (exporter×year, importer×year, and pair FE) for the model to be empirically sound?

2. If my research question is specifically about the effect of this importer-specific, time-varying variable, is it acceptable to drop the importer×year fixed effects so that I can estimate its effect?

If you have any suggestions for theoretical references or practical advice on what specification would be sound in this situation, I would be extremely grateful. I am not very familiar with gravity models or Stata.

Thank you very much!

Last edited by Selena Zhou; Yesterday, 08:07.
Joao Santos Silva

Join Date: Apr 2014

Posts: 3025
#2

Yesterday, 08:34

Dear Selena Zhou,

The inclusion of those fixed effects is important for the results to have a structural interpretation. However, as you point out, if you include them you cannot identify the effect of the variable of interest. In that case, my suggestion is that you do not include those fixed effects, and estimate a "naive" gravity equation, including traditional country-specific regressors such as GDP and landlocked indicators. You can also try to include pair fixed effects. In any case, take your estimates with a pinch of salt because you know that they are based on a model that is not ideal.
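
As a rough illustration only, such a specification might look like the sketch below. The variable names ln_gdp_exp, ln_gdp_imp, landlocked_exp and landlocked_imp are assumptions made for this example; the remaining names are taken from the command in the question. The choice of standard errors is left at the ppmlhdfe default here.

```stata
* Sketch only: ln_gdp_exp, ln_gdp_imp, landlocked_exp and landlocked_imp are
* assumed (hypothetical) variable names. Run from a do-file, since /// is used
* for line continuation.

* "Naive" gravity: country-specific regressors replace the
* exporter-year and importer-year fixed effects, so ln_risk is identified
ppmlhdfe tradeflow_baci ln_risk ln_gdp_exp ln_gdp_imp ///
    landlocked_exp landlocked_imp ln_dista contig comcol

* Variant with pair fixed effects: time-invariant regressors
* (ln_dista, contig, comcol, landlocked indicators) are absorbed by pair_id,
* so only the time-varying regressors are listed
ppmlhdfe tradeflow_baci ln_risk ln_gdp_exp ln_gdp_imp, absorb(pair_id)
```

In the second command, ln_risk is identified only from changes over time within each pair, which is one reason the two sets of estimates can differ.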

Best wishes,

Joao
Selena Zhou

Join Date: Aug 2025

Posts: 2
#3

Yesterday, 09:20

Dear Dr. Santos Silva,

Thank you so much for taking the time to answer; that was really helpful! If any papers or textbooks come to mind that justify using a more “naive” gravity approach in this type of context, I would be very grateful for the references.

Regarding the naive estimation you mentioned, would it imply that I could just set up a baseline model along the following lines?

Code:

ppmlhdfe tradeflow_baci ln_gdp_imp ln_gdp_exp ln_risk contig comcol, absorb(year)

I noticed that when I add ln_dist to the regression, the coefficient on ln_risk flips sign (from negative to positive), while still being significant. The same happens if I add absorb(pair_id) in ppmlhdfe. How should I interpret such a change in sign? And is it acceptable to omit distance in that case? This sign-flip problem seems to arise only when I use ppmlhdfe.

If I run a simple OLS log-linear gravity model like the following:

Code:

regress ln_exp ln_gdp_exp ln_gdp_imp ln_dista ln_risk, robust cluster(pair_id)

the coefficient on ln_risk keeps the expected sign. Do you have any suggestions on how I could improve my models, or whether I may have a more structural problem in the way I am specifying them?

Best wishes,
Selena

Last edited by Selena Zhou; Yesterday, 09:53.
Joao Santos Silva

Join Date: Apr 2014

Posts: 3025
#4

Today, 00:57

Dear Selena Zhou,

Please ignore the results obtained with OLS as we know that those are invalid. Do you really want to absorb year? Also, please cluster by distance, not pair_id.

About the sign reversal. You should not drop distance just because the results are not what you expect. If you change the model until it gives you what you expect, you do not need a model at all :-)

So, I would certainly keep the variable distance (or include pair fixed effects), and try to include other variables such as RTA and CU indicators and consider expanding your dataset.
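
A minimal sketch of how this could be put together, assuming hypothetical indicator variables rta and cu (they do not appear in the commands above) and a helper identifier dist_id created only for clustering:

```stata
* Sketch only: rta and cu are assumed (hypothetical) indicator variables,
* and dist_id is a helper variable created here. Run from a do-file,
* since /// is used for line continuation.

* Observations with the same distance value share one cluster
egen dist_id = group(ln_dista)

* Keep distance, add the extra indicators, and cluster by distance
ppmlhdfe tradeflow_baci ln_gdp_imp ln_gdp_exp ln_risk ///
    ln_dista contig comcol rta cu, vce(cluster dist_id)
```

Year fixed effects are left out of this sketch; they can be added back with absorb(year) if they are wanted.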

Best wishes,

Joao