Understanding multi-way clustering results from PPMLHDFE gravity estimation

lishu zhang

Join Date: May 2023

Posts: 6
#1

16 May 2023, 05:39

Dear all,

Having read that ignoring multi-way clustering when estimating a gravity model leads to misleading inference (Egger and Tarlea, 2015), I tried to add multi-way clustering to my estimation.

The command I used is as follows:
ppmlhdfe y x1#x2, absorb(panelID prod_sector_year Inv_year) vce(cluster Inv_Country ProdCountry year) // here I wanted to cluster at the source country, destination country and year levels

However, I do not understand how the number of clusters for which the standard errors were adjusted was determined (the last line in the picture).

And how should I interpret the extremely small residual df here? I am very confused by this number, especially compared with the attempt where I impose vce(cluster panelID#year), because I think that this second form of clustering is more demanding:
ppmlhdfe y x1#x2, absorb(panelID prod_sector_year Inv_year) vce(cluster panelID#year) // here I wanted to cluster at every combination of source country X destination country X year

Is there any problem in my code? Did I accidentally estimate something different from what I wanted?

I appreciate any help and comments!

Best regards,
Lishu
Joao Santos Silva

Join Date: Apr 2014

Posts: 3065
#2

16 May 2023, 07:10

Dear lishu zhang,

When you cluster, the effective number of observations is the number of clusters; when you use multi-way clustering, it is the smallest of the cluster counts across the dimensions. So, in the first case, if you have 20 years, it is as if you have 20 observations. In the second case, your clusters are very small (one ID in one year), and therefore you have many of them.
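The counting logic above can be illustrated on made-up data. This is only a sketch: the toy panel (10 source countries, 10 destination countries, 8 years) and the variable names src, dst and pair_year are assumptions for illustration, not the data from post #1. It counts the distinct groups in each clustering dimension and in the ID-by-year interaction.

```stata
* Toy panel (hypothetical): 10 sources x 10 destinations x 8 years
clear
set obs 10
gen src = _n
expand 10
bysort src: gen dst = _n
expand 8
bysort src dst: gen year = 2000 + _n
drop if src == dst                        // no self-pairs

egen long panelID   = group(src dst)      // directed pair id
egen long pair_year = group(panelID year) // pair-by-year cell

* Count distinct groups in each candidate clustering dimension
foreach v of varlist src dst year pair_year {
    tempvar first
    egen byte `first' = tag(`v')          // marks one observation per group
    quietly count if `first'
    display as text "`v': " as result r(N) as text " clusters"
    drop `first'
}
```

In this toy panel the smallest dimension is year, while pair_year has as many groups as there are observations. On real data, adding `if e(sample)` to the `egen` call after estimation restricts the count to the observations actually used.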

Best wishes,

Joao
lishu zhang

Join Date: May 2023

Posts: 6
#3

16 May 2023, 07:48

Dear Joao Santos Silva,

Thank you very much for the information.

If I understood correctly, the first line of code does not split my sample successively along each cluster dimension but instead imposes the dimension with the smallest number of groups.

If I may ask a follow-up question: is vce(cluster unit1 unit2 unit3) the correct way to do multi-way clustering? To me it looks like one-way clustering on the strongest dimension. Or should I take my second line of code as the implementation of real multi-way clustering?

Thank you in advance for your help!

Best regards,
Lishu
Joao Santos Silva

Join Date: Apr 2014

Posts: 3065
#4

16 May 2023, 23:05

Dear lishu zhang,

I do not think that is right: I assume that it clusters along all dimensions and reports the size of the smallest one. Your first line is the right way of doing multi-way clustering.
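For reference, the two specifications from post #1 can be put side by side. This is a sketch that assumes the dataset from that post is in memory and that ppmlhdfe is installed (ssc install ppmlhdfe); all variable names are taken from that post. Listing the stored results after each run is one way to check which cluster variables were actually used.

```stata
* Assumption: data from post #1 in memory, ppmlhdfe installed

* Multi-way clustering: three separate cluster variables
ppmlhdfe y x1#x2, absorb(panelID prod_sector_year Inv_year) ///
    vce(cluster Inv_Country ProdCountry year)
ereturn list    // inspect the stored cluster information

* One-way clustering on the panelID-by-year interaction
ppmlhdfe y x1#x2, absorb(panelID prod_sector_year Inv_year) ///
    vce(cluster panelID#year)
ereturn list
```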

Best wishes,

Joao
lishu zhang

Join Date: May 2023

Posts: 6
#5

17 May 2023, 02:17

Dear Joao Santos Silva,

I see. Thank you for the explanation!

Best regards,
Lishu
Ridwan Sheikh

Join Date: Apr 2021

Posts: 179
#6

25 Apr 2026, 02:28

Dear Joao Santos Silva,

In a structural gravity model estimated using ppmlhdfe, I want to implement all the diagnostics mentioned in Egger and Tarlea (2015), that is:

(a) Huber–White-type robust standard errors without clustering
(b) Standard errors clustered at (and may be correlated over time within) country pairs
(c) Standard errors clustered at (and may be correlated within) base groups (importer, exporter, and year), as well as every combination of the three.
(d) Same as (c), except for country-pairs being dyadic (symmetric for ij and ji).

For (a) and (b), I did the following

Code:

egen idt_ci = group(iso_i year)
egen idt_cj = group(iso_j year)
egen id_ci_cj = group(iso_i iso_j)

* (a) Huber-White-type Standard errors without clustering
ppmlhdfe trade x1, a(idt_ci idt_cj id_ci_cj) vce(robust)

* (b) Standard errors are clustered at country pairs
ppmlhdfe trade x1, a(idt_ci idt_cj id_ci_cj) vce(cluster id_ci_cj)

I am not sure how to implement (c) and (d), or what a base group means.

Egger and Tarlea (2015) further refer to case (c) as "multi-way clustering assuming asymmetric pair-wise components" and case (d) as "multi-way clustering assuming symmetric pairwise (dyadic) components".

I would be very thankful if you could provide Stata code showing how to create these base groups and the symmetric and asymmetric country pairs, and, in general, how to implement (c) and (d) using ppmlhdfe.

Thank you,
(Ridwan)

Last edited by Ridwan Sheikh; 25 Apr 2026, 02:30.
Joao Santos Silva

Join Date: Apr 2014

Posts: 3065
#7

25 Apr 2026, 08:41

Dear Ridwan Sheikh,

At least the current (2026) version of ppmlhdfe allows multi-way clustering, so it should be easy to do all of that.
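A possible starting point, offered as a sketch rather than a confirmed replication of Egger and Tarlea (2015): it assumes the variables from post #6 (trade, x1, iso_i, iso_j, year, idt_ci, idt_cj, id_ci_cj) are in memory, and the names id_ci, id_cj, pair_lo, pair_hi and pair_sym are hypothetical. The first part builds numeric ids for the country base groups and a symmetric (dyadic) pair id in which ij and ji share one value; id_ci_cj from post #6 is already the asymmetric (directed) pair id. How the clustering dimensions should be combined for (c) and (d) is an interpretation that should be checked against the paper.

```stata
* Assumption: data and fixed-effect ids from post #6 are in memory

* Numeric ids for the two country base groups
egen long id_ci = group(iso_i)
egen long id_cj = group(iso_j)

* Symmetric (dyadic) pair id: ij and ji receive the same value
gen pair_lo = cond(iso_i < iso_j, iso_i, iso_j)
gen pair_hi = cond(iso_i < iso_j, iso_j, iso_i)
egen long pair_sym = group(pair_lo pair_hi)

* (c), one reading: three-way clustering on the two countries and year
ppmlhdfe trade x1, a(idt_ci idt_cj id_ci_cj) ///
    vce(cluster id_ci id_cj year)

* (d), one possible reading: add the symmetric pair as a further dimension
ppmlhdfe trade x1, a(idt_ci idt_cj id_ci_cj) ///
    vce(cluster id_ci id_cj year pair_sym)
```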

Best wishes,

Joao
Ridwan Sheikh

Join Date: Apr 2021

Posts: 179
#8

27 Jul 2026, 02:51

Thanks, Joao Santos Silva! I am sorry for the late reply. I have done it. Thanks for the help.