Meta-analysis of Difference in Differences. How to account for pre-post correlation?

Gianfranco Di Gennaro

Join Date: Oct 2020

Posts: 140
#1


05 Jun 2025, 01:25

Dear all, I need to conduct a meta-analysis of randomized studies.
The outcome is a proportion (of antibiotic prescriptions) measured at baseline and follow-up, in two groups of doctors, experimental (educational intervention) and control.

How would you approach this meta-analysis?
I would like to express the results as a difference in differences (DiD), but I am finding this approach difficult to implement in Stata.
Above all, I wonder how to account for the pre-post correlation so that the standard errors of the DiD are estimated correctly.

Does anyone have an idea about this?

I thank you in advance!

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte studyid double(baselinectrl followupctrl baselineexp followupexp) byte(nctrl nexp) float(DiD se_DiD)
1 .23 .22 .24 .12 50 51 -.11 .11234348
2 .14 .14 .15 .13 78 78 -.02 .07856077
3 .21 .23 .21 .17 67 67 -.06 .09849615
4 .24 .22 .23 .22 54 55  .01  .1135586
5 .14 .13 .15 .12 13 13 -.02 .18945265
6 .17 .16 .18 .12 43 43 -.05  .1108844
end
Tiago Pereira

Join Date: Jan 2016

Posts: 387
#2

05 Jun 2025, 15:20

What was the study design of the original studies? Were they standard RCTs with the physician as the unit of randomization, or were they cluster RCTs?

Erik Ruzek

Join Date: Oct 2017
Posts: 426

08 Jun 2025, 13:41

I am also interested in the answer to Tiago Pereira's question. Assuming simple random assignment (with the physician as the randomization unit), do you have estimates of the pre-post correlation from each study? You do not show them, so I will assume you do not. In that case, you would have to assume a correlation coefficient and incorporate it into the formula for se_DID. You could compute the standard errors under a range of hypothetical correlations to see how that affects the standard error of the meta-analytic DiD effect. Below is some code to do so:

Code:

clear
version 16.1
input byte studyid double(baselinectrl followupctrl baselineexp followupexp) byte(nctrl nexp) float(DiD se_DiD)
1 .23 .22 .24 .12 50 51 -.11 .11234348
2 .14 .14 .15 .13 78 78 -.02 .07856077
3 .21 .23 .21 .17 67 67 -.06 .09849615
4 .24 .22 .23 .22 54 55  .01  .1135586
5 .14 .13 .15 .12 13 13 -.02 .18945265
6 .17 .16 .18 .12 43 43 -.05  .1108844
end

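* renvars is community-contributed (type -search renvars- to locate and install it).
* Append the time point (0 = baseline, 1 = follow-up), then drop the 8-character
* prefix, leaving ctrl0, exp0, ctrl1, and exp1.
renvars baseline* , postfix(0)
renvars followup*, postfix(1)
renvars baseline* followup*, predrop(8)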

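* Binomial standard error of each proportion
foreach n of numlist 0 1 {
    gen ctrl`n'se = sqrt((ctrl`n')*(1-ctrl`n')/nctrl)
    gen exp`n'se = sqrt((exp`n')*(1-exp`n')/nexp)
}

* Within-arm change from baseline to follow-up
gen ctrldiff = ctrl1 - ctrl0
gen expdiff = exp1 - exp0

* SE of each change, treating baseline and follow-up as independent
gen ctrldiffse = sqrt(ctrl0se^2 + ctrl1se^2)
gen expdiffse = sqrt(exp0se^2 + exp1se^2)

* DiD and its SE, treating the two arms as independent
gen DID = expdiff - ctrldiff
gen se_DID = sqrt(expdiffse^2 + ctrldiffse^2)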

* Verify that my figures are the same as yours
list studyid DiD DID se_DiD se_DID, sep(6) noobs
  +---------------------------------------------+
  | studyid    DiD    DID     se_DiD     se_DID |
  |---------------------------------------------|
  |       1   -.11   -.11   .1123435   .1123435 |
  |       2   -.02   -.02   .0785608   .0785608 |
  |       3   -.06   -.06   .0984961   .0984962 |
  |       4    .01    .01   .1135586   .1135586 |
  |       5   -.02   -.02   .1894526   .1894526 |
  |       6   -.05   -.05   .1108844   .1108844 |
  +---------------------------------------------+
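
In formula terms, the unadjusted calculation above treats the four proportions in each study as independent binomial estimates. Writing p for a proportion, n for the arm size, E and C for the experimental and control arms, and 0 and 1 for baseline and follow-up (notation introduced here only for illustration), a sketch of what the code computes is:

```latex
\widehat{\mathrm{DiD}} = (p_{E1} - p_{E0}) - (p_{C1} - p_{C0}),
\qquad
\mathrm{SE}\bigl(\widehat{\mathrm{DiD}}\bigr)
  = \sqrt{\frac{p_{E0}(1 - p_{E0})}{n_E} + \frac{p_{E1}(1 - p_{E1})}{n_E}
        + \frac{p_{C0}(1 - p_{C0})}{n_C} + \frac{p_{C1}(1 - p_{C1})}{n_C}}
```

For study 1 this gives sqrt(0.003576 + 0.002071 + 0.003542 + 0.003432), which is approximately 0.1123 and matches the listed value.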

To add information about the pre-post correlation when its actual value is unknown, you would need to make an educated guess, ideally based on past literature, and incorporate it into the se_DID calculations. Here I use a range of values just to see where the adjustment makes the most difference:

Code:

* r is the ASSUMED pre-post correlation. It enters the SE of each arm's change
* (baseline vs. follow-up in the same physicians); the two randomized arms are
* treated as independent of each other.
foreach p of numlist 1 3 5 7 9 {
    local r = `p'/10
    gen se_DID_r_pt`p' = sqrt(exp0se^2 + exp1se^2 - 2*`r'*exp0se*exp1se ///
        + ctrl0se^2 + ctrl1se^2 - 2*`r'*ctrl0se*ctrl1se)
}

line se_DID se_DID_r_pt1-se_DID_r_pt9 studyid, ///
    scheme(cleanplots) legend(subtitle("Correlation adjustment") ///
    order(1 "No adjustment" 2 "r = 0.1" 3 "r = 0.3" 4 "r = 0.5" ///
    5 "r = 0.7" 6 "r = 0.9"))

The resulting plot shows that the higher the assumed pre-post correlation, the smaller the standard error, and that the reduction becomes more pronounced as the correlation increases.
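
To carry the sensitivity analysis through to the pooled estimate, something along the following lines might work. It is only a sketch under stated assumptions: r is an assumed pre-post correlation (not estimated from the studies) applied within each arm, the two arms are treated as independent, the variable names v_*, did, and se_did_r* are made up for illustration, and the DerSimonian-Laird random-effects model is just one possible choice. It relies on the official meta suite (Stata 16 or later); I believe meta summarize leaves the pooled estimate and its standard error in r(theta) and r(se), but check with return list.

```stata
* Sketch only: r is an ASSUMED pre-post correlation, not an estimate from the studies
clear
version 16.1
input byte studyid double(ctrl0 ctrl1 exp0 exp1) byte(nctrl nexp)
1 .23 .22 .24 .12 50 51
2 .14 .14 .15 .13 78 78
3 .21 .23 .21 .17 67 67
4 .24 .22 .23 .22 54 55
5 .14 .13 .15 .12 13 13
6 .17 .16 .18 .12 43 43
end

* Binomial sampling variance of each proportion
foreach g in ctrl exp {
    forvalues t = 0/1 {
        gen double v_`g'`t' = `g'`t' * (1 - `g'`t') / n`g'
    }
}

* Difference in differences for each study
gen double did = (exp1 - exp0) - (ctrl1 - ctrl0)

* p is the assumed correlation times 100 (p = 0 reproduces the unadjusted SE)
foreach p of numlist 0 30 50 70 90 {
    local r = `p' / 100

    * The correlation reduces the variance of each arm's pre-post change
    gen double se_did_r`p' = sqrt( ///
        v_exp0  + v_exp1  - 2 * `r' * sqrt(v_exp0  * v_exp1)  + ///
        v_ctrl0 + v_ctrl1 - 2 * `r' * sqrt(v_ctrl0 * v_ctrl1))

    * Pool the study-level DiDs under this assumed correlation
    quietly meta set did se_did_r`p'
    quietly meta summarize, random(dlaird)
    display "r = " %4.2f `r' "   pooled DiD = " %7.4f r(theta) "   SE = " %6.4f r(se)
}
```

Each line of output gives the pooled DiD and its standard error under one assumed correlation, so you can see directly how sensitive the meta-analytic result is to that assumption. The sketch ignores clustering, so it would not be appropriate as is if the original trials were cluster RCTs.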
