Diff in Diff: DRDID and CSDID

David Sanders-Ellis

Join Date: Oct 2025

Posts: 2
#616

14 Oct 2025, 14:06

Hi Fernando

Thanks for the advice! I'll try with a more consistent group / run some placebo tests and see how it looks.

Thanks again!

David
Comment

Mateus Maciel

Join Date: Mar 2021
Posts: 67

#617

13 Feb 2026, 17:46

I have seen the same error in some other posts, but none of the solutions worked for me. I am running the following code with the long2 option:

Code:

csdid per_des,  ivar(id_municipio) gvar(lag_treat) time(ano) long2 method(reg)
estat event, window(-5 5) estore(cs)
csdid_plot, legen(off) xlabel(#10,labsize(large)) xtitle("Years relative to treatment",size(large)) ylabel(#5,labsize(large)) ytitle("ATT",size(large))
estat simple, estore(arrecadacao_total)
graph export "figures/desmatamento_total.pdf", replace

Nevertheless, I keep getting this error message:

Code:

. estat event, window(-5 5) estore(cs)
ATT by Periods Before and After treatment
Event Study:Dynamic effects
                       *:  3200  conformability error
           csdid_event():     -  function returned error
                 <istmt>:     -  function returned error

Last edited by Mateus Maciel; 13 Feb 2026, 17:48.

Comment

Jeff Wooldridge

Join Date: Apr 2014

Posts: 2291
#618

15 Feb 2026, 16:02

Mateus: Is it true the error disappears if you drop window(-5 5)? You might try jwdid, as that accomplishes the same thing via running a single TWFE regression.

Code:

jwdid per_des, ivar(id_municipio) tvar(ano) gvar(lag_treat) never
estat event, window(-5 5) estore(cs)
Comment
YITING HUANG

Join Date: Mar 2026

Posts: 6
#619

24 Mar 2026, 00:17

Dear @FernandoRios,

Thank you for developing the incredibly helpful csdid package. I am currently using it for my Master's thesis to estimate the dynamic effects of a fertility shock on gig economy income, using balanced panel data from 2011 to 2023. My treatment cohorts (first childbirth) occur between 2017 and 2023.

I am running the following command:

Code:

csdid gigjob_income c.age##c.age, ivar(mom_id) time(year) gvar(first_birth_year) long2
estat event, window(-4 4)

However, the output table displays event-time coefficients starting from Tm5 instead of Tm4, as shown below:

Code:

------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     Pre_avg |  -44.71372   29.70189    -1.51   0.132    -102.9283    13.50091
    Post_avg |   516.9128   70.77064     7.30   0.000     378.2049    655.6207
         Tm5 |  -44.67074   32.52688    -1.37   0.170    -108.4223    19.08079
         Tm4 |  -42.92353   32.55161    -1.32   0.187    -106.7235    20.87647
         Tm3 |  -43.42168   31.83051    -1.36   0.173    -105.8083    18.96496
         Tm2 |  -47.83893   27.54437    -1.74   0.082    -101.8249    6.147045
         Tp0 |   7.317326   25.82761     0.28   0.777    -43.30385     57.9385
         Tp1 |     177.32   44.67611     3.97   0.000     89.75648    264.8836
         Tp2 |    526.611   69.73675     7.55   0.000     389.9295    663.2926
         Tp3 |   791.2632   106.6887     7.42   0.000     582.1571    1000.369
         Tp4 |   1082.053   157.5165     6.87   0.000      773.326    1390.779
------------------------------------------------------------------------------

Based on my manual calculation, the Pre_avg (-44.71372) is the exact simple average of the displayed coefficients from Tm2 to Tm5.
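That arithmetic can be reproduced directly in Stata. This is only a quick check on the numbers as displayed in the table (Tm5, Tm4, Tm3, Tm2), not a statement about how csdid computes Pre_avg internally:

```stata
* Simple average of the four displayed pre-treatment coefficients (Tm5 to Tm2)
display (-44.67074 - 42.92353 - 43.42168 - 47.83893) / 4
```

The result, -44.71372, matches the reported Pre_avg.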

Could you kindly clarify the following points regarding the underlying mechanism of this output?
Why is Tm5 displayed when the window is explicitly set to (-4 4)?

Does Tm5 represent "endpoint binning"? Since my panel data traces back to 2011 (meaning that for the 2017 cohort, pre-treatment periods go up to e = -6), does the Tm5 coefficient represent an aggregated/binned average of all early periods (e <= -5)? Or does it strictly represent the isolated relative time e = -5?

I want to ensure I interpret and report the pre-trend test and the event window correctly in my thesis. Any clarification would be greatly appreciated.

Thank you very much for your time and your contribution to the Stata community.

Last edited by YITING HUANG; 24 Mar 2026, 00:20.
Comment
FernandoRios

Join Date: Apr 2014

Posts: 2534
#620

31 Mar 2026, 06:25

I don't remember completely, but I think that is because -4 meant (for me) to use 4 periods before treatment (-5 to -2), so you can use -3 if you want estimates up to -4.
And no, the event window DOES NOT show binning. It ignores event times above or below the thresholds shown, so Tm5 is the estimate for e = -5 alone.
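A minimal sketch of that suggestion, reusing the command from the question with only the lower bound of the window changed. It is untested here, and how the bound is counted may depend on the installed csdid version, so compare the displayed Tm labels after running it:

```stata
* Same model as in the question; only the lower window bound changes
csdid gigjob_income c.age##c.age, ivar(mom_id) time(year) gvar(first_birth_year) long2

* Lower bound moved from -4 to -3, as suggested above
estat event, window(-3 4)
```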
F
1 like
Comment
YITING HUANG

Join Date: Mar 2026

Posts: 6
#621

09 Jun 2026, 06:20

Dear Statalist,

I am using csdid2 (Callaway & Sant'Anna 2021) with a balanced panel of N = 753,050 individuals observed over 13 years (2011–2023).
All individuals are eventually treated (first childbirth between 2017–2023); I use the not-yet-treated as the control group (notyet option). The base period is g-1 (universal base, default).

After estimation, the reported "Number of obs" is 9,036,600, which is exactly 753,050 × 12 — one fewer year per individual than the full balanced panel (9,789,650 = 753,050 × 13).
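The counts above can be checked with a few display commands. This only reproduces the arithmetic quoted in this post:

```stata
* Full balanced panel: 753,050 individuals x 13 years
display %12.0fc 753050 * 13

* Reported "Number of obs": 753,050 individuals x 12 years
display %12.0fc 753050 * 12

* Difference: one observation per individual
display %12.0fc 753050 * 13 - 753050 * 12
```

These print 9,789,650, then 9,036,600, then 753,050.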

My understanding is that the base period (t = -1, i.e., g-1 for each individual) is used as the reference for first-differencing and therefore does not count as an independent estimation period, reducing the reported N by one observation per individual.

Is this the correct explanation for the discrepancy? Is there any documentation or reference that explicitly describes this behavior?

I note that a similar pattern appears in published work using csdid with the same base period convention, where t = -1 is omitted from the event-study table entirely, consistent with this interpretation.

Thank you in advance.
Comment
FernandoRios

Join Date: Apr 2014

Posts: 2534
#622

09 Jun 2026, 06:27

You are on point. The other approach you can use is to report the number of panel units (individuals) rather than individual x time observations.
At least, that would be my suggestion.
1 like
Comment
YITING HUANG

Join Date: Mar 2026

Posts: 6
#623

09 Jun 2026, 06:32

Dear FernandoRios

Thank you so much for your prompt and helpful reply. I will follow your suggestion and report the number of individuals.
Comment
YITING HUANG

Join Date: Mar 2026

Posts: 6
#624

10 Jun 2026, 08:49

Dear Statalist,

I am using csdid with not-yet-treated units as the control group in a large administrative panel, with approximately nine million person-year observations in the gender-specific estimation sample.

In the event-study plot, the pre-treatment coefficients look visually close to zero and economically small, but the aggregated pre-treatment average is statistically significant.

For example, for fathers' delivery participation:
Outcome: Has Delivery Income
Pre_avg = -0.0004, p < 0.05
Post_avg = 0.0029, p < 0.01
The pre-treatment estimate is about -0.04 percentage points, while the post-treatment effect is about +0.29 percentage points. The post-treatment effects are much larger and increase after childbirth.

In this setting, is it reasonable to discuss the parallel trends assumption mainly in terms of visual trajectory and economic magnitude, while explicitly acknowledging that some pre-treatment estimates are statistically significant due to the very large sample size? Are there recommended ways to present this issue when using csdid with large administrative data?

Thank you in advance.
Comment
FernandoRios

Join Date: Apr 2014

Posts: 2534
#625

11 Jun 2026, 06:41

I wonder if your treatment variable is incorrectly identified.
I am not sure what you mean by "has delivery income". But what if the impact of the treatment starts BEFORE they start to actively receive that income (which would mean changing the base period)? If you set t-2 as the untreated period instead of t-1, the pre-treatment estimates will be non-significant.
The question, in that case, is when the treatment starts to have an effective impact on the outcome. Right away? Or a few periods before?
For example, take a topic I worked on, parenthood. Does the treatment start when the baby is born? When it is conceived? Or when the parents start considering having a baby?
F
1 like
Comment
YITING HUANG

Join Date: Mar 2026

Posts: 6
#626

11 Jun 2026, 09:07

Dear FernandoRios,

Thank you very much for your reply. This is very helpful.

Just to clarify, “Has Delivery Income” is an outcome variable, defined as an indicator for whether the individual received income from identified food-delivery platforms in a given calendar year. The treatment variable is the year of first childbirth.

I think your point about the timing of the effective treatment is very relevant for my setting. Since the outcomes are measured annually, some labor-supply adjustments may begin before the actual birth year, for example during pregnancy or when parents start preparing for childbirth. In that case, the coefficient at t = -1 may partly capture anticipatory behavior rather than a pure violation of parallel trends.

Would it be reasonable to interpret and discuss some statistically significant pre-treatment estimates in this way—that is, as potentially reflecting anticipatory labor-supply responses before childbirth, while still presenting the results cautiously?

Thank you again for your helpful suggestion.

Best regards,
YITING
Comment
FernandoRios

Join Date: Apr 2014

Posts: 2534
#627

11 Jun 2026, 13:32

I would re-frame the treatment.
The goal of using a pre-treatment period is to identify a period where outcomes are parallel because units are not yet treated.
Normally we do this using t-1, but there is nothing to say we cannot use an earlier period (t-2).
I think that would be a better approach.
(Or do an ad hoc adjustment, since re-estimating everything may take a bit of time.)
Comment
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2291
#628

12 Jun 2026, 09:47

Yiting: Here is what I would do. Because both csdid and jwdid with the never option estimate separate effects for all possible combinations, you can simply shift the treatment year to be the year before the birth of the first child. This will force the reference period to be two years before the birth of the first child. The year just before the birth becomes a "treated" period, so you can see whether there are practically and statistically significant differences in the year before the first birth, as a placebo test. With a balanced panel, the estimated treatment effects will be identical to those you would get by dropping the data on the year before the first birth (because the estimator will then use the period two years before the first birth as the reference period). To me, it makes sense to keep all of the data, shift back the treatment year, and then see if it makes a difference.
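A minimal sketch of the shift described above, under stated assumptions: mom_id, year and first_birth_year are the variable names used earlier in this thread, while has_delivery_income and first_birth_year_m1 are hypothetical names introduced only for illustration. It has not been run on the actual data.

```stata
* Shift the cohort variable back one year (hypothetical new variable).
* Units coded 0 or missing, if any exist, are left unchanged.
generate first_birth_year_m1 = cond(first_birth_year > 0 & !missing(first_birth_year), ///
    first_birth_year - 1, first_birth_year)

* Re-estimate with the shifted cohort variable, not-yet-treated units as controls
csdid has_delivery_income, ivar(mom_id) time(year) gvar(first_birth_year_m1) notyet

* The first post-treatment estimate (Tp0) now refers to the year before the birth
estat event
```

With the shift in place, the Tp0 estimate plays the role of the placebo test for the year before the first birth.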

One suggestion: jwdid allows for binary outcomes by using a logit model. This means the (conditional) parallel trends assumption is different, and so it's a useful robustness check. You can estimate the effects on the probability that "has delivery income" = 1 and compare those with the linear model. You can do this by setting the first "treatment" period to be either the year before the birth or the year of the birth.
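A rough sketch of that comparison, with several assumptions: has_delivery_income is a hypothetical name for the binary outcome, the first treatment period is taken to be the year of the birth, and the method(logit) option is written from memory of the jwdid syntax, so it should be checked against help jwdid before use. It is untested.

```stata
* Linear model, year of first birth as the first treatment period
jwdid has_delivery_income, ivar(mom_id) tvar(year) gvar(first_birth_year) never
estat event

* Logit version of the same specification (option name is an assumption; see help jwdid)
jwdid has_delivery_income, ivar(mom_id) tvar(year) gvar(first_birth_year) never method(logit)
estat event
```

Comparing the two sets of event-study estimates gives the robustness check described above.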
Comment
