Difference-in-difference and/or event-study on an unbalanced panel

Luna Diaz

Join Date: Feb 2023

Posts: 14
#1

29 May 2026, 16:37

Hi everyone,

I am working with an unbalanced individual-level panel from a survey and would appreciate some guidance on whether any of the staggered-adoption DiD estimators is appropriate in my setting. The panel spans up to 8 survey waves (which in the actual data correspond to calendar years), but it is far from balanced. Individuals may enter the survey after the first wave, and many respondents miss one or more intermediate waves. Treatment is an individual-level shock that occurs at different times for different individuals and is absorbing once it occurs. A simplified toy example is shown below:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(id time y shock gvar)
1 1  9 0 0
1 2  9 0 0
1 3 11 0 0
1 4 12 0 0
1 5 11 0 0
1 6 12 0 0
1 7 12 0 0
2 1  9 0 0
2 2  9 0 0
2 3  8 0 0
2 4  9 1 4
2 5  8 0 4
2 6  7 0 4
2 7  8 0 4
2 8  9 0 4
3 1 12 0 0
3 2 12 0 0
3 3 11 0 0
3 4 13 0 0
3 5 15 0 0
3 6 14 1 6
3 7 16 0 6
3 8 16 0 6
4 3 10 0 0
4 4 10 0 0
4 5 12 0 0
4 6 12 0 0
4 7 13 1 7
4 8 14 0 7
5 1 11 0 0
5 2 13 0 0
5 3 12 0 0
5 4 12 0 0
5 5 11 0 0
5 8  9 0 0
end

Since this is clearly a staggered-adoption setting, I was considering estimators such as Callaway & Sant'Anna (e.g. csdid) or an event-study approach along the lines of Clarke and Tapia-Schythe (2020, eventdd). In fact, the "gvar" variable reported above is constructed to be used as the group variable in csdid. However, my understanding is that many of these implementations either require or strongly prefer a balanced panel structure.
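A note on the group variable, offered as a minimal sketch rather than a definitive recipe: as far as I understand the convention in csdid and jwdid, the group variable should be constant within individual (the first treated period, with 0 for never-treated units). In the toy data, gvar is 0 in the pre-treatment rows and only switches at onset (e.g. id 2 has gvar = 0 in waves 1-3 and gvar = 4 from wave 4 on). The snippet below assumes the toy data are in memory and builds a time-constant version from shock; the name cohort is hypothetical.

```stata
* Build a time-constant cohort variable from the onset indicator.
* Assumption: shock == 1 only in the period in which treatment starts.
bysort id: egen cohort = min(cond(shock == 1, time, .))
replace cohort = 0 if missing(cohort)      // never-treated coded as 0

* Sanity check: cohort must not vary within id
bysort id (time): assert cohort == cohort[1]

* Declare the panel and inspect the participation pattern (late entry, gaps)
xtset id time
xtdescribe
```

The xtdescribe output gives a quick view of how unbalanced the panel actually is before choosing an estimator.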

One idea I considered was constructing an alternative time variable based on the number of survey appearances for each individual. For example, I could keep only respondents observed, say, 5 times and define a within-individual time index running from 1 to 5, regardless of the underlying survey year. This would create something closer to a balanced panel. However, my concern is that doing so would discard the calendar-time dimension, which seems important for tracking time or cohort effects.
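To make the concern concrete, here is a minimal sketch of the appearance-index idea, assuming the toy data above are in memory. The variable names appearance and n_obs are hypothetical, and the restriction to a fixed number of appearances is left commented out so that no data are dropped.

```stata
* Within-individual index: 1, 2, ... in the order the waves are observed
bysort id (time): gen appearance = _n

* Number of waves in which each respondent is observed
by id: gen n_obs = _N

* The restriction described above would be something like:
* keep if n_obs == 5

* Cross-tabulate calendar wave against the appearance index
tabulate time appearance
```

If the same value of appearance shows up under several different values of time, the index is mixing calendar periods, which is exactly the information a staggered-adoption estimator needs to keep separate.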

Is there a standard way to handle this kind of unbalanced survey panel within the Callaway-Sant'Anna framework (or related estimators), or would the missing waves and staggered entry create identification problems that require a different approach? Any advice on how to proceed would be greatly appreciated!

Last edited by Luna Diaz; 29 May 2026, 16:42.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2291
#2

30 May 2026, 08:07

Fernando Rios-Avila's command jwdid handles unbalanced panels without difficulty. In fact, it uses all available information, unlike differencing approaches such as Callaway-Sant'Anna, which use an observation only if both time periods are available. If you have covariates x1 ... xK, do the following:

Code:

jwdid y x1 ... xK, ivar(id) tvar(time) gvar(cohort) never
estat event
estat plot
estat simple

If you drop the "never" option, it produces the "lags only" version of the estimator. If you don't have covariates, just drop them from the command.
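As a minimal sketch of the two variants without covariates, assuming jwdid is installed (I believe it is available via ssc install jwdid and may need reghdfe as well) and that cohort is a time-constant variable holding the first treated period, with 0 for never-treated individuals:

```stata
* Variant 1: with the never option, as in the command above
jwdid y, ivar(id) tvar(time) gvar(cohort) never
estat event        // effects by period relative to treatment
estat simple       // single aggregated effect

* Variant 2: without never, i.e. the "lags only" version
jwdid y, ivar(id) tvar(time) gvar(cohort)
estat event
estat simple
```

Comparing the estat event output across the two runs shows what the never option adds to the specification.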
1 like
Luna Diaz

Join Date: Feb 2023

Posts: 14
#3

30 May 2026, 16:06

Thank you, Jeff Wooldridge! I really appreciate your suggestion, as I was not aware of the jwdid command. I tried it on my actual dataset and it seems to be working well.

I will now take a more in-depth look at the literature/documentation, in order to make sure I am not disregarding any assumption.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2291
#4

01 Jun 2026, 10:06

Luna: The regression-based methods rely on the same kind of no-anticipation and conditional parallel trends assumptions as other estimators, including CS. The doubly robust estimators in csdid have some resilience to functional form, but they don't allow violation of PT in levels, which to me is the most important limitation of linear models, whether one uses csdid or jwdid.
2 likes