
  • Difference-in-difference and/or event-study on an unbalanced panel

    Hi everyone,

    I am working with an unbalanced individual-level panel from a survey and would appreciate some guidance on whether any of the staggered-adoption DiD estimators is appropriate in my setting. The panel spans up to 8 survey waves (which in the actual data correspond to calendar years), but it is far from balanced. Individuals may enter the survey after the first wave, and many respondents miss one or more intermediate waves. Treatment is an individual-level shock that occurs at different times for different individuals and is absorbing once it occurs. A simplified toy example is shown below:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(id time y shock gvar)
    1 1  9 0 0
    1 2  9 0 0
    1 3 11 0 0
    1 4 12 0 0
    1 5 11 0 0
    1 6 12 0 0
    1 7 12 0 0
    2 1  9 0 0
    2 2  9 0 0
    2 3  8 0 0
    2 4  9 1 4
    2 5  8 0 4
    2 6  7 0 4
    2 7  8 0 4
    2 8  9 0 4
    3 1 12 0 0
    3 2 12 0 0
    3 3 11 0 0
    3 4 13 0 0
    3 5 15 0 0
    3 6 14 1 6
    3 7 16 0 6
    3 8 16 0 6
    4 3 10 0 0
    4 4 10 0 0
    4 5 12 0 0
    4 6 12 0 0
    4 7 13 1 7
    4 8 14 0 7
    5 1 11 0 0
    5 2 13 0 0
    5 3 12 0 0
    5 4 12 0 0
    5 5 11 0 0
    5 8  9 0 0
    end
    Since this is clearly a staggered-adoption setting, I was considering estimators such as Callaway & Sant'Anna (e.g. csdid) or an event-study approach along the lines of Clarke and Tapia-Schythe (2020, eventdd). In fact, the "gvar" variable reported above is constructed to be used as the group variable in csdid. However, my understanding is that many of these implementations either require or strongly prefer a balanced panel structure.
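    One caveat on the toy data: gvar there switches from 0 to the onset wave within an individual, whereas csdid expects the group variable to be constant within each unit (the first treated period on every row, and 0 for never-treated units). Below is a minimal sketch of one way to build such a variable. It assumes that shock equals 1 only in the wave where the shock first occurs and that the onset wave is actually observed. The names first_shock, cohort, and treated are made up for illustration, and only ids 2 and 5 from the example are used so that the block runs on its own.

```stata
* Toy subset of the example above (ids 2 and 5 only)
clear
input float(id time y shock)
2 1  9 0
2 2  9 0
2 3  8 0
2 4  9 1
2 5  8 0
2 6  7 0
2 7  8 0
2 8  9 0
5 1 11 0
5 2 13 0
5 3 12 0
5 4 12 0
5 5 11 0
5 8  9 0
end

* First wave in which the shock is observed; missing if it never occurs
bysort id: egen first_shock = min(cond(shock == 1, time, .))

* Time-invariant cohort: onset wave for treated units, 0 for never-treated
generate cohort = cond(missing(first_shock), 0, first_shock)

* Absorbing treatment status: switched on from the onset wave onward
generate byte treated = cohort > 0 & time >= cohort

* Check that cohort does not vary within individual
bysort id (cohort): assert cohort[1] == cohort[_N]
list id time shock cohort treated, sepby(id)
```

    If the shock can fall in a wave the respondent missed, the onset wave is not identified by this construction and would need to be handled separately.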

    One idea I considered was constructing an alternative time variable based on the number of survey appearances for each individual. For example, I could keep only respondents observed, say, 5 times and define a within-individual time index running from 1 to 5, regardless of the underlying survey year. This would create something closer to a balanced panel. However, my concern is that doing so would discard the calendar-time dimension, which seems important for tracking any time or cohort effects.
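    To make the concern concrete, here is a rough sketch of the appearance-based index, using only ids 4 and 5 from the toy example. The names obs_index and n_obs are hypothetical. Both individuals are observed six times, so the index looks balanced, but the same index value maps to different survey waves.

```stata
* Toy subset of the example above (ids 4 and 5 only)
clear
input float(id time y)
4 3 10
4 4 10
4 5 12
4 6 12
4 7 13
4 8 14
5 1 11
5 2 13
5 3 12
5 4 12
5 5 11
5 8  9
end

* Declare the panel in calendar time and summarize participation patterns
xtset id time
xtdescribe

* Appearance-based index: 1, 2, ... in order of each individual's observed waves
bysort id (time): generate obs_index = _n
bysort id: generate n_obs = _N

* Same obs_index, different calendar waves
list id time obs_index n_obs, sepby(id)
```

    Here obs_index equals 1 in wave 3 for id 4 but in wave 1 for id 5, and obs_index equals 6 lines up with wave 8 for both only because id 5 skipped waves 6 and 7.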

    Is there a standard way to handle this kind of unbalanced survey panel within the Callaway-Sant'Anna framework (or related estimators), or would the missing waves and staggered entry create identification problems that require a different approach? Any advice on how to proceed would be greatly appreciated!
    Last edited by Luna Diaz; 29 May 2026, 16:42.

  • #2
    Fernando Rios-Avila's command jwdid handles unbalanced panels without difficulty. In fact, it uses all available information, unlike differencing approaches such as Callaway-Sant'Anna, which use an observation only if both time periods are available. If you have covariates x1 ... xK, do the following:

    Code:
    jwdid y x1 ... xK, ivar(id) tvar(time) gvar(cohort) never
    estat event
    estat plot
    estat simple
    If you drop the "never" option, it produces the "lags only" version of the estimator. If you don't have covariates, just drop them from the command.



    • #3
      Thank you, Jeff Wooldridge! I really appreciate your suggestion, as I was not aware of the jwdid command. I tried it on my actual dataset and it seems to be working well.

      I will now take a more in-depth look at the literature and documentation to make sure I am not overlooking any assumptions.



      • #4
        Luna: The regression-based methods rely on the same kinds of no-anticipation and conditional parallel trends (PT) assumptions as other estimators, including CS. The doubly robust estimators in csdid have some resilience to functional form, but they do not allow violations of PT in levels, which to me is the most important limitation of linear models whether one uses csdid or jwdid.
