  • Difference-in-Differences with Multiple Time Periods

    Dear Stata community,

    I am currently working with an unbalanced panel dataset and would greatly appreciate your guidance on implementing a Difference-in-Differences (DiD) approach using the Callaway & Sant’Anna (2020) methodology (via csdid).

    My dataset consists of daily observations from environmental monitoring stations, which measure air pollution indicators such as PM2.5 and PM10. The panel is unbalanced because not all stations report data continuously over time, and in some cases, entire periods are missing for certain pollutants.

    The unit of observation is the monitoring station, and each station is located within a municipality. A station is treated once the municipality where it is located has implemented an environmental policy.

    I define the treatment timing variable (gvar) as the date of the policy implementation in the municipality. In practice, I construct this variable at the unit level (station) so that it remains constant over time within each unit. Specifically, I use the following Stata code:
    bysort id_polution_station: egen gvar_d = min(date_policy_approval)
    replace gvar_d = 0 if missing(gvar_d)
    format gvar_d %td
    This ensures that:
    • All observations for a treated unit share the same treatment adoption date.
    • Units that are never treated are assigned gvar = 0.
    Given this setup, I would like to ask the following:
    1. Are there any recommended best practices when dealing with irregular reporting frequency or missing periods in high-frequency environmental data?
    2. In this context, would it be preferable to rely on never-treated units as controls, or should I use the notyet option to include not-yet-treated units as well?
    Any advice on model specification, data preparation, or potential pitfalls would be highly appreciated.

    Thank you very much in advance for your help.

    Best regards,

  • #2
    Can you aggregate to week or month and get rid of some of the missing values? csdid can handle unbalanced data, but you want to make sure it's not unbalanced in a way that may cause bias.
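
    A minimal sketch of the monthly aggregation, assuming the daily date variable is called date and the outcome is pm25 (both hypothetical names; id_polution_station and gvar_d are taken from the original post). The 15-day cutoff is only an illustrative assumption, not a recommendation:

    ```stata
    * Assumed names: date (daily %td date), pm25 (outcome).
    * id_polution_station and gvar_d come from the original post.

    * Monthly time index on the %tm scale
    gen mdate = mofd(date)
    format mdate %tm

    * Put the cohort variable on the same monthly scale; keep 0 for never treated
    gen gvar_m = cond(gvar_d == 0, 0, mofd(gvar_d))

    * Station-month means, plus the number of non-missing daily readings
    collapse (mean) pm25 (count) n_days=pm25 (first) gvar_m, ///
        by(id_polution_station mdate)

    * Hypothetical threshold: drop thinly reported station-months
    drop if n_days < 15
    ```

    The time variable and the cohort variable need to be on the same scale, which is why gvar_d is converted along with date. Tabulating n_days by treatment status before dropping anything is one way to check whether the missingness looks related to the policy.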

    Either never-treated or not-yet-treated controls are OK. Maybe do both. Not-yet-treated is useful when you have few observations, but that's unlikely to be an issue for you.
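
    Running both could look roughly like this, assuming a monthly time variable mdate, a matching cohort variable gvar_m (0 for never treated) and an outcome pm25, all hypothetical names:

    ```stata
    * Assumed names: pm25 (outcome), mdate (time), gvar_m (cohort, 0 = never treated)

    * Never-treated stations as controls (the csdid default)
    csdid pm25, ivar(id_polution_station) time(mdate) gvar(gvar_m)
    estat event

    * Not-yet-treated stations added to the control group
    csdid pm25, ivar(id_polution_station) time(mdate) gvar(gvar_m) notyet
    estat event
    ```

    If the two sets of event-study estimates are close, the choice of control group is probably not driving the results.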