Hello everyone,
In a natural experiment, I'm studying the effect of a treatment that was rolled out to one language group (e.g., English-speakers) but not to others (control). I'm using difference-in-differences with propensity score matching on unit-level characteristics.
My concern is that, even after matching on observable unit characteristics, the treatment group's baseline DV is ~2.4x higher than the control group's. This likely reflects audience differences (e.g., the treatment language has a much larger global user base) rather than unit-level differences. Alternatively, the behavior of the treatment and control groups may be inherently different due to cultural differences.
A few specifics:
- Units are matched 1:1 on pre-determined characteristics
- I use unit fixed effects + time fixed effects with clustered SEs
- Pre-treatment trends appear roughly parallel on the log scale (i.e., with a log-transformed DV)
My questions:
- Does DiD still identify the causal effect if the level difference is driven by audience composition rather than unit characteristics, assuming parallel trends hold?
- Should I be concerned that the control group is internally heterogeneous (6 different languages pooled together)?
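
To make the question about the level difference concrete, here is a minimal sketch in plain Python using made-up numbers, not the real data. The function names `log_did` and `level_did`, the 10% common growth, and the 20% multiplicative effect are all illustrative assumptions. It only shows the mechanics of a 2x2 DiD when the treated units start 2.4x higher and trends are parallel in logs. It is not the full unit + time fixed effects specification with clustered SEs.

```python
import math
from statistics import mean


def log_did(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """2x2 DiD on log outcomes: change in mean log DV (treated) minus the same for control."""
    def mean_log(ys):
        return mean([math.log(y) for y in ys])
    return (mean_log(treat_post) - mean_log(treat_pre)) - (
        mean_log(ctrl_post) - mean_log(ctrl_pre)
    )


def level_did(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Same 2x2 contrast on the raw (untransformed) outcome."""
    return (mean(treat_post) - mean(treat_pre)) - (mean(ctrl_post) - mean(ctrl_pre))


# Hypothetical matched pairs: each treated unit starts 2.4x above its control.
ctrl_pre = [80.0, 100.0, 120.0]
treat_pre = [2.4 * y for y in ctrl_pre]

GROWTH = 1.10  # assumed common proportional trend (parallel in logs)
EFFECT = 1.20  # assumed true multiplicative treatment effect

ctrl_post = [GROWTH * y for y in ctrl_pre]
treat_post = [GROWTH * EFFECT * y for y in treat_pre]

est_log = log_did(treat_pre, treat_post, ctrl_pre, ctrl_post)
est_level = level_did(treat_pre, treat_post, ctrl_pre, ctrl_post)

# Counterfactual for the treated group under the common proportional trend.
true_level_effect = mean(treat_post) - mean([GROWTH * y for y in treat_pre])

print(f"log DiD estimate:   {est_log:.4f} (true log effect: {math.log(EFFECT):.4f})")
print(f"level DiD estimate: {est_level:.1f} (true level effect: {true_level_effect:.1f})")
```

In this toy setup the 2.4x gap is a constant on the log scale, so it differences out and the log DiD recovers the assumed effect. The levels DiD does not match the level effect, because a common proportional trend implies non-parallel trends in levels when baselines differ. The sketch says nothing about whether parallel trends is plausible in the actual data, which is the part I am unsure about.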
