Possible to use synth (SCM) with only pre-intervention data?

Tom Scott

Join Date: Apr 2019

Posts: 266
#1

15 Feb 2022, 09:18

Hello,

I am using the synthetic control method with one city police department as my treated unit, a donor pool of 200 city police departments, and the violent crime rate as my outcome. The intervention I am looking at occurred in October 2018, and I am using quarterly data on my outcome. Long story short, because the FBI changed how it accepts and publishes law enforcement agency crime data, I have crime data for all agencies prior to the intervention year but data for only a small subset of them after the intervention year.

My plan is to collect data on my outcome for the post-intervention years by going to the police agency websites, but I am hoping to narrow down how many websites I have to visit by running the SCM on the pre-intervention data and then collecting post-intervention data only from the agencies that contribute to the synthetic control. I imagine another reason to run a synthetic control analysis with only pre-intervention outcome data is study pre-registration, so I thought maybe someone had experience with this. Does anyone know of a way to estimate the synthetic control without post-intervention data points?

Thanks for your time and assistance!
Tom Scott

Join Date: Apr 2019

Posts: 266
#2

15 Feb 2022, 09:47

For anyone interested, I found the solution: specify only the pre-intervention period using the resultsperiod(start(1)end) option, where start and end stand for the first and last pre-intervention periods.
Jared Greathouse

Join Date: Sep 2021

Posts: 2172
#3

15 Feb 2022, 17:54

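Well, the beauty of SCM is that you don't NEED post-intervention data to estimate the weights. The optimization problem chooses the synthetic weights using only the pre-intervention fit; the post-intervention counterfactual is then those same weights applied to the donors' post-intervention outcomes, so this is essentially a prediction/forecasting problem. You would still need post-policy outcome data for the donors that receive nonzero weight in order to trace out the counterfactual.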
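To make that concrete, here is a minimal sketch in plain Python (not Stata's synth, and not any particular package's algorithm). It fits simplex-constrained weights on pre-intervention outcomes only, using a simple Frank-Wolfe loop, and then lists the donors whose weight is large enough to be worth collecting post-intervention data for. The donor names, the numbers, and the 0.01 cutoff are all invented for illustration, and it matches on outcomes only, with no covariates.

```python
# Sketch: synthetic control weights from pre-intervention outcomes only.
# All data, names, and the 0.01 cutoff are made up for illustration.

def pre_sse(w, treated_pre, donors_pre):
    """Sum of squared pre-period gaps between treated and synthetic series."""
    total = 0.0
    for t, y in enumerate(treated_pre):
        synth_t = sum(wj * donor[t] for wj, donor in zip(w, donors_pre))
        total += (y - synth_t) ** 2
    return total

def scm_weights(treated_pre, donors_pre, iters=2000):
    """Minimise the pre-period squared gap over weights with w_j >= 0 and
    sum(w) = 1, using Frank-Wolfe with an exact line search."""
    n_donors, n_periods = len(donors_pre), len(treated_pre)
    w = [1.0 / n_donors] * n_donors  # start from equal weights
    for _ in range(iters):
        synth = [sum(w[j] * donors_pre[j][t] for j in range(n_donors))
                 for t in range(n_periods)]
        resid = [synth[t] - treated_pre[t] for t in range(n_periods)]
        # Gradient of the squared gap with respect to each weight (up to a factor 2).
        grad = [sum(resid[t] * donors_pre[j][t] for t in range(n_periods))
                for j in range(n_donors)]
        best = min(range(n_donors), key=grad.__getitem__)
        # Move the synthetic series toward the single donor `best`.
        d = [donors_pre[best][t] - synth[t] for t in range(n_periods)]
        denom = sum(x * x for x in d)
        if denom == 0.0:
            break
        step = -sum(resid[t] * d[t] for t in range(n_periods)) / denom
        step = max(0.0, min(1.0, step))
        if step == 0.0:  # no further improvement possible
            break
        w = [(1.0 - step) * wj for wj in w]
        w[best] += step
    return w

names = ["A", "B", "C", "D"]  # hypothetical donor agencies
donors_pre = [
    [5.0, 5.2, 5.1, 5.4, 5.3, 5.6],
    [8.0, 7.9, 8.3, 8.1, 8.4, 8.2],
    [3.0, 3.1, 2.9, 3.3, 3.2, 3.0],
    [6.5, 6.1, 6.8, 6.2, 6.9, 6.4],
]
# Treated unit built as a mix of donors A and C, so a good fit exists.
treated_pre = [0.6 * a + 0.4 * c for a, c in zip(donors_pre[0], donors_pre[2])]

weights = scm_weights(treated_pre, donors_pre)
to_collect = [n for n, wj in zip(names, weights) if wj > 0.01]
print(dict(zip(names, [round(wj, 3) for wj in weights])))
print("collect post-intervention data for:", to_collect)
```

No post-intervention number enters the fit; post-period outcomes would only be needed later, for the donors in `to_collect`, to compute the counterfactual.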

Word of warning: you likely have high-dimensional data (many donors relative to the number of pre-intervention periods), and so classic SCM may match on noise. Look up -allsynth- by Justin Wiltshire.

My synthetic control command, still being developed, literally looks at it as a convex optimization problem solved with penalized regression, the LASSO or Ridge. Under the hood, it uses regression to simply predict the counterfactual. I haven't tested it yet, but it's plausible it would work with totally missing data in the donor units.
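As a generic illustration of the regression view (this is not the unreleased command described above, just a plain ridge regression written from scratch), the sketch below regresses the treated unit's pre-intervention outcomes on the donors' pre-intervention outcomes with an L2 penalty, then applies the fitted weights to donor post-intervention outcomes. All numbers, the penalty value `lam=1.0`, and the absence of an intercept are assumptions made for the example.

```python
# Generic ridge-regression counterfactual; NOT the unreleased command
# mentioned above. All numbers are invented for illustration.

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ridge_weights(treated_pre, donors_pre, lam):
    """Weights minimising the pre-period squared gap plus lam * sum(w_j^2)."""
    k = len(donors_pre)
    gram = [[sum(p * q for p, q in zip(donors_pre[i], donors_pre[j]))
             + (lam if i == j else 0.0) for j in range(k)] for i in range(k)]
    rhs = [sum(p * q for p, q in zip(donors_pre[i], treated_pre)) for i in range(k)]
    return solve(gram, rhs)

def predict(weights, donors):
    """Weighted combination of donor series, period by period."""
    n_periods = len(donors[0])
    return [sum(w * d[t] for w, d in zip(weights, donors)) for t in range(n_periods)]

# Hypothetical data: 3 donors, 4 pre-intervention and 2 post-intervention periods.
donors_pre = [[5.0, 5.2, 5.1, 5.4], [8.0, 7.9, 8.3, 8.1], [3.0, 3.1, 2.9, 3.3]]
donors_post = [[5.5, 5.7], [8.2, 8.5], [3.1, 3.4]]
treated_pre = [4.2, 4.3, 4.1, 4.6]

w = ridge_weights(treated_pre, donors_pre, lam=1.0)   # fitted on pre-period only
counterfactual = predict(w, donors_post)              # needs donor post-period data
print([round(x, 3) for x in w])
print([round(x, 3) for x in counterfactual])
```

Note that ridge typically gives every donor a nonzero weight, so on its own it would not shrink the list of agencies to collect data from; a LASSO penalty, which can set weights exactly to zero, is the variant that would.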
Stephen Jenkins

Join Date: Apr 2014

Posts: 1480
#4

16 Feb 2022, 02:29

FWIW, I found -allsynth- materials at https://justinwiltshire.com/research-1. One of my PhD students and I will be interested to learn about your command, Jared, when it's ready to go public. We'd also welcome information about any Stata implementations of 'synthetic difference-in-differences' (currently only in R, I think: https://github.com/synth-inference/synthdid).
Jared Greathouse

Join Date: Sep 2021

Posts: 2172
#5

29 Apr 2022, 20:15

Stephen Jenkins, Tom Scott: I've made quite a lot of progress on the synthetic control model I spoke of, and I've drafted an article for the Stata Journal describing it. I'd be more than happy to send it to you both if either of you would be interested in reading it. Also, SDID was recently developed.
Jared Greathouse

Join Date: Sep 2021

Posts: 2172
#6

05 Jun 2022, 06:30

Stephen Jenkins: While I'm not ready to make the code public just yet (one or two more features to program and other housekeeping matters), the paper describing the estimator is pretty much done.

Would you like to read it?
