Hi all. To preface, I'm currently using R rather than Stata, although my questions are primarily methodological. Please redirect me if this is the wrong place, and apologies if it's inappropriate!
I'm using a generalized synthetic control (GSC) to estimate the impact on employment of lowering the eligibility age for the living wage from 25 to 23 in 2021.
Aggregated from APS individual data, I have 26 treatment groups (one for each combination of 13 regions, 2 sexes and 1 age category (23-24)) and 104 control groups (13 regions, 2 sexes and 4 age categories spanning 25-64). I have 60 time periods (quarterly data from 2010 to 2025), of which 45 are pre-treatment periods. My covariates are employment rate (%), log average hourly wage, and bite (the share of workers affected by minimum wage increases, calculated yearly) (%).
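To make the panel layout concrete, here is a minimal base R sketch of the structure described above. Only the counts come from my data. The column names and control age-band labels are placeholders, and the calendar mapping (2010 Q1 to 2024 Q4, with 2021 Q2 as the first treated quarter) is an assumption chosen to match 60 periods and 45 pre-treatment periods.

```r
# Hypothetical names throughout; only the counts come from the description above.
regions   <- paste0("region_", 1:13)
sexes     <- c("F", "M")
# One treated band (23-24) plus four control bands covering ages 25-64.
age_bands <- c("23-24", paste0("control_band_", 1:4))

# One row per group: 13 regions x 2 sexes x 5 age bands = 130 units.
units <- expand.grid(region = regions, sex = sexes, age_band = age_bands,
                     stringsAsFactors = FALSE)
units$unit_id       <- seq_len(nrow(units))
units$treated_group <- units$age_band == "23-24"

# Assumption: the 60 quarters run from 2010 Q1 to 2024 Q4.
quarters <- data.frame(t    = 1:60,
                       year = rep(2010:2024, each = 4),
                       qtr  = rep(1:4, times = 15))

# Balanced panel: cross join of units and quarters.
panel <- merge(units, quarters, by = NULL)

# Assumption: the first treated quarter is 2021 Q2.
first_treated_t <- quarters$t[quarters$year == 2021 & quarters$qtr == 2]

# Treatment indicator: 1 for treated groups from the first treated quarter on.
panel$D <- as.integer(panel$treated_group & panel$t >= first_treated_t)

n_treated <- sum(units$treated_group)
n_control <- sum(!units$treated_group)
n_pre     <- first_treated_t - 1
```

Running the same counts on the real data (treated units, control units, and pre-treatment periods per treated unit with non-missing outcome and covariates) is a quick way to check that the treatment indicator is coded as intended.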
When I run gsynth with cross-validation, it says:
"Cross validation cannot be performed since available pre-treatment records of treated units are too few. So set r.cv = 0. Parametric Bootstrap"
and I get a counterfactual plot that looks like this:
I've tried manually setting the number of factors to each value from 1 to 5, and the plot doesn't look noticeably different. As far as I can tell, my treatment groups are not at the extremes of the trend data:

My questions are:
1) What methods can I use to improve the pre-treatment fit, and why has cross-validation failed?
2) Is this a weak application of the synthetic control method?
I have read the 2023 Statalist thread "Unparallel pre-intervention trends in synthetic control" but don't think I share any of the same concerns. I'm interested in the method proposed by Jeff Wooldridge but would need more information on how to actually implement it.
I'd be grateful for any help or ideas at all, as I'm fairly new to econometrics.
Thanks, Moses