Hey everyone. I would appreciate some assistance with writing the calculation stage of an algorithm that will be quite useful to Stata users. In a recent paper, some authors
"repeatedly compute the pairwise standard deviation of the difference in outcome values for each treated unit i paired against a control unit. This exercise is conducted for each time period t across all periods prior to the treatment. We then sort the computed standard deviation vector and seek a group of minimum values before treatment year T such that [we] minimise the amount of variation or dispersion in the difference in outcome values prior to treatment"
(see equation 5, page 24).
Alright, enough abstraction. Let's put some meat on these bones with the Basque dataset. Here, we have one treated unit (the Basque Country) and 16 donor pool units. The unit I calculate the score for below is Catalunya (unit 10), an autonomous community very similar to the Basque Country.
I think this is the basic code for it, but even if so, how would I extend it to the rest of the 16 control units (i.e., gdpcap1-gdpcap17)? My goal is to create a score for each control unit. How might I begin tackling this challenge?
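For reference, here is how I read the score in the quoted description, assuming it is the ordinary sample standard deviation of the pre-treatment gap between the treated unit i and a control unit j. The symbols y (the outcome), j, the mean gap, and T_0 (the number of pre-treatment periods) are my own notation, and equation 5 in the paper may be written differently:

```latex
\[
\bar{d}_{ij} = \frac{1}{T_0} \sum_{t < T} \left( y_{it} - y_{jt} \right),
\qquad
s_{ij} = \sqrt{ \frac{1}{T_0 - 1} \sum_{t < T} \left( y_{it} - y_{jt} - \bar{d}_{ij} \right)^{2} }
\]
```

Under this reading, the controls with the smallest s_ij would be the ones kept in the donor pool.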
Code:
clear *
u "http://fmwww.bc.edu/repec/bocode/s/scul_basque.dta", clear
qui xtset
local lbl: value label `r(panelvar)'
loc unit ="Basque Country (Pais Vasco)":`lbl'
loc int_time = 1975
qui xtset
cls
g treat = cond(`r(panelvar)'==`unit' & `r(timevar)' >= `int_time',1,0)
keep treat gdp year id
drop treat
cls
reshape wide gdpcap, j(id) i(year)
order gdpcap`unit', a(year)
/* Donor Selection Algorithm */
keep gdpcap5 gdpcap10 year
g diff = gdpcap5-gdpcap10 if year < 1975
g diffsq = (gdpcap5-gdpcap10)^2 if year < 1975
egen sumdiff = total(diff-diffsq) if year < 1975
g sdscore10 = sqrt(sumdiff/year) if year < 1975
cls
l if year < 1975
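A minimal sketch of how the loop over every donor might look. It assumes the score is simply the sample standard deviation of the pre-treatment gap (what `summarize` returns in `r(sd)`), which may not match the paper's equation 5 exactly. The names `donor`, `sdscore`, `scores`, and `results` are placeholders of mine, and the treated id of 5 is taken from the code above.

```stata
clear *
use "http://fmwww.bc.edu/repec/bocode/s/scul_basque.dta", clear
keep gdpcap year id
reshape wide gdpcap, i(year) j(id)

local treated  = 5       // Basque Country id, as in the code above
local int_time = 1975    // first treated year

* Results are posted to a temporary dataset, one row per donor
tempname scores
tempfile results
postfile `scores' int donor double sdscore using `results'

foreach v of varlist gdpcap* {
    local j = substr("`v'", 7, .)    // id suffix after "gdpcap"
    if `j' == `treated' {
        continue                     // skip the treated unit itself
    }
    tempvar diff
    * Pre-treatment gap between the treated unit and donor j
    quietly generate double `diff' = gdpcap`treated' - `v' if year < `int_time'
    quietly summarize `diff'
    * Assumption: score = sample SD (N-1 divisor) of the gap
    post `scores' (`j') (r(sd))
    drop `diff'
}
postclose `scores'

* One row per donor, smallest dispersion first
use `results', clear
sort sdscore
list, noobs
```

Posting to a separate results file keeps one row per donor, so the scores can be sorted directly instead of being spread across 16 new columns in the wide data.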