Hey everyone. More matrix manipulation today.
After much work, I've finally managed to extract the betas returned by cvlasso (part of lassopack on SSC), which you'll need installed to follow my code. The significance is this: the code constructs a synthetic Basque Country that did not experience a wave of terrorism post-1975. The LASSO selects the donor units (the predictors that receive nonzero coefficients) and stores their coefficients in a matrix. Luckily, all of this is background knowledge and not really necessary. Here's precisely what I want: for myself and for the users of my command, I want to be able to see the exact units LASSO uses to construct the synthetic unit, preferably in matrix form. In this case, the units selected are Cataluña, Madrid, Principado De Asturias, and La Rioja, and my command keys on the unique ID we've assigned to each of them in our dataset.
The first code block in this post gets us partway there (note that I use greshape from SSC, but it isn't required; the built-in reshape works too).
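My desired result is something like the second block: a single Unit_Weight column whose rows are labeled with the region names (Asturias, Cataluna, Madrid, Rioja).
We already have the beta matrix, e(beta), shown in the third block: a 1 x 5 row vector with columns gdpcap3, gdpcap10, gdpcap14, gdpcap17, and _cons.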
My first instinct was to create and reshape a temporary dataset via svmat, but this won't work because of the _cons column. Furthermore, Nick Cox once (rather humorously) said that svmat and working with it would "be like adding an engine to a donkey" and that it quite generally shouldn't be used.
So to summarize, my question is this: I'd like to create a matrix containing only the selected units from the e(beta) matrix, where each row name is the selected unit/covariate (e.g., GDP: Asturias) and the column holds the LASSO coefficients/weights. I don't want users to simply see gdpcap3, gdpcap10, and so on; I want them to see the precise units contributing to the synthetic control, and a matrix like my desired result would be a good way of doing that. How might I begin this?
Code:
clear *
qui {
    *import delim "https://raw.githubusercontent.com/SucreRouge/synth_control/master/basque.csv", clear
    u "http://econ.korea.ac.kr/~chirokhan/panelbook/data/basque-clean.dta", clear
    replace regionname = "Asturias" if regionname=="Principado De Asturias"
    loc int_time = 1975
    loc lambda lopt
    //sysuse basque, clear
    g treated = cond(regionno==17 & year >= `int_time',1,0)
    labvars year gdpcap "Year" "ln(GDP per 100,000)"
    replace regionname = trim(regexr(regionname,"\(.+\) *",""))
    egen id = group(regionname), label(regionname) // makes a unique ID
    order id, b(year)
    *keep if year >= 1960
    drop if inlist(id,18) //12
    keep gdp id year
    xtset id year, y
    cls
    preserve
    greshape wide gdp, j(id) i( year)
    tsset year, y
    order gdpcap5, a(year)
    qui cvlasso gdpcap5 gdpcap1-gdpcap17 if year < `int_time', h(1) roll postres
    qui cvlasso, `lambda' postres // get weights
    qui predict cf, xb
    keep year gdpcap5 cf
    greshape long gdpcap, i(year) j(id)
    sa Basque, replace
    restore
}
mat l e(beta)
Code (desired result):
----------------------------------
               Co_No | Unit_Weight
---------------------+------------
            Asturias |  .10749576
            Cataluna |  .58400395
              Madrid |    .030129
               Rioja |  .24480456
----------------------------------
Code (current output):
mat l e(beta)

e(beta)[1,5]
           gdpcap3   gdpcap10   gdpcap14   gdpcap17      _cons
gdpcap5  .10749576  .58400395    .030129  .24480456  .74178677
Quote (Nick Cox, on svmat): "be like adding an engine to a donkey"
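As a possible starting point, here is a minimal sketch of one way to go from e(beta) to a labeled weight matrix without svmat. It rests on several assumptions: e(beta) from cvlasso is still in memory, the long panel with the value-labeled id variable has been restored, and every predictor follows the gdpcap# naming pattern. The matrix name W is hypothetical. The sketch drops _cons from the column names, looks up each remaining ID in the value label attached to id, and uses the resulting region names as row names.

```stata
* Sketch only: assumes e(beta) from cvlasso is still in memory, that the
* long panel (with the value-labeled -id- variable) has been restored, and
* that every predictor is named gdpcap#. The matrix name W is hypothetical.
tempname b
matrix `b' = e(beta)

* Column names of e(beta), minus the intercept
local cols : colnames `b'
local cols : subinstr local cols "_cons" "", word
local k : word count `cols'

matrix W = J(`k', 1, .)
local lbl : value label id                        // value label attached to -id-
local rnames

forvalues i = 1/`k' {
    local v : word `i' of `cols'
    matrix W[`i', 1] = `b'[1, colnumb(`b', "`v'")]
    local num = subinstr("`v'", "gdpcap", "", 1)  // gdpcap10 -> 10
    local nm : label `lbl' `num'                  // 10 -> region name
    local nm = strtoname("`nm'")                  // make it a legal row name
    local rnames `rnames' `nm'
}

matrix rownames W = `rnames'
matrix colnames W = Unit_Weight
matrix list W
```

The strtoname() call is there because region names containing spaces or punctuation may not be accepted as matrix row names; it replaces such characters with underscores. If no units are selected, the J() call would need a guard, which this sketch leaves out.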