  • Bootstrapping a 3-step regression program - insufficient observation error on later iterations

    I wrote a program that: (1) runs LASSO (with the elasticregress command) on a set of 52 instruments (with prefix psc_) to select the best ones; (2) regresses the endogenous variable capgw_dist_cum on the selected instruments (first stage) with reghdfe, using district, month, and state-year fixed effects; and (3) regresses the dependent variable of interest on the predicted values from (2) with reghdfe and the same fixed effects. I ran the program manually a few times and it works fine, but when I bootstrap it, I get the error "insufficient observations to compute bootstrap standard errors no results will be saved" around the 10th draw (earlier iterations are fine). Note that I eststo the bootstrap because I want to store and tabulate the final regression (second stage) with bootstrapped standard errors later on. Why does it start off working and then break down from insufficient observations? Is it a problem with my program or my data structure? Thanks


    Code:
    capture program drop ivreg_lasso
    program define ivreg_lasso
        syntax varlist(numeric ts fv)

        // 1. Parse Equation
        ** Dependent Var
        local depvar `1'
        di "`depvar'"
        ** Controls
        local xcontrols : list varlist - depvar
        di "`xcontrols'"

        // 2. Zero Stage
        ** Residualize on FE
        reghdfe capgw_dist_cum capgw_state_cum_pred, ///
            a(c_code_2001_num month state_code_2001_num#year) ///
            vce(cl state_code_2001_num#year) residuals(resid_capgw_dist_cum)
        ** Elastic Net (LASSO)
        elasticregress resid_capgw_dist_cum psc_*
        ** Collect Model Vars
        file open varlistfile using "${DATA}/instrument/varlist_nonzero.txt", write replace
        file write varlistfile "`e(varlist_nonzero)'"
        file close varlistfile
        file open myfile using "${DATA}/instrument/varlist_nonzero.txt", read
        file read myfile model_vars
        file close myfile
        di "`model_vars'"

        // 3. First Stage
        ** First Stage
        reghdfe capgw_dist_cum `model_vars' `xcontrols', ///
            a(c_code_2001_num month state_code_2001_num#year, savefe) ///
            vce(cl state_code_2001_num#year, suite(mwc)) old
        ** Predicted Values
        ren capgw_dist_cum capgw_dist_cum_o
        predict capgw_dist_cum, xbd
        la var capgw_dist_cum "Cum. District Capacity (GW)"

        // 4. Second Stage
        eststo: reghdfe `depvar' capgw_dist_cum `xcontrols', ///
            a(c_code_2001_num month state_code_2001_num#year) ///
            vce(cl state_code_2001_num#year) old
        test _b[capgw_dist_cum] = 0
        estadd scalar fsf `e(F)', replace
        estadd scalar p = r(p), replace
        *estadd scalar jp `e(jp)', replace
        estadd scalar nclust `e(N_clust)'
        estadd local dist_fe "$\checkmark$"
        estadd local month_fe "$\checkmark$"
        estadd local st_y_fe "$\checkmark$"
        estadd local clust "S $\times$ Y"
        estadd local weather "$\checkmark$"

        drop capgw_dist_cum resid_capgw_dist_cum
        ren capgw_dist_cum_o capgw_dist_cum
    end

    local weather temperature_mean rainfall_mean
    local hh_controls bpl clean_cookstove pucca lighting_elec
    set seed 99999
    eststo: bootstrap, rep(100) nodrop : ///
        ivreg_lasso y_var `hh_controls' age_ew mother_educated religion_* caste_* `weather' capgw_state_cum_pred
    Last edited by Raahil Madhok; 21 Jan 2019, 12:39.

  • #2
    Update: I did some troubleshooting to narrow down the issue. I ran ivreg_lasso on its own multiple times and it works fine. I then ran it with bootstrap, 5 reps, and the noisily option, and each part of the program runs perfectly fine -- only after the last iteration does it break down and give the insufficient-observations error. In other words, the program runs fine on each bootstrap sample, but the bootstrapped standard errors are not calculated across these samples in the final step. Any idea what's going on here? Thanks
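
    For reference, the diagnostic run described above looks roughly like this (a sketch reusing the locals from the code in #1; reps(5), noisily, and nodrop are standard bootstrap options, and set trace is optional):

    Code:
    * re-run a handful of replications verbosely to see exactly where it breaks
    local weather temperature_mean rainfall_mean
    local hh_controls bpl clean_cookstove pucca lighting_elec
    set seed 99999
    * set trace on   // optional: echo every command inside the program
    bootstrap, reps(5) noisily nodrop : ///
        ivreg_lasso y_var `hh_controls' age_ew mother_educated religion_* caste_* `weather' capgw_state_cum_pred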

    • #3
      It does not sound to me like the bootstrap has succeeded on 5 rounds. To me, the error you receive sounds like the programme executes once and then fails on the second round. Technically, I think you can calculate a standard error from 2 observations.

      If you want, paste exactly what Stata returns when you execute the bootstrap.

      • #4
        This is not a direct answer to your question, but it might be of interest anyway:

        If you want to select instruments using the lasso, you can also try the program ivlasso, which is part of the pdslasso package (which I co-authored with Mark Schaffer and Chris Hansen). We have implemented the IV-Lasso approach of Belloni, Chen, Chernozhukov & Hansen (2012, Econometrica, https://doi.org/10.3982/ECTA9626). Since the approach relies on theory-driven penalization (rather than cross-validation, as you are doing), there is no need for bootstrapping standard errors. The approach also works for panel data.

        If you indeed want to use lasso with cross-validation, you might also want to consider the sample splitting approach in Chernozhukov et al (2018, The Econometrics Journal, https://doi.org/10.1111/ectj.12097).

        For more info, check the help file of ivlasso, and https://statalasso.github.io/
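
        For orientation, a call has roughly this shape (only a sketch reusing the variable names from #1; the district, month, and state-year fixed effects would have to be handled separately, e.g. via the fe option after xtset or as dummies passed to partial(), so check the help file for the exact syntax and options):

        Code:
        * one-time installation from SSC
        * ssc install pdslasso

        * IV-Lasso: rigorous-penalty selection among the psc_* instruments
        * for the endogenous regressor capgw_dist_cum
        ivlasso y_var (temperature_mean rainfall_mean) ///
            (capgw_dist_cum = psc_*), ///
            partial(temperature_mean rainfall_mean) ///
            cluster(state_code_2001_num) first idstats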

        • #5
          Thanks Achim. I was actually reading this paper and looking at your associated Stata package this week, and was going to email you anyway! Is there a way to use the elastic net in the ivlasso command? My set of instruments is highly correlated within groups, and I have read that the elastic net is preferred in this case. I read through the help file, but it seems only rlasso options can be called. Is there a way to set lasso2 options so I can set the elastic net alpha-parameter?

          • #6
            There is currently no way of using the elastic net with ivlasso. ivlasso relies on the rigorous ("theory-driven") penalization approach (used in rlasso), and the theory of rigorous penalization has not been developed for the elastic net (to my knowledge anyway).

            While you are right that the elastic net is generally preferred in the presence of highly correlated predictors (here: instruments), I would expect the rigorous lasso to perform well in this setting despite highly correlated instruments.

            Suppose you have two relevant instruments, z1 and z2, which are highly correlated, plus a bunch of irrelevant instruments. The lasso will tend to pick only one of the two, z1 or z2. But that might not be so bad. You don't really care about interpreting the effect of z1 and z2 on your endogenous regressor in a causal way. You only want to predict your optimal instrument well, and since z1 and z2 contain virtually the same information, one of them might well be enough.

            Also note that the conditions for the rigorous lasso to work (restricted eigenvalue condition) allow for a high degree of correlation among predictors.
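
            For the record, a common statement of a restricted eigenvalue condition is roughly the following (a sketch; the exact conditions used in the rigorous-lasso theory differ in the details). For the true support $T$ and some constant $\bar{c} \geq 1$,

            $$
            \kappa \;=\; \min_{\delta \neq 0,\ \|\delta_{T^c}\|_1 \leq \bar{c}\,\|\delta_T\|_1} \; \frac{\|X\delta\|_2}{\sqrt{n}\,\|\delta_T\|_2} \;>\; 0 ,
            $$

            i.e. a minimum-eigenvalue-type bound is required only over a restricted cone of directions, which is why it can hold even when pairwise correlations among the predictors are high.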

            If you want to use the elastic net for generating your instrument, I think you would have to use cross-validation and sample splitting as in the paper cited above.

            Is there a way to set lasso2 options so I can set the elastic net alpha-parameter?
            To set the elastic net parameter in lasso2, you simply use the alpha() option.
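
            For example (a sketch using the residualized variable from #1; alpha(0.5) is just an illustrative value, and the penalty level lambda would still need to be chosen, e.g. by cross-validation with cvlasso from the same package):

            Code:
            * elastic net over lasso2's default grid of lambda values;
            * alpha(1) is the lasso, alpha(0) is ridge, 0 < alpha < 1 mixes the two
            lasso2 resid_capgw_dist_cum psc_*, alpha(0.5)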

            • #7
              A small extra comment in addition to what Achim says - you want to predict your optimal instrument well but not "too well", in the sense that you don't want "too many" selected instruments. This would be a problem if you used cross-validation to select the elastic net tuning parameters without also using the sample-splitting method as described in the paper that Achim pointed to.

              • #8
                Thanks Mark and Achim. Sounds like ivlasso is the best way forward. What exactly do you mean (intuitively) by "theory-driven" penalization in this context? Pardon my lack of knowledge on the econometric theory behind this -- parts of the Belloni et al. (2012) paper are beyond me.

                I understand that cross-validation is often criticized because it is completely data-driven and not grounded in any theory. How does the approach embedded in rlasso differ?

                Thanks!

                • #9
                  Have a look at our working paper that explains the difference between cross-validation and rigorous ("theory-driven") penalization: http://ftp.iza.org/dp12081.pdf

                  Rigorous penalization sets the penalty level as the smallest value that guarantees that the noise of the problem (represented by the score vector 2/n X'e) is dominated. In a way, only predictors that add information (relative to the overall noise level) are included.
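
                  Concretely, in the homoskedastic case the penalty level takes roughly the form (a sketch; the constants and the heteroskedastic penalty loadings are spelled out in the working paper)

                  $$
                  \lambda \;=\; 2\,c\,\hat{\sigma}\,\sqrt{n}\;\Phi^{-1}\!\left(1-\frac{\gamma}{2p}\right),
                  $$

                  with a slack constant $c > 1$ (e.g. 1.1), a small significance level $\gamma$, $p$ the number of penalized predictors, and $\hat{\sigma}$ an estimate of the error standard deviation. The penalty is chosen so that, with probability at least $1-\gamma$, it dominates $c$ times the largest (standardized) element of the score vector $2/n\,X'e$.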

                  • #10
                    There's also some discussion of cross-validation in the paper, if you're interested. The bit on the theoretical properties of CV is rather brief, though, so if you want more on that, the survey by Arlot and Celisse (2010), referenced in the paper, is very readable.
