Bootstrapping a user-written program

Briana Sullivan

Join Date: Mar 2016
Posts: 3

Bootstrapping a user-written program

07 Mar 2016, 23:47

Hello,

Using Stata 13.1, I am running a Heckman two-step regression. I am bootstrapping the difference of two fitted values. (The original data has synthetic set to 1. I then collapse the data by sex and set synthetic equal to 3. I am finding the difference between the fitted values with an associated synthetic value of 3). When I run the bootstrap command I receive the following error message: "insufficient observations to compute bootstrap standard errors." The program works when I run just the OLS (or second stage regression) and crashes once I add the probit. Additionally, the program itself works and can be run multiple times without generating errors.

Because I wanted to make the code more accessible, I made a version using auto.dta. Surprisingly, I am able to bootstrap the program; yet I am not able to distinguish the differences in the code that may be driving the errors in my initial program.

Code:

use MR_work/mr_datasets/temp_all.dta, clear
* regressors*
global regs1 "widowed divorced separated never_married midwest south west hsd08 hsd911 hsg cg ad pot_exp1 pot_exp2 pot_exp3 pot_exp4 hsd08_pot_exp1 hsd08_pot_exp2 hsd08_pot_exp3 hsd08_pot_exp4 hsd911_pot_exp1 hsd911_pot_exp2 hsd911_pot_exp3 hsd911_pot_exp4 hsg_pot_exp1 hsg_pot_exp2 hsg_pot_exp3 hsg_pot_exp4 cg_pot_exp1 cg_pot_exp2 cg_pot_exp3 cg_pot_exp4 ad_pot_exp1 ad_pot_exp2 ad_pot_exp3 ad_pot_exp4"
global probit_regs " __child06 child06_widowed child06_divorced child06_separated child06_never_married"
global regs2 "sex*"

**************************
*Define bootstrap program*
**************************
capture prog drop bias70s90s
prog define bias70s90s, rclass
set more off

append using MR_work/mr_datasets/tempV2.dta //add 2 observations with synthetic==3
probit ftfy $regs1 $probit_regs if sex==1 & timepd==0 & synthetic==1 [pweight=wgt]
tempvar lambda_input lambda_female
predict `lambda_input' if sex==1, xb
gen `lambda_female' = normalden(-`lambda_input') / (1 - normal(-`lambda_input')) if timepd==0
replace `lambda_female'=0 if sex==0
replace `lambda_female'=0 if synthetic>1

reg log_hourly_wage $regs1 $regs2 `lambda_female' if timepd==0 & wagesmpl==1 [pw=wgt], robust
tempvar xb
predict `xb'
sum `xb' if sex==1 & synthetic==3
local m1=r(mean)
sum `xb' if sex==0 & synthetic==3
local m2=r(mean)
return scalar ols0_fix=`m1'-`m2'


drop if synthetic>1
end

** end bootstrap loop


*Run bootstrap command*
bootstrap r(ols0_fix), reps(10) seed(123) strata(timepd): bias70s90s

Code:

sysuse auto, clear
gen synthetic=1
global regs1 "mpg"
global regs2 "weight"
capture prog drop bias70s90s

**************************
*Define bootstrap program*
**************************
prog define bias70s90s, rclass
set more off

append using "C:\Users\Briana\Desktop\Temporary\auto_temp.dta" //created by collapse, by(foreign)
probit foreign $regs2 if synthetic==1
tempvar lambda_input lambda_female
predict `lambda_input', xb
gen `lambda_female' = normalden(-`lambda_input') / (1 - normal(-`lambda_input'))
replace `lambda_female'=0 if synthetic==2

reg displacement $regs1 `lambda_female' if synthetic==1, robust
tempvar xb
predict `xb'

sum `xb' if foreign==1 & synthetic==2
local m1=r(mean)
sum `xb' if foreign==0 & synthetic==2
local m2=r(mean)
return scalar ols0_fix=`m1'-`m2'

drop if synthetic>1
end

** end bootstrap loop


*Run bootstrap command*
bootstrap r(ols0_fix), reps(10) seed(123): bias70s90s

Thank you!

Last edited by Briana Sullivan; 08 Mar 2016, 00:21.

Tags: None

Damian Tojeiro

Join Date: Mar 2016

Posts: 86
#2

08 Mar 2016, 02:09

Hi Briana, I have almost the same problem, but still not fixed. Really hope that someone can help you and see if that allows me to take advantage and fix mine too. In my code I think the problems comes from the subsample in the main regression, I have tryied without the subsample and then it works. But I need the main equation and also the lambdas for the subsample, so.... Lets see if someone help us on this. Only for if you want to take a look: http://www.statalist.org/forums/foru...eckman-process
Comment
Briana Sullivan

Join Date: Mar 2016

Posts: 3
#3

08 Mar 2016, 22:33

I wrote my own bootstrap, and I am able to successfully run the code. I used: http://www.ats.ucla.edu/stat/stata/faq/ownboot.htm as a template.
Hope it helps!
Comment
Damian Tojeiro

Join Date: Mar 2016

Posts: 86
#4

09 Mar 2016, 03:08

Thanks Briana, I have been trying with that page you say me, but it seems that I´m not too smart with Stata because I´m unable to extrapolate that into my program. Can you put an example of your code now that you say is working? That would help a lot.
Thanks in advance.
Comment

Briana Sullivan

Join Date: Mar 2016
Posts: 3

10 Mar 2016, 00:56

Hi Damian,

Below is a version of my code. I've tried to add some explanations within the code itself.

Code:

use MR_work/mr_datasets/temp_all.dta, clear


*STEP 1 - Run the regressions and save the results (this seems to be a necessary step)
quietly{
append using MR_work/mr_datasets/tempV2.dta

forval i=0/1{
probit ftfy $regs1 $probit_regs if sex==1 & timepd==`i' & synthetic==1 [pweight=wgt] //My regressors are globals
cap drop lambda_input lambda_female
predict lambda_input if sex==1, xb
gen lambda_female = normalden(-lambda_input) / (1 - normal(-lambda_input)) if timepd==`i'
replace lambda_female=0 if sex==0
replace lambda_female=0 if synthetic>1

reg log_hourly_wage $regs1_nomarst $regs2_nomarst lambda_female if timepd==`i' & wagesmpl==1 [pw=wgt], robust
cap drop xb
predict xb

sum xb if sex==1 & synthetic==3
scalar m1=r(mean)
sum xb if sex==0 & synthetic==3
scalar m2=r(mean)
scalar nomarst`i'_heck=m1-m2 //wage difference calculated using fixed synthetics
}
}
matrix diff=( nomarst0_heck, nomarst1_heck)
matrix list diff
drop if synthetic>1


*STEP 2 - Create the program
cap program drop myboot
program define myboot, rclass
    preserve //preserve the data
    bsample, strata(timepd) //samples the data in memory with replacement
*Repeat the regressions from above, using tempvars and return scalar in the program
append using MR_work/mr_datasets/tempV2.dta


forval i=0/1{
probit ftfy $regs1 $probit_regs if sex==1 & timepd==`i' & synthetic==1 [pweight=wgt]
tempvar lambda_input lambda_female
predict `lambda_input' if sex==1, xb
gen `lambda_female' = normalden(-`lambda_input') / (1 - normal(-`lambda_input')) if timepd==`i'
replace `lambda_female'=0 if sex==0
replace `lambda_female'=0 if synthetic>1


reg log_hourly_wage $regs1_nomarst $regs2_nomarst `lambda_female' if timepd==`i' & wagesmpl==1 [pw=wgt], robust
tempvar xb
predict `xb'

*I gave the scalars different names compared to step 1
sum `xb' if sex==1 & synthetic==3
local m1=r(mean)
sum `xb' if sex==0 & synthetic==3
local m2=r(mean)
local nomarst0`i'_heck=`m1'-`m2'
return scalar nomarst0`i'_heck=`m1'-`m2' //wage difference calculated using fixed synthetics
}

drop if synthetic>1
    restore //return data to original state
end

*STEP 3 - use the scalars that are created by the program
simulate r(nomarst00_heck) r(nomarst01_heck), reps(1000) seed(123): myboot


*STEP 4 - use results stored in Step 1. If you have more than one statistic that you're bootstrapping you'll need to make a matrix of scalars in step one (this is how I created diff)
bstat, stat(diff)

Comment

Damian Tojeiro

Join Date: Mar 2016

Posts: 86
#6

10 Mar 2016, 01:50

Hi Briana, thanks a lot for your help, I will start right now with this. So, first we need to generate the estimations outside the program and store the coefficients (in my case) in a matrix, and then do the same inside the program and return the coefficients in scalar. After that I simulate the scalars (my coefficients) in the program, and finally use the results store in the matrix created in the first step. Ok, I will try right now.
Again, thanks a lot for your help.
Comment
Dereje Regasa

Join Date: Jun 2016

Posts: 3
#7

27 Sep 2016, 19:37

Dear Statalisters,
I am using fractional logit (in the first stage regression) in Stata 14 to estimate the predicted values of (X1) and the second stage would be OLS. Anyone who can suggest me how to write my bootstrap program for standard errors.
Thanks
Comment

Announcement

Bootstrapping a user-written program

Comment

Comment

Comment

Comment

Comment

Comment