bootstrap command: insufficient observations to compute bootstrap standard errors, no results will be saved

carmen armas

Join Date: Jun 2015

Posts: 6
#1


26 May 2016, 16:14

Hello,
I wrote the following code.

1) I create the data set:
clear all
set obs 1000
set seed 12
gen x=runiform()
gen z=runiform()
gen e=rnormal(0,1)
gen y=0.5*x+e
replace x=. in 1/500
replace z=. in -500/-1
2) I define this program:
capture noisily program drop sim
program define sim, rclass
reg y x
local media1=_b[x]
reg y z
local media2=_b[z]
return scalar media = `media1' + `media2'
end
3) I run the bootstrap command:
bootstrap mean=r(media), reps(100): sim

But I got this error:

"insufficient observations to compute bootstrap standard errors
no results will be saved"

I don't know whether I can use the bootstrap command with variables that have no observations in common, like x and z.
I set up the example this way because it reproduces the problem I have with my real data, and I need to know whether this is possible.
If somebody could help me, I would appreciate it.
Thanks!

Clyde Schechter

Join Date: Apr 2014
Posts: 30095

26 May 2016, 16:42

Yes, this is an obscure problem that has tripped up even some of the most senior participants in this Forum.

Your program sim selects a subset of the observations for its regressions. For reasons that I do not understand, when that happens inside of bootstrap, that selection process ends up being retained for the entire execution. And the problem is that once the sample for -reg y x- is selected, given the way your data are constructed, the same sample is used for -reg y z-, which is bad because z is missing whenever x is not. This behavior of -bootstrap- can be suppressed with its -nodrop- option.

Code:

. clear*

. set obs 1000
number of observations (_N) was 0, now 1,000

. set seed 12

. gen x=runiform()

. gen z=runiform()

. gen e=rnormal(0,1)

. gen y=0.5*x+e

. replace x=. in 1/500
(500 real changes made, 500 to missing)

. replace z=. in -500/-1
(500 real changes made, 500 to missing)

. 
. capture noisily program drop sim
program sim not found

. program define sim, rclass
  1. reg y x
  2. local media1=_b[x]
  3. reg y z
  4. local media2=_b[z]
  5. return scalar media = `media1' + `media2'
  6. end

. 
. 
. bootstrap mean=r(media), reps(100) nodrop: sim
(running sim on estimation sample)

Bootstrap replications (100)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50
..................................................   100

Bootstrap results                               Number of obs     =      1,000
                                                Replications      =        100

      command:  sim
         mean:  r(media)

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        mean |   .3479952   .2404872     1.45   0.148    -.1233511    .8193415
------------------------------------------------------------------------------

In the future, please post all code and output in a code block, as I have done here. It makes things easier to read. For instructions on this and other aspects of effective posting, please read FAQ #12.


Steve Samuels

Join Date: Mar 2014

Posts: 1786
#3

26 May 2016, 18:09

I was bitten by this as described here, and Isabelle Canette of StataCorp explained that the problem of dropped observations occurs only with estimation (e-class) commands and is cured by adding -nodrop-, as Clyde demonstrates. I think it has to do with the e(sample) function, which identifies the estimation sample. Although Carmen's program is declared rclass, it contains an estimation command: -regress-. Replace the -regress- calls with an r-class command like -corr-, and the bootstrap would have run without problem.

Code:

program define sim, rclass
    corr y x
    local media1 = r(rho)
    corr y z
    local media2 = r(rho)
    return scalar media = `media1' + `media2'
end
bootstrap mean=r(media), reps(100): sim
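
To see the two estimation samples directly, here is a minimal sketch that rebuilds the data from #1 and counts what e(sample) marks after each -regress-. The counts follow from how x and z were set to missing. How -bootstrap- uses e(sample) internally is an assumption based on the explanation above, not something this check proves.

```stata
* Rebuild the example data from post #1
clear all
set obs 1000
set seed 12
gen x = runiform()
gen z = runiform()
gen e = rnormal(0,1)
gen y = 0.5*x + e
replace x = . in 1/500
replace z = . in -500/-1

* Each regression has its own estimation sample
quietly regress y x
count if e(sample)                  // 500 observations, those with x nonmissing
quietly regress y z
count if e(sample)                  // 500 observations, those with z nonmissing

* No observation is usable by both regressions
count if !missing(x) & !missing(z)  // 0
```

Because no single observation has both x and z, any step that restricts the data to one regression's sample leaves the other regression with nothing to estimate on.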

Last edited by Steve Samuels; 26 May 2016, 18:22.

Steve Samuels
Statistical Consulting

Stata 14.2
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2402
#4

21 Mar 2018, 11:56

I recently ran into this issue with a simple rclass program to bootstrap the c-statistic from a logistic regression model. Two ways around it were:

1) Perform the logistic regression calculation of the linear predictor manually (if relying on previously obtained estimates that are not computed during the bootstrap program)
2) Clear estimation results using -- ereturn clear --
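
As a rough illustration of option 2, here is the program from #1 with -ereturn clear- added before it exits. This is only a sketch: the name sim2 is made up for this example, and it assumes, as suggested earlier in the thread, that -bootstrap- drops observations based on e(sample) left behind by the last estimation command.

```stata
* Rebuild the example data from post #1
clear all
set obs 1000
set seed 12
gen x = runiform()
gen z = runiform()
gen e = rnormal(0,1)
gen y = 0.5*x + e
replace x = . in 1/500
replace z = . in -500/-1

* sim2 is a hypothetical name; same logic as sim in post #1
capture program drop sim2
program define sim2, rclass
    regress y x
    local media1 = _b[x]
    regress y z
    local media2 = _b[z]
    ereturn clear    // discard e() results, including e(sample)
    return scalar media = `media1' + `media2'
end

* No -nodrop- here: with e(sample) cleared, nothing should be dropped
bootstrap mean=r(media), reps(100): sim2
```

The coefficients are copied into locals before -ereturn clear-, so the returned scalar does not depend on e() still being available.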
Samuel Arispe

Join Date: Sep 2021

Posts: 1
#5

09 Sep 2021, 12:59

Hi Clyde Schechter, I tried what you suggested, but I still have the same problem.

My command is the following:

reg var1 country_pol_trust_mean p_2country_pol_trust_mean ///
`controls2' i.time_d i.country_region [pw=reweight_region], vce(bootstrap, ///
cl(country_region) reps(10) noisily nodrop force seed(101010))

I think this happens because of the number of regions in my data (282), which I want to capture with fixed effects, but I am not sure.
Could you help me, please?
Rolando Gonzales

Join Date: Aug 2019

Posts: 13
#6

24 Nov 2022, 11:15

I am arriving 6 years late to the party but in my case I found a simple solution that works by using preserve and restore:

Code:

program nameofprogramhere
    preserve
    ....
    reg y X ...
    ...
    restore
end
bootstrap, reps(100): nameofprogramhere

Does that make sense?
Abdullah Algarni

Join Date: Jul 2022

Posts: 66
#7

28 Mar 2023, 07:22

Originally posted by Leonardo Guizzetti View Post

I recently ran into this issue with a simple rclass program to bootstrap the c-statistic from a logistic regression model. Two ways around it were:

1) Perform the logistic regression calculation of the linear predictor manually (if relying on previously obtained estimates that are not computed during the bootstrap program)
2) Clear estimation results using -- ereturn clear --

That works for me!

Thank you

Sincere regards,
Abdullah Algarni
Johanna Krenz

Join Date: Oct 2022

Posts: 33
#8

16 May 2024, 09:02

Originally posted by Rolando Gonzales View Post

I am arriving 6 years late to the party but in my case I found a simple solution that works by using preserve and restore:

Code:

program nameofprogramhere
    preserve
    ....
    reg y X ...
    ...
    restore
end
bootstrap, reps(100): nameofprogramhere

Does that make sense?

This works for me too. Could anyone explain to me why? I.e. what problem is fixed with that?