
  • bootstrap command: insufficient observations to compute bootstrap standard errors, no results will be saved

    Hello,
    I wrote this program.

    1) I set up the dataset:
    clear all
    set obs 1000
    set seed 12
    gen x=runiform()
    gen z=runiform()
    gen e=rnormal(0,1)
    gen y=0.5*x+e
    replace x=. in 1/500
    replace z=. in -500/-1
    2) I wrote this program:
    capture noisily program drop sim
    program define sim, rclass
    reg y x
    local media1=_b[x]
    reg y z
    local media2=_b[z]
    return scalar media = `media1' + `media2'
    end
    3) I ran the bootstrap command:
    bootstrap mean=r(media), reps(100): sim

    But I got this error:

    "insufficient observations to compute bootstrap standard errors
    no results will be saved"

    I don't know whether I can use the bootstrap command with variables that have no observations in common, like x and z.
    I set it up this way because this is the problem I have with my real data, and I need to know whether it is possible.
    If somebody could help me, I would appreciate it.
    Thanks!

  • #2
    Yes, this is an obscure problem that has tripped up even some of the most senior participants in this Forum.

    Your program sim selects a subset of the observations for its regressions. For reasons that I do not understand, when that happens inside of bootstrap, that selection process ends up being retained for the entire execution. And the problem is that once the sample for -reg y x- is selected, given the way your data are constructed, the same sample is used for -reg y z-, which is bad because z is missing whenever x is not. This behavior of -bootstrap- can be suppressed with its -nodrop- option.

    Code:
    . clear*
    
    . set obs 1000
    number of observations (_N) was 0, now 1,000
    
    . set seed 12
    
    . gen x=runiform()
    
    . gen z=runiform()
    
    . gen e=rnormal(0,1)
    
    . gen y=0.5*x+e
    
    . replace x=. in 1/500
    (500 real changes made, 500 to missing)
    
    . replace z=. in -500/-1
    (500 real changes made, 500 to missing)
    
    . 
    . capture noisily program drop sim
    program sim not found
    
    . program define sim, rclass
      1. reg y x
      2. local media1=_b[x]
      3. reg y z
      4. local media2=_b[z]
      5. return scalar media = `media1' + `media2'
      6. end
    
    . 
    . 
    . bootstrap mean=r(media), reps(100) nodrop: sim
    (running sim on estimation sample)
    
    Bootstrap replications (100)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
    ..................................................    50
    ..................................................   100
    
    Bootstrap results                               Number of obs     =      1,000
                                                    Replications      =        100
    
          command:  sim
             mean:  r(media)
    
    ------------------------------------------------------------------------------
                 |   Observed   Bootstrap                         Normal-based
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            mean |   .3479952   .2404872     1.45   0.148    -.1233511    .8193415
    ------------------------------------------------------------------------------
    In the future, please post all code and output in a code block, as I have done here. It makes things easier to read. For instructions on this and other aspects of effective posting, please read FAQ #12.



    • #3
      I was bitten by this as described here, and Isabelle Canette of StataCorp explained that the problem of dropped observations occurs only with estimation (e-class) commands and is cured by adding -nodrop-, as Clyde demonstrates. I think it has to do with the e(sample) function, which identifies the estimation sample. Although Carmen's program is declared rclass, it runs an estimation command inside: -regress-. Replace the -regress- calls with an r-class command such as -corr- and the bootstrap would have run without problem.
      Code:
      program define sim, rclass
          corr y x
          local media1 = r(rho)
          corr y z
          local media2 = r(rho)
          return scalar media = `media1' + `media2'
      end
      bootstrap mean=r(media), reps(100): sim
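
      The e(sample) explanation can be checked directly. The following is only a sketch under assumptions: it rebuilds the original poster's data (with the error term renamed to u, a hypothetical name chosen so that it cannot be confused with the e() function), runs sim once on the full data, and then counts the observations flagged by e(sample). Since e(sample) is left behind by the last estimation command, -regress y z-, it should flag the 500 observations where z is non-missing, and none of them should have a non-missing x.

      ```stata
      clear all
      set obs 1000
      set seed 12
      gen x = runiform()
      gen z = runiform()
      gen u = rnormal(0,1)          // error term; named e in the original post
      gen y = 0.5*x + u
      replace x = . in 1/500
      replace z = . in -500/-1

      capture program drop sim
      program define sim, rclass
          regress y x
          local media1 = _b[x]
          regress y z
          local media2 = _b[z]
          return scalar media = `media1' + `media2'
      end

      * Run the program once on the full data
      quietly sim

      * e(sample) now belongs to the LAST estimation command (regress y z)
      count if e(sample)

      * How many of those observations could be used by -regress y x-?
      count if e(sample) & !missing(x)
      ```

      If the second count is zero, then restricting the data to e(sample) before resampling leaves -regress y x- with nothing to estimate, which would be consistent with the "insufficient observations" message.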
      Last edited by Steve Samuels; 26 May 2016, 18:22.
      Steve Samuels
      Statistical Consulting

      Stata 14.2



      • #4
        I recently ran into this issue with a simple rclass program to bootstrap the c-statistic from a logistic regression model. Two ways around it were:

        1) Perform the logistic regression calculation of the linear predictor manually (if relying on previously obtained estimates that are not computed during the bootstrap program)
        2) Clear estimation results using -- ereturn clear --
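
        A minimal sketch of option 2, applied to the sim program from the first post rather than to my logistic model. This is an illustration under assumptions, not a tested recipe: the idea is that once the coefficients have been copied into locals, -ereturn clear- removes the e() results (including e(sample)), so -bootstrap- should have no estimation sample to restrict the data to. The error term is renamed to u here, a hypothetical name.

        ```stata
        clear all
        set obs 1000
        set seed 12
        gen x = runiform()
        gen z = runiform()
        gen u = rnormal(0,1)          // error term; named e in the first post
        gen y = 0.5*x + u
        replace x = . in 1/500
        replace z = . in -500/-1

        capture program drop sim
        program define sim, rclass
            regress y x
            local media1 = _b[x]
            regress y z
            local media2 = _b[z]
            * Coefficients are already saved in locals, so e() can be cleared
            ereturn clear
            return scalar media = `media1' + `media2'
        end

        * No -nodrop- here; the sketch relies on e(sample) having been cleared
        bootstrap mean=r(media), reps(100): sim
        ```

        If your program needs e() results after the point where they are cleared, the -nodrop- option shown in #2 is probably the simpler route.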



        • #5
          Hi Clyde Schechter, I am trying to do what you describe, but I still get the same problem.

          My command is the following:

          reg var1 country_pol_trust_mean p_2country_pol_trust_mean ///
          `controls2' i.time_d i.country_region [pw=reweight_region], vce(bootstrap, ///
          cl(country_region) reps(10) noisily nodrop force seed(101010))

          I think this may happen because of the number of regions in my data (282), which I want to capture with fixed effects. I don't know.
          Could you help me, please?



          • #6
            I am arriving 6 years late to the party but in my case I found a simple solution that works by using preserve and restore:

            Code:
            program define nameofprogramhere
                preserve
                ....
                reg y X ...
                ...
                restore
            end
            
            bootstrap, reps(100): nameofprogramhere
            Does this make sense?



            • #7
              Originally posted by Leonardo Guizzetti
              I recently ran into this issue with a simple rclass program to bootstrap the c-statistic from a logistic regression model. Two ways around it were:

              1) Perform the logistic regression calculation of the linear predictor manually (if relying on previously obtained estimates that are not computed during the bootstrap program)
              2) Clear estimation results using -- ereturn clear --
              That worked for me!

              Thank you
              Sincere regards,
              Abdullah Algarni
