  • Bootstrapping regressions based on subsamples

    Assume I have 1000 observations and I’m running the following regression:

    Code:
    reg y x1 x2
    I want to write a loop that randomly selects 750 observations from my dataset (without replacement) and runs the regression with bootstrapped errors. I also want to do the same with the 250 observations that were not selected, and repeat this 1000 times. At the end, I want to see two regressions with bootstrapped errors: one on the 750 randomly selected observations and one on the remaining 250 observations. I’m not sure whether this is doable. Can you help me with this?

    Thanks.

  • #2
    I generated example data for y, x1 and x2 with 1000 observations, wrote a user-defined program that implements the process (randomly split the data into 750 vs. 250 obs and run the regression on both subsamples), and simulated the process 1000 times.

    Code:
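    * example data: 1000 observations with known coefficients
    clear
    set obs 1000
    gen x1 = rnormal()
    gen x2 = rnormal()
    gen y = 1 + 2*x1 + 3*x2 + rnormal()
    
    cap program drop myprog
    program define myprog, eclass
        * preserve/restore discards the split variable after each replication
        preserve
        * random split without replacement: sub == 1 (75%), sub == 2 (25%)
        splitsample, generate(sub) split(0.75 0.25)
        * regression on the selected 750 observations
        reg y x1 x2 if sub == 1
        mat b1 = e(b)
        mat colnames b1 = x1_s x2_s cons_s
        * regression on the remaining 250 observations
        reg y x1 x2 if sub == 2
        mat b2 = e(b)
        mat colnames b2 = x1_ns x2_ns cons_ns
        * return both coefficient vectors as one row vector
        mat b = b1,b2
        ereturn post b
        restore
    end
    
    * repeat the split-and-regress step 1000 times, collecting the coefficients
    simulate _b, reps(1000) nodots: myprog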
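
    To confirm that splitsample really produces a 750/250 split, and to make the random split reproducible, something like the following sketch can be run on its own. It assumes Stata 16 or later (where, as far as I know, splitsample became available); the seed value 12345 and the variable name sub are arbitrary choices for illustration.

    ```stata
    clear
    * arbitrary seed so that the same split is drawn on every run
    set seed 12345
    set obs 1000
    * assign each observation at random to group 1 (75%) or group 2 (25%)
    splitsample, generate(sub) split(0.75 0.25)
    * check the group sizes
    tabulate sub
    ```

    The simulation can be made reproducible in the same way, since simulate accepts a seed() option, e.g. simulate _b, reps(1000) seed(12345) nodots: myprog.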

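    Below is the result of the 1000-replication simulation. The items ending in "_s" are coefficients from the selected sample (750 obs) and those ending in "_ns" are from the remaining 250-obs sample. "Mean" is the average of each coefficient across the 1000 random splits, and "Std. Dev." is its standard deviation across those splits, which is what I am taking as the "bootstrapped errors" here. Note that it measures variation across random splits of the same 1000 observations rather than a bootstrap standard error in the usual resampling-with-replacement sense.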

    Code:
    . sum, separator(0)
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
         _b_x1_s |      1,000    1.997916    .0173457   1.942227   2.049491
         _b_x2_s |      1,000    2.980246    .0176003   2.916817   3.036341
       _b_cons_s |      1,000    1.022663    .0176762    .970253   1.080174
        _b_x1_ns |      1,000    1.998467     .052315   1.827173   2.154136
        _b_x2_ns |      1,000    2.980618     .053068   2.819928   3.140781
      _b_cons_ns |      1,000    1.025215     .053066   .8514135   1.183416
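
    One caveat on reading the Std. Dev. column: because every replication splits the same 1000 observations without replacement, the spread reflects how much a subsample estimate moves around the full-sample estimate. For a subsample of size m out of N, that variance is roughly proportional to 1/m - 1/N, which gives a ratio of about 3 to 1 between the 250-obs and 750-obs columns, in line with the table (about .053 vs. .017). A conventional bootstrap standard error would instead scale roughly with 1/sqrt(m). If that is what is wanted, a minimal sketch for a single random split is shown below. It uses the same simulated data as above; the seed value is an arbitrary assumption, and this is one possible approach rather than the only way to do it.

    ```stata
    clear
    * arbitrary seed, for reproducibility only
    set seed 12345
    set obs 1000
    gen x1 = rnormal()
    gen x2 = rnormal()
    gen y = 1 + 2*x1 + 3*x2 + rnormal()

    * one random 75/25 split without replacement
    splitsample, generate(sub) split(0.75 0.25)

    * bootstrap standard errors, resampling within each subsample
    regress y x1 x2 if sub == 1, vce(bootstrap, reps(1000))
    regress y x1 x2 if sub == 2, vce(bootstrap, reps(1000))
    ```

    Each regression table then reports bootstrap standard errors for that subsample. Wrapping this in myprog and returning the standard errors as well would be possible, but 1000 splits times 1000 bootstrap replications is slow.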
