  • Bootstrapping a two-step estimation with survey data

    Hello everyone,

    This is more of a clarification question as my code runs without any errors:

    I am working with NHANES data with the following characteristics:

    . svydescribe
    Survey: Describing stage 1 sampling units
          pweight: wtint2yr
              VCE: linearized
      Single unit: centered
         Strata 1: sdmvstra
             SU 1: sdmvpsu
            FPC 1: <zero>

    outcome = outcome variable
    cov1 = covariate 1
    cov2 = covariate 2
    cov3 = covariate 3
    cov4 = covariate 4

    I have the following simple two-step estimation procedure:

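    capture program drop twostep
    global probitcovariates cov2 cov3 cov4
    program twostep, rclass
        args y x covariates
        tempvar touse
        gen byte `touse' = 1
        * First step: survey-weighted probit for the binary regressor
        svy: probit `x' $probitcovariates
        * Assumed intent: the linear index (without xb, predict returns the probability)
        predict xd2h, xb
        gen phi2h = normalden(xd2h)
        gen PHI2h = normal(xd2h)
        * Generalized residual and its interaction with the binary regressor
        gen gr2 = `x'*phi2h/PHI2h - (1 - `x')*phi2h/(1 - PHI2h)
        gen `x'_gr2 = `x'*gr2
        * Second step: survey-weighted regression including the correction terms
        svy: reg `y' `x' `covariates' `x'_gr2 gr2
        * Drop the generated variables so the program can run again in the next replication
        drop xd2h phi2h PHI2h gr2 `x'_gr2
    end

    Note that I am using the svy: prefix on the estimation command in both steps.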
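
    For reference, here is a sketch of the generalized residual that gr2 is meant to compute, written out as a formula. The notation is assumed for illustration and does not appear in the code: x_i is the binary variable passed as the second argument, z_i collects the probit covariates, and \hat{\delta} denotes the first-step probit coefficients, so z_i'\hat{\delta} corresponds to xd2h provided xd2h holds the linear index.

    ```latex
    \[
    \widehat{gr}_i
      = x_i \,\frac{\phi(z_i'\hat{\delta})}{\Phi(z_i'\hat{\delta})}
      - (1 - x_i)\,\frac{\phi(z_i'\hat{\delta})}{1 - \Phi(z_i'\hat{\delta})}
    \]
    ```

    The second step then regresses the outcome on x_i, the covariates, \widehat{gr}_i, and the interaction x_i \widehat{gr}_i.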

    I want to bootstrap my final estimates to account for the first-step estimation. My code is:

    global covariates cov2 cov3
    local bootreps = 100
    bootstrap _b, reps(`bootreps') seed(123): twostep outcome cov1 "$covariates"
    My code runs smoothly without any errors. However, since I am using survey data, I want to make sure of one thing: even though I use the svy: prefix in each step, do I still need to use
    svy bootstrap
    in the final command?

    Thank you so much for your time and consideration.
