Hi everyone,
I am hoping to get some advice on using bootstrap with svy commands. Below are two contrived examples using public-use NHANES data. The procedure I am actually trying to program is a more complicated, multi-step one. Can anyone explain whether wrapping svy commands in bootstrap, as I do in Example 1, is problematic? If so, is Example 2 the way to go? Thanks in advance. I am using Stata 14.1 MP on a Linux operating system.
. **EXAMPLE 1: USING BOOTSTRAP WITH SVY PROCEDURE AND NO POSTESTIMATION
Code:
webuse nhanes2
gen obese=0
replace obese=1 if(bmi>=30)
gen xrace=race
svyset psu [pw=finalwgt], strata(strata)
capture program drop myprog
program define myprog, eclass
    preserve
    svy: logistic obese i.race age i.rural i.region
    restore
end
bootstrap _b, seed(10209) reps(5): myprog
. **EXAMPLE 2: USING SVY BOOTSTRAP WITH REGULAR PROCEDURES AND NO POSTESTIMATION
Code:
clear
webuse nhanes2
gen obese=0
replace obese=1 if(bmi>=30)
gen xrace=race
svyset psu [pw=finalwgt], strata(strata)
* bsweights is a user-written command (ssc install bsweights)
bsweights bw, reps(5) n(0) seed(10209)
svyset [pw=finalwgt], bsrweight(bw*) vce(bootstrap)
svy bootstrap _b: logistic obese i.race age i.rural i.region
Example 1 produces the following:
Bootstrap replications (5)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.....
Survey: Logistic regression

Number of strata   =        31                  Number of obs     =     10,351
                                                Replications      =          5
                                                Wald chi2(4)      =          .
                                                Prob > chi2       =          .

------------------------------------------------------------------------------
             |   Observed  Bootstrap                         Normal-based
       obese | Odds Ratio  Std. Err.       z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |
      Black  |   2.109698   .1831897     8.60   0.000     1.779544    2.501106
      Other  |   .7850331   .0386949    -4.91   0.000     .7127407    .8646579
             |
         age |   1.018213   .0012068    15.23   0.000     1.015851    1.020581
     1.rural |   1.248806   .0253599    10.94   0.000     1.200077    1.299512
             |
      region |
         MW  |   1.004325    .142525     0.03   0.976     .7604652    1.326385
          S  |    1.00868   .1456812     0.06   0.952      .760005    1.338722
          W  |   .9234463   .1506406    -0.49   0.625      .670743    1.271356
             |
       _cons |   .0656821   .0086588   -20.66   0.000     .0507264    .0850472
------------------------------------------------------------------------------
Example 2 produces the following:
Bootstrap replications (5)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.....
Survey: Logistic regression                     Number of obs     =     10,351
                                                Population size   = 117,157,513
                                                Replications      =          5
                                                Wald chi2(4)      =          .
                                                Prob > chi2       =          .

------------------------------------------------------------------------------
             |   Observed  Bootstrap                         Normal-based
       obese | Odds Ratio  Std. Err.       z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |
      Black  |   2.109698   .2431076     6.48   0.000     1.683191    2.644278
      Other  |   .7850331   .1873516    -1.01   0.311     .4917506     1.25323
             |
         age |   1.018213   .0015534    11.83   0.000     1.015173    1.021262
     1.rural |   1.248806     .08302     3.34   0.001     1.096244    1.422598
             |
      region |
         MW  |   1.004325   .1187691     0.04   0.971     .7965506    1.266297
          S  |    1.00868   .1738644     0.05   0.960     .7195042    1.414079
          W  |   .9234463   .0985782    -0.75   0.456     .7491099    1.138355
             |
       _cons |   .0656821   .0098427   -18.17   0.000     .0489656    .0881054
------------------------------------------------------------------------------