Hello everyone, this is my first post in the Stata forums, although I'm a regular at Stack Overflow. My question is how to bootstrap an age-adjusted prevalence calculation. The adjusted prevalence is calculated at the cluster level, and the prevalence for an Evaluation Unit (EU) is then the average of the adjusted prevalences of the clusters in that EU. I can perform the prevalence calculation successfully with various collapse commands, but collapsing discards the resident-level data that I would need in order to bootstrap the calculation and get CIs. Here is the do-file I use for the prevalence calculations (FYI: the age adjustment uses manually entered weights; I don't think there is any way to avoid that bit of manual work):
insheet using "C:\Users\rwillis\Google Drive\Data Analysis\Stata_Practice_201407\YEMEN_CLEAN_lessthan00081.csv", comma
drop if examined!=1
gen tf_any=.
replace tf_any=1 if left_eye_tf=="1" | right_eye_tf=="1"
gen tt_any=.
replace tt_any=1 if left_eye_tt=="1" | right_eye_tt=="1"
gen str5 eu_s = string(eu, "%05.0f")
gen str3 cluster_s = string(cluster, "%03.0f")
gen str2 age_s = string(age, "%02.0f")
gen group= eu_s + cluster_s + age_s
gen res_dum=.
replace res_dum=1 if !missing(instance_id_res)
gen tf_dum=.
replace tf_dum=1 if tf_any==1
gen tt_dum=.
replace tt_dum=1 if tt_any==1
save "C:\Users\rwillis\Google Drive\Data Analysis\Stata_Practice_201407\Yemen_uncollapsed_20140802.dta", replace
collapse (sum) num_1_9=res_dum tf_1_9=tf_dum if age<10 & age>=1, by(group)
gen tf_prev= tf_1_9/num_1_9
gen age_s=substr(group, 9, 2)
destring age_s, generate(age)
drop age_s
gen age_weight=.
replace age_weight=0.130443886097152 if age==1
replace age_weight=0.125628140703518 if age==2
replace age_weight=0.118058239917536 if age==3
replace age_weight=0.113870635227419 if age==4
replace age_weight=0.110005153975003 if age==5
replace age_weight=0.106123566550702 if age==6
replace age_weight=0.102692951939183 if age==7
replace age_weight=0.098779152171112 if age==8
replace age_weight=0.0943982734183739 if age==9
gen tf_adj_prev= tf_prev* age_weight
save "C:\Users\rwillis\Google Drive\Data Analysis\Stata_Practice_201407\yemen_groupcollapsed_tfprev_20140802.dta"
gen eucluster=substr(group,1,8)
collapse (sum) cluster_tf_prev=tf_adj_prev, by(eucluster)
gen eu=substr(eucluster,1,5)
collapse eu_tf_prev=cluster_tf_prev, by(eu)
save "C:\Users\rwillis\Google Drive\Data Analysis\Stata_Practice_201407\Yemen_EU_Adjprev_20140802.dta"
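In formula form, this is what I understand the do-file above to compute. The symbols are my own shorthand rather than names in the data: n_{c,a} is the number of examined residents of age a in cluster c (num_1_9), t_{c,a} is the number of those with TF (tf_1_9), w_a is the age weight (age_weight, which sums to 1 over ages 1 to 9), and K is the number of clusters in the EU.

```latex
\hat{p}_{c} = \sum_{a=1}^{9} w_a \, \frac{t_{c,a}}{n_{c,a}},
\qquad
\hat{P}_{\mathrm{EU}} = \frac{1}{K} \sum_{c=1}^{K} \hat{p}_{c}
```

One thing I noticed while writing this out: if a cluster has no examined children at some age, that age group has no row after the first collapse and contributes nothing to the sum, so the weights actually used for that cluster add up to less than 1.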
So, to summarize:
1. I need code to perform the calculations without collapsing.
2. Next, I need help choosing the correct bootstrap command and putting it in the correct place.
I have asked several colleagues for help with this, and everyone is either too busy to wrap their brains around it or just stumped. Please consider this your challenge for the day and help me! :-) Oh, and I'm using Stata 10.1.