Hi,
I'll be running an RCT (baseline and endline) with three arms (two treatments, T1 and T2, and a control, C) and three population subgroups (G1, G2 and G3) to measure the impact of each intervention on the prevalence of high school admission (binary variable). Villages will be sampled using a stratified approach: in each study zone, the main city will be sampled (split into three parts, one for each arm), as well as 3 villages within 10 to 20 km of the main city and an additional 3 villages within 20 to 30 km of the main city (one village per arm in each distance band). In each village (and main city), individuals from each subgroup will be selected such that the number of G1 respondents is the same as the number of G2 and G3 respondents. (see image)
Unfortunately, I'm not 100% sure how to run the power calculations. My understanding is that, assuming similar subgroup sizes and population sizes across villages/main cities, I should essentially run the following code:
Code:
power twoproportions 0.3 0.4, test(chi2) power(0.8) nratio(2)
where I assumed that the high school admission rate is 0.3 in the control group and 0.4 in T1, but with a higher prevalence in G2 (0.6) than in G1 (0.3), hence an nratio of 60:30 = 2:1.
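To check my understanding of what this command computes, here is a minimal Python sketch of the standard large-sample sample-size formula for comparing two proportions (pooled variance under the null). I believe this is close to what `power twoproportions` does with `test(chi2)`, but that is an assumption on my part. The function name `n_two_proportions` and its arguments are my own, not Stata's. As far as I can tell from the documentation, `nratio()` is the ratio of the two group sizes (N2/N1) rather than a ratio of prevalences, so I may be misusing it above.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_two_proportions(p1, p2, alpha=0.05, power=0.8, nratio=1.0):
    """Per-group sample sizes (n1, n2) for a two-sided test of p1 vs p2.

    nratio is the allocation ratio n2 / n1 (an assumption about how
    Stata's nratio() is defined, not a ratio of prevalences).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    # Pooled proportion under the null, weighted by group sizes.
    p_bar = (p1 + nratio * p2) / (1 + nratio)

    # Standard deviations of the difference under the null and the alternative,
    # both expressed per unit of n1.
    sd_null = sqrt(p_bar * (1 - p_bar) * (1 + 1 / nratio))
    sd_alt = sqrt(p1 * (1 - p1) + p2 * (1 - p2) / nratio)

    n1 = ceil(((z_alpha * sd_null + z_beta * sd_alt) / (p2 - p1)) ** 2)
    n2 = ceil(nratio * n1)
    return n1, n2


if __name__ == "__main__":
    # Equal allocation versus twice as many observations in the second group.
    print(n_two_proportions(0.3, 0.4, nratio=1))
    print(n_two_proportions(0.3, 0.4, nratio=2))
```

If this is right, the subgroup prevalences (0.3 in G1, 0.6 in G2) would enter as the proportions being compared within each subgroup, not through `nratio`.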
I would then need to repeat this calculation for every treatment-by-subgroup comparison and, I guess, scale down the combined sample size to account for redundancies across the analyses.
My first question is whether this approach makes sense, especially given that:
a) the outcome of interest is binary
b) there are three subgroups
c) there are three arms (two treatments and a control)
d) it's stratified sampling, not cluster sampling
e) there will be two survey waves with covariates included. How should these be factored into the power calculations, given that follow-up surveys and covariate inclusion should increase power?
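On point e), the only adjustment I am aware of is the rule of thumb for continuous outcomes: if baseline covariates explain a share R² of the outcome variance, the required sample size shrinks by roughly a factor of (1 - R²). A minimal Python sketch of that adjustment follows. The helper name `adjusted_n` and the R² value in the example are hypothetical, and I do not know how well this approximation carries over to a binary outcome.

```python
from math import ceil


def adjusted_n(n_unadjusted, r_squared):
    """Scale a per-group sample size by (1 - R^2).

    Rule of thumb for covariate-adjusted analysis of a continuous outcome;
    only an approximation for a binary outcome.
    """
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must be in [0, 1)")
    return ceil(n_unadjusted * (1 - r_squared))


if __name__ == "__main__":
    # Hypothetical example: covariates explain 25% of the outcome variance.
    print(adjusted_n(356, 0.25))
```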
Your help would be greatly appreciated.