Hi all,
I have been writing code for a power calculation based on the advice here & I have reached a bit of a standstill. I'm not sure I'm running it correctly & would really appreciate a sense check/new eyes on the code I have written.
The problem is as follows. An RCT is being planned to compare standard-of-care (DrugA) versus a new drug (DrugB) for the treatment of a disease, to assess whether DrugB causes unwanted weight gain. Weight will be measured at baseline and at months 6 and 12. The primary endpoint is the treatment effect of DrugB at month 12. This will be analysed using the following regression model:
Yt - Yt0 = b0 + b1X + b2 Yt0
Where Yt is the weight at time t (so Yt - Yt0 is the weight change from baseline), X = treatment group, b1 = treatment effect and Yt0 is the baseline weight. In a previous exploratory study, mean baseline weight was 80.6kg (SD 11). Patients on the newer drug gained on average 3.5kg over 12 months, and those on SOC gained 0.6kg (meaning a difference of 1.3kg). The SD for change in weight was 1.7kg in both groups.
Using the example code from the other thread as a basis, I have generated the following code to simulate & test various populations. I'm not convinced I have done this correctly - particularly the generation of the post variable & the error variables.
Code:
version 14.2

clear *
set seed 123

program define simem // , rclass
    version 14.2
    syntax , [n(integer 80) ///
        pre(real 80.6) presd(real 11) ///
        DElta(real 0) deltasd(real 1.7) ///
        Residual(real 1)]

    // Add input validation to test

    drop _all
    set obs `n' /* set number of observations */
    generate int id = _n /* create an id variable */
    egen trtgrp = cut(id), group(2) /* generate two treatment groups */
    generate u = rnormal() /* random intercept for each individual */
    generate e = rnormal(0,`residual') /* residual errors */
    generate pre = rnormal(`pre', `presd') /* generate baseline weight variable */
    summarize pre, meanonly
    generate c_pre = pre - r(mean) /* center baseline weight */
    generate post = u + trtgrp*rnormal(`delta',`deltasd') + e /* generate post weight variable, accounting for variation in delta */

    // Test to run
    reg post i.trtgrp c_pre
    test 1.trtgrp
end

postfile sim delta power lb ub using regression_powersim, replace

foreach delta of numlist 1.3 1.7 2.1 2.5 2.9 {
    simulate p = r(p), reps(5) nolegend nodots: simem , de(`delta')
    generate byte pos = p < 0.05
    qui ci means pos
    post sim (`delta') (r(mean)) (r(lb)) (r(ub))
    display in smcl as text "Delta = " %4.2f `delta' " Power = " %4.2f r(mean) " Lower 95%CI = " %4.2f r(lb) " Upper 95%CI = " %4.2f r(ub)
}

postclose sim

exit
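As a rough sanity check on the simulated power, a closed-form calculation with Stata's power twomeans could be run over the same grid of differences. This is only a sketch: it assumes the design as stated above (a common SD of change of 1.7 and 80 participants in total, split equally) and it ignores the adjustment for baseline weight, so it is not expected to match the regression-based simulation exactly.

```stata
* Analytic cross-check, ignoring the baseline covariate.
* Control mean change is set to 0, so the second mean is the difference (delta).
* sd(1.7) is the assumed common SD of change; n(80) is the assumed total sample size.
power twomeans 0 (1.3 1.7 2.1 2.5 2.9), sd(1.7) n(80)
```

If the simulated power (with a realistic number of replications) is far from these values, that would point to a problem in how the outcome or the error terms are generated.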
Secondly, if the primary outcome was the change over time, I would need to include the other timepoints. I would need to generate a longitudinal panel & estimate the following regression model:
Yt - Yt0 = b0 + b1X + b2Yt0 + b3time + b4X*time
Where time is a binary indicator (0 = month 6, 1 = month 12). The treatment effect at month 6 is then b1, and the treatment effect at month 12 is b1 + b4. I will run this using either the xtreg or the mixed command.
When constructing my inputs, I can expand the data such that each individual has two data points as follows:
Code:
expand 2
bysort id: generate byte time = _n-1
But I am unsure how to calculate the outcome variable that is correlated for each individual.
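One possible way to get an outcome that is correlated within each individual is to draw a single random intercept per person before expanding, so that both visits share it, and then add an independent residual per visit. The sketch below assumes this random-intercept structure. The two SDs of 1.2 are chosen so that the total SD of change is about 1.7 with a within-person correlation of 0.5; these, the month-6 means, and the treatment effects of 0.65kg at month 6 and 1.3kg at month 12 are illustrative placeholders, not estimates from the exploratory study. Baseline weight is assumed to have no effect on change.

```stata
* Sketch only: parameter values are assumptions for illustration
clear
set seed 123
set obs 80
generate int id = _n
* alternate allocation, 40 per arm
generate byte trtgrp = mod(id, 2)
* baseline weight, centred as in the single-timepoint code
generate pre = rnormal(80.6, 11)
summarize pre, meanonly
generate c_pre = pre - r(mean)
* subject-level random intercept, drawn once per person (assumed SD 1.2)
generate u = rnormal(0, 1.2)

* two rows per person; u is copied to both rows, which induces the correlation
expand 2
bysort id: generate byte time = _n - 1

* visit-level residual, drawn after expanding so it differs by visit (assumed SD 1.2)
generate e = rnormal(0, 1.2)
* SOC gains 0.3 at month 6 and 0.6 at month 12 (month-6 value assumed)
* treatment adds 0.65 at month 6 (b1) and 0.65 + 0.65 = 1.3 at month 12 (b1 + b4)
generate change = 0.3 + 0.3*time + 0.65*trtgrp + 0.65*trtgrp*time + u + e

* random-intercept model and the month-12 treatment effect (b1 + b4)
mixed change i.trtgrp##i.time c_pre || id:
lincom 1.trtgrp + 1.trtgrp#1.time
```

The same generate-then-fit steps could be wrapped in a program and called from simulate, in the same way as the single-timepoint code.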
I really appreciate any help anyone can provide on this topic.
Best wishes,
Bryony