
  • Non-inferiority trial and sample size simulations

    Dear Listers,

    I am estimating the necessary sample size for a non-inferiority trial for weight loss comparing a new treatment to a more standard one (control). We are assuming equivalent weight in the two study arms at baseline and setting the non-inferiority margin to 6, based on existing data.

    I developed a do-file to simulate the data and estimate the required sample size. The script seems to produce a sensible estimate, in line with the sample size from the standard formula, but I wanted to double-check that the script is OK. Should I build in more variability around the delta?

    I read that the sample size can be adjusted to take into account the correlation between baseline and follow-up scores, so I am also using the simulations to estimate that correlation. I'd be inclined to use the mean value, as it is in line with other studies in the area; however, I was also wondering whether there is a way to simulate the data by specifying an expected correlation, for example 0.80.

    Any advice welcome!


    Code:
    /*
    Script to estimate sample size needed for a 2-arm RCT on weight loss
    New treatment vs. standard treatment (i.e. control) assessing non-inferiority
    Weight measurement at baseline and follow-up to be analysed using ANCOVA
    
    we assume same mean and sd for weight at baseline
    delta1 = weight reduction in intervention
    delta2 = weight reduction in control
    m2 = non-inferiority margin
    */
    
    
    clear
    clear all
    capture program drop sinem
    set seed 679832567
    
    program define sinem, rclass
        version 17.0
        syntax, [n(integer 1) sd(real 1) base(real 1) delta1(real 1) delta2(real 1) ///
        m2(real 1) noBALance]
        
        drop _all
        
        //Participants
        if "`balance'"!= "" local N `n'
            else local N=`n'- mod(`n', 2)
        set obs `N'
            
        g long pid = _n
        
        generate byte trt = mod(_n, 2)
        
        *Outcomes, baseline (out0) and follow-up (out1)
        g double out0 = rnormal(`base', `sd')
        g double out1 = rnormal(out0+cond(trt, `delta1', `delta2') ,`sd')
    
        corr out1 out0 
        tempname c_rho 
        scalar define `c_rho' = r(rho)
        
        * ANCOVA
        regress out1 i.trt c.out0 
        mat list r(table)
        matrix row1 = r(table)
        tempname yes
        scalar define `yes' = `m2'< row1[5,2]    
        
        return scalar yes= `yes'
        return scalar c_rho = `c_rho'
        
    end
    
    simulate powers=r(yes) crho=r(c_rho), reps(1000) seed(2345): sinem, n(520) base(108) delta1(-12) delta2(-12) sd(21) m2(-6) 
    
    tab powers
    sum crho
    
    *** Adjustment for correlation - using mean rho value
    di 520*(1-.70^2)
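    As a cross-check on the script, here is a quick sketch (in Python, just to check the arithmetic) of the standard normal-approximation formula I compared against; the one-sided alpha of 0.025 and 90% power are my assumptions here:

    ```python
    # Standard-formula sample size for a non-inferiority trial with equal
    # true effects (difference = 0), margin 6, and SD 21.
    # Assumed: one-sided alpha = 0.025, power = 0.90.
    from math import ceil
    from statistics import NormalDist

    sd, margin = 21.0, 6.0
    z_alpha = NormalDist().inv_cdf(1 - 0.025)   # approx. 1.96
    z_beta = NormalDist().inv_cdf(0.90)         # approx. 1.28

    # n per arm = (z_alpha + z_beta)^2 * 2 * sd^2 / margin^2
    n_arm = ceil((z_alpha + z_beta) ** 2 * 2 * sd ** 2 / margin ** 2)
    print(n_arm, 2 * n_arm)  # 258 per arm, 516 in total
    ```

    That total lands close to the n(520) passed to the simulate call above.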

  • #2
    Originally posted by Laura Myles
    . . . I was also wondering if there is a way to simulate the data by specifying an expected correlation; for example 80%.
    An easy way to specify a correlation is to use official Stata's drawnorm.

    You'd change the following section of code in your program as shown in colors below.
    Code:
        *Outcomes, baseline (out0) and follow-up (out1)
        /* g double out0 = rnormal(`base', `sd')
        g double out1 = rnormal(out0+cond(trt, `delta1', `delta2') ,`sd') */
        drawnorm out0 out1, double means(`base' `base') sd(`sd' `sd') corr(1 `corr' \ `corr' 1)
        replace out1 = out1 + cond(trt, `delta1', `delta2')
    
        /* corr out1 out0
        tempname c_rho
        scalar define `c_rho' = r(rho) */
        correlate out? if !trt
        tempname c_rho
        scalar define `c_rho' = r(rho)
        correlate out? if trt
        scalar define `c_rho' = (`c_rho' + r(rho)) / 2
    And add a default correlation coefficient to the syntax line:
    Code:
    syntax, [n(integer 1) sd(real 1) base(real 1) delta1(real 1) delta2(real 1) ///
        m2(real 1) noBALance corr(real 0.8)]
    I don't follow the logic of your program, though. Granted, things in noninferiority testing are a bit double-negative, but I think of noninferiority testing as rejecting the null hypothesis of inferiority against the alternative hypothesis of noninferiority. To power such a study, you find the sample size that will reject the null hypothesis of inferiority some percentage of the time when the test treatment is as effective as the reference treatment. And it would go bust when the test treatment is completely ineffective in comparison to the reference treatment.

    Is what you get with your example simulation code, set so that the latter is true (control is effective; intervention completely ineffective),
    Code:
    simulate powers=r(yes) crho=r(c_rho), reps(1000) seed(2345): sinem, n(520) base(108) delta1(0) delta2(-12) sd(21) m2(-6)
    what you're expecting?

    Comment


    • #3
      Thanks Joseph Coveney, I edited my code with your suggestions in red and it works. Specifying the correlation of the baseline and follow-up scores seems to remove the need to adjust for it after the simulations, as it is taken into account when sampling the outcome - is this a fair observation?

      My idea was to use the lower bound (lb) of the regression coefficient to test the null hypothesis; so that if the lb is above the non-inferiority margin, we'd reject the null hypothesis.

      In line with the Sealed envelope calculator, I wanted to assume that there is truly no difference between the two arms which is why I specified the difference equal to -12 for both delta1 and delta2; this is why I was not sure if I should build more variability around the delta.

      I'd really welcome your thoughts on this.

      Comment


      • #4
        Originally posted by Laura Myles
        Specifying the correlation of the baseline and follow-up scores seems to remove the need to adjust for it after the simulations, as it is taken into account when sampling the outcome - is this a fair observation?
        Yes, the adjustment is made in the regression model.

        My idea was to use the lower bound (lb) of the regression coefficient to test the null hypothesis; so that if the lb is above the non-inferiority margin, we'd reject the null hypothesis.
        I believe that you used the wrong confidence bound. In a weight-loss study, inferiority is weight gain, and you would test that the weight gain in the test treatment group relative to that in the reference treatment group is no worse than the margin of noninferiority. As I said, there's a bit of double-negation going on.

        Try this: start out simply (no ANCOVA) by defining effectiveness as weight loss.
        Code:
        program define simem, rclass
            version 18.0
            syntax , [n(integer 250) Loss(numlist min=2 max=2 >=0) Sd(real 21) mon(real 6)]
        
            if "`loss'" == "" {
                local tst 12
                local ref 12
            }
            else gettoken tst ref : loss
        
            drop _all
            quietly set obs `n'
        
            generate byte trt = mod(_n, 2)
            generate double out = rnormal(cond(trt, `tst', `ref'), `sd')
        
            regress out i.trt
            return scalar pos = -`mon' < r(table)["ll", "1.trt"]
        end
        In this program, effectiveness (weight loss) is a positive value and you declare noninferiority when the test treatment's worst-case (lower confidence bound) weight loss is no worse than the margin of noninferiority, which is expressed as weight gain (negative weight loss).

        In the output below, simulation under the alternative hypothesis gives power. And when the test treatment is inferior to the reference treatment by a magnitude equal to the margin of noninferiority (bottom simulation), you'll get the test size, which by convention in some quarters uses one side of a 95% confidence interval, i.e., alpha = 0.025.

        . version 18.0

        . clear *

        . // seedem
        . set seed 597390526

        . quietly program define simem, rclass

        . forvalues n = 450(50)550 {
          2.     quietly simulate pos = r(pos), reps(1000): simem , n(`n')
          3.     summarize pos, meanonly
          4.     display in smcl as text "N = `n'", "Power = " %04.2f r(mean)
          5. }
        N = 450 Power = 0.86
        N = 500 Power = 0.91
        N = 550 Power = 0.92

        . quietly simulate pos = r(pos), reps(1000): simem , n(540) loss(6 12) mon(6)

        . summarize pos

            Variable |        Obs        Mean    Std. dev.       Min        Max
        -------------+---------------------------------------------------------
                 pos |      1,000         .03    .1706726          0          1

        . exit

        end of do-file


        For your ANCOVA setup, where you measure body weight per se and not weight loss, effectiveness becomes a negative number and the sign and confidence bound for worse performance (inferiority) are flipped.
        Code:
        program define simem, rclass
            version 18.0
            syntax , [n(integer 250) ///
                Baseline(real 100) Loss(numlist min=2 max=2 >=0) Corr(real 0.5) ///
                    Sd(real 21) mon(real 6)]
        
            if "`loss'" == "" {
                local tst 12
                local ref 12
            }
            else gettoken tst ref : loss
        
            drop _all
            quietly set obs `n'
        
            generate byte trt = mod(_n, 2)
            drawnorm bas out, double ///
                means(`baseline' `baseline') sd(`sd' `sd') corr(1 `corr' \ `corr' 1)
            quietly replace out = out - cond(trt, `tst', `ref')
        
            regress out i.trt c.bas
            return scalar pos = r(table)["ul", "1.trt"] < `mon'
        end
        In the output below, you can see the enhanced statistical efficiency with a correlation coefficient of only 0.5, which I suspect is more realistic than 0.8 given the time frame of a weight-loss study and the expected increase in posttreatment variance in the outcome variable. Test size (ca. 0.025) remains the same at the null hypothesis where the magnitude of inferiority is set equal to the margin of noninferiority.
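        The size of the gain is in line with the usual ANCOVA deflation factor, (1 - rho^2), applied to the unadjusted total; a rough sketch (in Python, just to check the arithmetic, assuming the same design values as before: sd 21, margin 6, one-sided alpha 0.025, 90% power):

        ```python
        # ANCOVA sample-size deflation factor (1 - rho^2) applied to the
        # unadjusted normal-approximation total.
        # Assumed: sd = 21, margin = 6, one-sided alpha = 0.025, power = 0.90.
        from math import ceil
        from statistics import NormalDist

        sd, margin, rho = 21.0, 6.0, 0.5
        z = NormalDist().inv_cdf(1 - 0.025) + NormalDist().inv_cdf(0.90)
        n_total = 2 * ceil(z ** 2 * 2 * sd ** 2 / margin ** 2)  # 516 unadjusted

        n_ancova = ceil(n_total * (1 - rho ** 2))
        print(n_ancova)  # 387
        ```

        That sits squarely in the 350-450 range simulated in the output below.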

        . version 18.0

        . clear *

        . // seedem
        . set seed 638558990

        . quietly program define simem, rclass

        . forvalues n = 350(50)450 {
          2.     quietly simulate pos = r(pos), reps(1000): simem , n(`n')
          3.     summarize pos, meanonly
          4.     display in smcl as text "N = `n'", "Power = " %04.2f r(mean)
          5. }
        N = 350 Power = 0.87
        N = 400 Power = 0.92
        N = 450 Power = 0.94

        . quietly simulate pos = r(pos), reps(1000): simem , n(540) l(6 12) mon(6)

        . summarize pos

            Variable |        Obs        Mean    Std. dev.       Min        Max
        -------------+---------------------------------------------------------
                 pos |      1,000         .02    .1400701          0          1

        . exit

        end of do-file


        In line with the Sealed envelope calculator, I wanted to assume that there is truly no difference between the two arms which is why I specified the difference equal to -12 for both delta1 and delta2; this is why I was not sure if I should build more variability around the delta.

        I'd really welcome your thoughts on this.
        Yes, you assess power by simulating under the alternative hypothesis, and I wasn't quibbling about that. It was that with your program you get 100% rejection of the null hypothesis of inferiority even when the intervention is completely inactive.

        Comment


        • #5
          Dear Joseph Coveney , I had completely missed the double negative. Thank you so much for the thorough examples and clarification!

          Comment


          • #6
            Joseph Coveney I am now considering having 3 groups that I want to compare for weight loss. How can I account for multiple testing; should I adjust the level of the confidence interval?

            Code:
             
             regress out i.trt c.bas, level(97.5)
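            (The 97.5 is my Bonferroni split of a one-sided alpha of 0.025 across two active-vs-control comparisons; a quick sketch of that arithmetic, in Python just to check:)

            ```python
            # Bonferroni-adjusted two-sided confidence level for k one-sided
            # non-inferiority comparisons at an overall one-sided alpha of 0.025:
            # each comparison uses alpha/k, i.e. a (1 - 2*alpha/k) two-sided CI.
            alpha = 0.025
            for k in (1, 2, 3):
                level = 100 * (1 - 2 * alpha / k)
                print(k, level)  # k = 2 corresponds to level(97.5)
            ```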

            Comment


            • #7
              You seem to be taking a Bonferroni approach to adjusting the confidence bound. I don't have any pithy advice to give you on this, and I assume that you've Googled for help already. Have you looked at the (few) literature treatments of this, say, here and here?

              Comment


              • #8
                Thanks again Joseph Coveney

                Comment


                • #9
                  Joseph Coveney I have a follow-up question.

                  I'd like to check the sample size calculations using a change approach rather than ANCOVA; the SD of the change score seems to be the same as the SD of baseline weight - is there a way to specify it?


                  Code:
                  version 17.0
                  
                  clear
                  clear all
                  drop _all
                  capture program drop sinem
                  set seed 679832567
                  
                  program define sinem, rclass
                      version 17.0
                      syntax, [n(integer 1) sd(real 1) base(real 1) delta1(real 1) delta2(real 1) ///
                    alpha(real 0.05) m2(real 1) noBALance corr(real 1) level(real 95)]
                      
                      //Participants
                      if "`balance'"!= "" local N `n'
                          else local N=`n'- mod(`n', 2)
                      set obs `N'
                          
                      g long pid = _n
                      
                      generate byte trt = mod(_n, 2)
                      
                      drawnorm out0 out1, double means(`base' `base') sd(`sd' `sd') corr(1 `corr' \ `corr' 1)
                      replace out1 = out1 + cond(trt, `delta1', `delta2')
                      
                      correlate out* if !trt
                      tempname c_rho
                      scalar define `c_rho' = r(rho)
                      correlate out? if trt
                      scalar define `c_rho' = (`c_rho' + r(rho)) / 2
                      
                      * Change
                      g d = out0-out1
                      sum d 
                      tempname sdd
                      scalar define `sdd'=r(sd)
                      
                      regress d i.trt c.out0 , level(`level')
                      mat list r(table)
                      tempname changeyes
                      scalar define `changeyes' = `m2'<r(table)["ll", "1.trt"]
                      
                      *ANCOVA
                      regress out1 i.trt c.out0 , level(`level')
                      mat list r(table)
                      tempname ancovayes
                      scalar define `ancovayes' = `m2'<r(table)["ll", "1.trt"]
                      
                      return scalar ancovayes= `ancovayes'
                      return scalar changeyes= `changeyes'
                      return scalar n = `N'
                      return scalar c_rho = `c_rho'
                      return scalar sdd = `sdd'
                      
                  end
                  
                  * 
                  simulate n=r(n) power_ancova=r(ancovayes) power_change=r(changeyes) crho=r(c_rho) sdd=r(sdd), reps(500) seed(2345): sinem, n(390) base(108) delta1(-12) delta2(-12) sd(21) m2(-6) level(95) corr(0.5)
                  
                  tab power_ancova
                  tab power_change
                  sum sdd

                  Comment


                  • #10
                    Originally posted by Laura Myles
                    . . . the SD of the change score seems to be the same as the SD of baseline weight - is there a way to specify it?
                    They'll be the same when the correlation is 0.5; you can decrease the change score's SD by specifying a higher correlation and increase it by specifying a lower correlation.
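                    This follows from Var(change) = 2*sigma^2*(1 - rho) when both time points share the same SD; a quick sketch (in Python, just to check the arithmetic, using the sd of 21 from your setup):

                    ```python
                    # SD of the change score when baseline and follow-up share SD
                    # sigma and correlation rho: sd_change = sigma*sqrt(2*(1 - rho)).
                    from math import sqrt, isclose

                    sigma = 21.0
                    for rho in (0.3, 0.5, 0.8):
                        sd_change = sigma * sqrt(2 * (1 - rho))
                        print(rho, round(sd_change, 2))

                    # rho = 0.5 reproduces the baseline SD exactly
                    assert isclose(21.0 * sqrt(2 * (1 - 0.5)), 21.0)
                    ```

                    So the change score's SD equals the baseline SD only at rho = 0.5, shrinks toward zero as rho approaches 1, and exceeds it for rho below 0.5.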

                    I'd like to check the sample size calculations using a change approach rather than ANCOVA
                    There's literature pertinent to this. I recommend Googling harrell ancova change-scores correlation or similar search terms and taking a look at the various hits. Some, for example, this one, are relatively recent. The literature is more in the context of testing for a difference ("superiority"), but the findings regarding relative statistical efficiency are relevant to the cases of noninferiority and equivalence.

                    Comment
