Hypothesis testing for bootstrapped differences in medians in a randomized clinical trial

Philip Jones

Join Date: Mar 2014

Posts: 104
#1

Hypothesis testing for bootstrapped differences in medians in a randomized clinical trial

18 Dec 2014, 07:32

Hi all,

Background
I am analyzing data from a RCT with two groups. I am analyzing post-operative troponin values, which are skewed variables. Therefore, I don't want to use parametric methods such as a t-test. I *do*, however, want to report bootstrapped 95% CIs of the difference between groups, and I also want to report a P value testing the null hypothesis that there are no significant differences between groups (i.e. that the 95% CI of the difference in medians includes zero). If the 95% CI of the difference in medians excludes zero, I will conclude there is a statistically significant difference in median troponin values between groups.

Implementation
In the past, I have written a program to simply calculate the difference in medians between groups called mediandiff.ado (link: https://www.dropbox.com/s/84hdzkcprm...ndiff.ado?dl=0), and the dataset for this whole question is found here (link: https://www.dropbox.com/s/y7zlurxvlms0m63/data.dta?dl=0)

Code:

*! mediandiff
// used for bootstrapping medians
// group must be coded as "0" and "1"
// pmj 17dec2014
// syntax: "mediandiff number group"

capture program drop mediandiff
program mediandiff, rclass
    version 13.1
    syntax varlist(min=2 max=2) [if]
    // first var: real number for difference; second var: indicator var for group
    marksample touse
    tokenize `varlist'
    local number `1'
    local grp `2'
    quietly summarize `number' if `grp'==0 & `touse', detail
    local group_1 = r(p50)
    quietly summarize `number' if `grp'==1 & `touse', detail
    local group_2 = r(p50)
    return scalar difference = `group_2'-`group_1'
end

I then use this program as input for a bootstrapping procedure, with 10,000 replications. My rationale is that repeatedly drawing samples from the dataset, with replacement, will allow me to describe the variability of the difference without any distributional assumptions. I then save these 10,000 differences into a separate dataset which has only one variable: the difference in median troponin value between groups, for each replication. I can then use either the percentile or bias-corrected 95% CI as my confidence interval.

Analysis:

Code:

use data

// descriptive stats
table group, contents(p25 tnt_6hr p50 tnt_6hr p75 tnt_6hr n tnt_6hr)

// generate new dataset containing differences of medians between groups
bootstrap r(difference), saving(tnt_bootstrap, replace) level(95) reps(10000) seed(12345) nodots nowarn: mediandiff tnt_6hr group
estat bootstrap, all

In the following analysis, both the percentile and bias-corrected bootstrapped 95% CIs exclude zero. My question becomes: how can I perform a hypothesis test on these differences in medians to calculate a P value against the null of including zero in the CI?

Output:

Code:

Bootstrap results                               Number of obs      =       464
                                                Replications       =     10000

      command:  mediandiff tnt_6hr group
        _bs_1:  r(difference)

------------------------------------------------------------------------------
             |    Observed               Bootstrap
             |       Coef.       Bias    Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |          69    -.77255   32.641643    5.023556   132.9764   (N)
             |                                              3      133.5   (P)
             |                                              8        139  (BC)
------------------------------------------------------------------------------
(N)    normal confidence interval
(P)    percentile confidence interval
(BC)   bias-corrected confidence interval

My approach was to graph the differences - it is clear that most of the differences are greater than zero. In fact, the proportion of differences less than zero is 0.018. Is, then, 0.018 my P value?

Histogram of Differences in Medians from Bootstrapped Dataset:

Many thanks for any help!

Phil
Last edited by Philip Jones; 18 Dec 2014, 07:35.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#2

18 Dec 2014, 09:47

As an aside, Philip may want to take a look at the following article, which focuses on bootstrapped t-test for investigating the significance of the difference between the means of two samples with skewed cost data:
Desgagné A, Castilloux AM, Angers JF, LeLorier J. The use of the bootstrap statistical method for the pharmacoeconomic cost analysis of skewed data. Pharmacoeconomics. 1998 May;13(5 Pt 1):487-97 (http://www.ncbi.nlm.nih.gov/pubmed/10180748)
The worked example provided in the article is much in the same flavour as the one reported under the -help bootstrap- entry in the Stata 13.1 .pdf manual (Example #3).
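For readers who want to see the general idea, here is a minimal sketch of a bootstrap test of equal means in which the null hypothesis is imposed by recentering each group on the overall mean before resampling. It is written in the spirit of that manual example, not copied from it, and it runs on Stata's built-in auto data: mpg and foreign are stand-ins for the outcome and the group indicator, and the file name bs_null and the scalar names are assumptions made for illustration.

```stata
// Sketch only: bootstrap test of equal means with the null imposed by recentering.
// mpg/foreign from the auto data are placeholders for the real outcome/group.
sysuse auto, clear
keep mpg foreign

// Observed Welch t statistic on the original data
quietly ttest mpg, by(foreign) unequal
scalar tobs = r(t)

// Impose the null: shift each group so that both share the overall mean
quietly summarize mpg, meanonly
scalar omean = r(mean)
forvalues g = 0/1 {
    quietly summarize mpg if foreign == `g', meanonly
    quietly replace mpg = mpg - r(mean) + scalar(omean) if foreign == `g'
}

// Resample within group and recompute t, now under the null
bootstrap t = r(t), reps(1000) strata(foreign) seed(12345) nodots nowarn ///
    saving(bs_null, replace): ttest mpg, by(foreign) unequal

// Share of null-distribution t statistics at least as extreme as the observed one
use bs_null, clear
quietly count if abs(t) >= abs(scalar(tobs))
display "achieved significance level = " r(N)/_N
```

The recentering step is what separates this from reading a tail proportion off an ordinary bootstrap distribution: the replicates are generated from data in which the group means are equal by construction.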

Kind regards,
Carlo
(Stata 19.0)
Rich Goldstein

Join Date: Mar 2014

Posts: 4466
#3

18 Dec 2014, 13:09

From a different perspective, I recall that I (long ago) wrote some possibly relevant routines and published them in the STB; search for -johnson-, -obrien- and -modt-. The STB articles are freely available at the Stata web site.
Roman Mostazir

Join Date: Apr 2014

Posts: 874
#4

18 Dec 2014, 16:22

Phil, as a side discussion: if you have pre- and post-operative troponin values for the two groups, an ANCOVA model adjusting for the baseline value is a valid and established method for identifying a group difference at the post measure, and it is widely applied in RCTs. What is stopping you from developing a model like that, but with Stata's quantile regression command "qreg", testing that the group median difference is zero? "qreg" will do that for you, and you can also ask for bootstrapped standard errors and the corresponding test results with the "bsqreg" command.
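A minimal sketch of what that could look like, run on Stata's built-in auto data so that it is self-contained. Here price and foreign are placeholders for the outcome (e.g. tnt_6hr) and the 0/1 group indicator, and no baseline covariate is included; a baseline measurement, where available, would simply be added as a further regressor. The seed and the number of replications are arbitrary choices.

```stata
// Sketch only: median regression of an outcome on a 0/1 group indicator.
// price/foreign from the auto data stand in for the real outcome/group.
sysuse auto, clear

// Median (0.5 quantile) regression; the coefficient on foreign estimates
// the between-group difference in medians, and its test is against zero
qreg price foreign

// Same model with bootstrapped standard errors
set seed 12345
bsqreg price foreign, reps(200)
```

With a single binary regressor the coefficient should be close to the raw difference in sample medians, although small discrepancies can arise from how the median is defined when a group has an even number of observations.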

Roman
Philip Jones

Join Date: Mar 2014

Posts: 104
#5

18 Dec 2014, 18:46

Carlo and Rich - thank you for your comments. I will look up those references.

Roman, Thank you for your comment re: quantile regression. I think it is applicable to my situation (although I cannot condition on baseline troponin since it was only collected post-surgery).

I did a brief search for quantile regression used in RCTs for skewed variables. The literature is sparse, although this seems like a pretty solid strategy.

Of note, the P values obtained from quantile regression were almost exactly the same as those obtained by my method above. I would still like to know if what I wrote above makes sense.

Thanks to all who have contributed so far!

Phil
Philip Jones

Join Date: Mar 2014

Posts: 104
#6

14 Jan 2015, 07:01

Just to wrap up, I took Roman's suggestion and used quantile (median) regression on my data. I did it conditioning *only* on group.

For those who may want references to support this use in their studies, I include the following:

1. Beyerlein A. Quantile regression-opportunities and challenges from a user's perspective. Am J Epidemiol. 2014;180(3):330-1
2. McGreevy KM, Lipsitz SR, Linder JA, Rimm E, Hoel DG. Using median regression to obtain adjusted estimates of central tendency for skewed laboratory and epidemiologic data. Clin Chem. 2009;55(1):165-9

Phil
Joseph Coveney

Join Date: Apr 2014

Posts: 4421
#7

14 Jan 2015, 16:40

Good close-out of the thread, Philip. Thanks.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#8

15 Jan 2015, 02:23

Dear Phil,
thanks for the informative close-out of the thread and for the interesting references.

Kind regards,
Carlo
(Stata 19.0)
Sebby Zajac

Join Date: May 2016

Posts: 8
#9

17 Aug 2016, 09:18

I am having almost the same issue, but with mean differences and testing if the confidence interval in fact includes 0. I'm using skewed cost data so the interest is in making comparisons on arithmetic means and the difference in these means for hypothesis testing.

Using Philip's method of proportion of differences < 0, I get a p-value of 0.0497. I'm encouraged by his reporting that qreg gave him nearly identical numbers, however I was hoping to garner further insight.

Any ideas or developments since this thread?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#10

17 Aug 2016, 09:48

Sebby:
doesn't reply #2 to this thread help you out?

Kind regards,
Carlo
(Stata 19.0)
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#11

17 Aug 2016, 09:54

Several thoughts.

1. The proportion of differences < 0 observed with bootstrapping is not the p-value for a null hypothesis significance test (NHST). The p-value for an NHST is based on comparing an observed statistic to its sampling distribution under the null hypothesis. Bootstrap resampling does yield a sampling distribution, but it does not sample under the null hypothesis.

2. If your sample is sufficiently large, the central limit theorem tells us that the sampling distributions of the means in your two groups will be approximately normal, so it will be OK to use the usual normal-theory based inference for comparing group means.

3. If your sample is not sufficiently large for that, then you have a much bigger problem than the issue of normal-theory vs non-parametric inference. Given the highly skewed nature of cost data in most settings, if your sample is not large enough, you will not have adequately sampled the upper tail of the distribution, which means that your sample means will have a high probability of being far from the population means.

So, I wouldn't look for some exotic way to do inference here: if your sample isn't big enough to justify normal-theory inference about the means, it isn't even close to large enough to provide useful mean estimates in the first place.
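To illustrate point 1: one resampling scheme that does sample under the null hypothesis is a permutation test, in which the group labels are shuffled so that any association between group and outcome is broken by construction. The following is a sketch only, not something proposed in this thread. It uses Stata's built-in auto data, with price and foreign as placeholders for the outcome and the 0/1 group indicator, and mediandiff_demo is a hypothetical, simplified stand-in for the mediandiff program from #1.

```stata
// Sketch only: permutation test for a difference in medians.
// price/foreign from the auto data stand in for the real outcome/group.
sysuse auto, clear

// Hypothetical helper: returns median(group 1) - median(group 0)
capture program drop mediandiff_demo
program mediandiff_demo, rclass
    version 13.1
    syntax varlist(min=2 max=2)
    tokenize `varlist'
    quietly summarize `1' if `2' == 0, detail
    local m0 = r(p50)
    quietly summarize `1' if `2' == 1, detail
    local m1 = r(p50)
    return scalar difference = `m1' - `m0'
end

// Shuffle the group labels 1,000 times and recompute the difference each time;
// the reported p-value compares the observed difference with this null distribution
permute foreign diff = r(difference), reps(1000) seed(12345) nodots: ///
    mediandiff_demo price foreign
```

Strictly, the null tested here is that the two groups share the same distribution, which is somewhat stronger than equality of medians alone.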