Hi all,
Background
I am analyzing data from an RCT with two groups. The outcome is post-operative troponin, whose values are skewed, so I don't want to use parametric methods such as a t-test. I *do*, however, want to report bootstrapped 95% CIs of the difference between groups, and I also want to report a P value testing the null hypothesis that there is no difference between groups (i.e. that the difference in medians is zero). If the 95% CI of the difference in medians excludes zero, I will conclude there is a statistically significant difference in median troponin values between groups.
Implementation
I have previously written a program, mediandiff.ado, that simply calculates the difference in medians between groups (link: https://www.dropbox.com/s/84hdzkcprm...ndiff.ado?dl=0). The dataset for this whole question is found here (link: https://www.dropbox.com/s/y7zlurxvlms0m63/data.dta?dl=0).
I then use this program as input for a bootstrapping procedure with 10,000 replications. My rationale is that repeatedly drawing samples from the dataset, with replacement, will let me describe the variability of the difference in medians without any distributional assumptions. I save these 10,000 differences into a separate dataset with only one variable: the difference in median troponin value between groups for each replication. I can then use either the percentile or the bias-corrected 95% CI as my confidence interval.
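As a rough illustration of the percentile interval described above, here is a minimal sketch that reads the saved replicates back in and takes their 2.5th and 97.5th percentiles. It assumes the replicates were saved as tnt_bootstrap.dta and that the single variable is named _bs_1 (the name shown in the bootstrap output for this question); the result may not match the (P) row of estat bootstrap to the last digit.

```stata
// Assumption: tnt_bootstrap.dta holds one row per replication,
// with the difference in medians stored in _bs_1
use tnt_bootstrap, clear

// 2.5th and 97.5th percentiles of the bootstrap differences
_pctile _bs_1, percentiles(2.5 97.5)

// r(r1) and r(r2) hold the two requested percentiles
display "Percentile 95% CI: " r(r1) " to " r(r2)
```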
Analysis:
In the following analysis, both the percentile and bias-corrected bootstrapped 95% CIs exclude zero. My question is: how can I perform a hypothesis test on these differences in medians, i.e. calculate a P value against the null hypothesis that the difference in medians is zero?
Output:
My approach was to graph the differences; it is clear that most of them are greater than zero. In fact, the proportion of differences less than zero is 0.018. Is 0.018, then, my P value?
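For reference, a minimal sketch of how the graph and the proportion quoted above could be produced from the saved replicates. It assumes the bootstrap dataset is tnt_bootstrap.dta and the variable is _bs_1, as in the bootstrap output for this question; whether this one-sided proportion is a valid P value is exactly what I am asking.

```stata
// Assumption: tnt_bootstrap.dta holds the 10,000 replicate differences in _bs_1
use tnt_bootstrap, clear

// histogram of the replicate differences, with a reference line at zero
histogram _bs_1, xline(0)

// proportion of replicate differences that fall below zero
quietly count if _bs_1 < 0
display "Proportion below zero: " r(N)/_N
```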
Histogram of Differences in Medians from Bootstrapped Dataset:

Code:
*! mediandiff
// used for bootstrapping medians
// group must be coded as "0" and "1"
// pmj 17dec2014
// syntax: "mediandiff number group"
capture program drop mediandiff
program mediandiff, rclass
    version 13.1
    // first var: real number for difference; second var: indicator var for group
    syntax varlist(min=2 max=2) [if]
    marksample touse
    tokenize `varlist'
    local number `1'
    local grp `2'
    quietly summarize `number' if `grp'==0 & `touse', detail
    local group_1 = r(p50)
    quietly summarize `number' if `grp'==1 & `touse', detail
    local group_2 = r(p50)
    return scalar difference = `group_2'-`group_1'
end
Analysis:
Code:
use data
// descriptive stats
table group, contents(p25 tnt_6hr p50 tnt_6hr p75 tnt_6hr n tnt_6hr)
// generate new dataset containing differences of medians between groups
bootstrap r(difference), saving(tnt_bootstrap, replace) level(95) reps(10000) seed(12345) nodots nowarn: mediandiff tnt_6hr group
estat bootstrap, all
Output:
Code:
Bootstrap results                               Number of obs      =       464
                                                Replications       =     10000

      command:  mediandiff tnt_6hr group
        _bs_1:  r(difference)

------------------------------------------------------------------------------
             |    Observed               Bootstrap
             |       Coef.       Bias    Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |          69    -.77255   32.641643    5.023556   132.9764   (N)
             |                                              3      133.5   (P)
             |                                              8        139  (BC)
------------------------------------------------------------------------------
(N)    normal confidence interval
(P)    percentile confidence interval
(BC)   bias-corrected confidence interval
Many thanks for any help!
Phil