  • Hypothesis testing for bootstrapped differences in medians in a randomized clinical trial

    Hi all,

    Background
    I am analyzing data from an RCT with two groups. The outcome is post-operative troponin, which is skewed, so I don't want to use parametric methods such as a t-test. I *do*, however, want to report bootstrapped 95% CIs of the difference between groups, and I also want to report a P value testing the null hypothesis that there is no difference between groups (i.e. that the difference in medians is zero). If the 95% CI of the difference in medians excludes zero, I will conclude there is a statistically significant difference in median troponin values between groups.

    Implementation
    In the past, I have written a program to simply calculate the difference in medians between groups called mediandiff.ado (link: https://www.dropbox.com/s/84hdzkcprm...ndiff.ado?dl=0), and the dataset for this whole question is found here (link: https://www.dropbox.com/s/y7zlurxvlms0m63/data.dta?dl=0)

    Code:
    *! mediandiff
    // used for bootstrapping medians
    // group must be coded as "0" and "1"
    // pmj 17dec2014
    // syntax: "mediandiff number group"
    
    capture program drop mediandiff
    program mediandiff, rclass
    
    version 13.1
    syntax varlist(min=2 max=2) [if]    // first var: real number for difference; second var: indicator var for group
    marksample touse
    tokenize `varlist'
    
    local number `1'
    local grp `2'
    
    quietly summarize `number' if `grp'==0 & `touse', detail
    local group_1 = r(p50)    
    
    quietly summarize `number' if `grp'==1 & `touse', detail
    local group_2 = r(p50)    
    
    return scalar difference = `group_2'-`group_1'
    
    end
    I then use this program as input for a bootstrapping procedure with 10,000 replications. My rationale is that repeatedly drawing samples from the dataset, with replacement, will let me describe the variability of the difference without any distributional assumptions. I save these 10,000 differences into a separate dataset with a single variable: the difference in median troponin value between groups for each replication. I can then use either the percentile or the bias-corrected 95% CI as my confidence interval.

    Analysis:
    Code:
    use data
    
    // descriptive stats
    table group, contents(p25 tnt_6hr p50 tnt_6hr p75 tnt_6hr n tnt_6hr)
    
    // generate new dataset containing differences of medians between groups
    bootstrap r(difference), saving(tnt_bootstrap, replace) level(95) reps(10000) seed(12345) nodots nowarn: mediandiff tnt_6hr group
    estat bootstrap, all

    In the following analysis, both the percentile and bias-corrected bootstrapped 95% CIs exclude zero. My question becomes: how can I perform a hypothesis test on these differences in medians to calculate a P value against the null hypothesis that the difference is zero?

    Output:
    Code:
    Bootstrap results                               Number of obs      =       464
                                                    Replications       =     10000
    
          command:  mediandiff tnt_6hr group
            _bs_1:  r(difference)
    
    ------------------------------------------------------------------------------
                 |    Observed               Bootstrap
                 |       Coef.       Bias    Std. Err.  [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           _bs_1 |          69    -.77255   32.641643    5.023556   132.9764   (N)
                 |                                              3      133.5   (P)
                 |                                              8        139  (BC)
    ------------------------------------------------------------------------------
    (N)    normal confidence interval
    (P)    percentile confidence interval
    (BC)   bias-corrected confidence interval
    My approach was to graph the differences - it is clear that most of the differences are greater than zero. In fact, the proportion of differences less than zero is 0.018. Is, then, 0.018 my P value?
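For reference, the proportion quoted above can be reproduced from the saved replicates. This is a minimal sketch, assuming tnt_bootstrap.dta (written by the saving() option in the analysis code) is in the working directory and that the replicate statistic is stored as _bs_1, as the output suggests. The scalar name prop_below is made up for illustration. The code only computes the one-sided proportion; whether that number can be read as a P value is exactly the question being asked.

```stata
* Assumes tnt_bootstrap.dta was written by the bootstrap call in the analysis
* code; the replicate differences in medians are stored in the variable _bs_1.
use tnt_bootstrap, clear

* Count the replicates with a difference below zero and express as a proportion.
count if _bs_1 < 0
scalar prop_below = r(N) / _N
display "Proportion of replicates below zero: " %6.4f prop_below

* Histogram of the replicate differences, with a reference line at zero.
histogram _bs_1, xline(0)
```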

    Histogram of Differences in Medians from Bootstrapped Dataset:
    [Image: Graph.jpg]

    Many thanks for any help!

    Phil
    Last edited by Philip Jones; 18 Dec 2014, 07:35.

  • #2
    As an aside, Philip may want to take a look at the following article, which focuses on bootstrapped t-test for investigating the significance of the difference between the means of two samples with skewed cost data:
    Desgagné A, Castilloux AM, Angers JF, LeLorier J. The use of the bootstrap statistical method for the pharmacoeconomic cost analysis of skewed data. Pharmacoeconomics. 1998 May;13(5 Pt 1):487-97 (http://www.ncbi.nlm.nih.gov/pubmed/10180748)
    The worked example provided in the article is very much in the same flavour as the one reported under the -help bootstrap- entry in the Stata 13.1 .pdf manual (Example #3).





    Kind regards,
    Carlo
    (Stata 18.0 SE)



    • #3
      From a different perspective, I recall that (long ago) I wrote some possibly relevant routines and published them in the STB; search for -johnson-, -obrien- and -modt-. The STB articles are freely available at the Stata web site.



      • #4
        Phil, as a side discussion: if you have pre- and post-operative troponin values for two groups, an ANCOVA model adjusting for the baseline value is a valid and established method for identifying a group difference at the post measure, and it is widely applied in RCTs. What is stopping you from developing a model like that, but with Stata's quantile regression command "qreg", to test that the group median difference is zero? "qreg" will do that for you, and as an aside you can also ask for bootstrapped estimation and the corresponding test results with the "bsqreg" command.
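A minimal sketch of that suggestion, assuming the data.dta posted in #1 is in the working directory, with outcome tnt_6hr and a 0/1 indicator group, and with no baseline covariate in the model. The seed and reps(1000) are arbitrary illustrative choices, not values taken from the thread.

```stata
* Assumes data.dta from post #1, with outcome tnt_6hr and 0/1 indicator group.
use data, clear

* Median regression (qreg defaults to the 0.5 quantile): the coefficient on
* group estimates the difference in medians, and its test is against zero.
qreg tnt_6hr group

* Same model with bootstrapped standard errors.
* The seed and the number of replications are illustrative assumptions.
set seed 12345
bsqreg tnt_6hr group, reps(1000)
```

With a baseline measurement available, it would simply be added as a further covariate after group.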
        Roman



        • #5
          Carlo and Rich - thank you for your comments. I will look up those references.

          Roman, Thank you for your comment re: quantile regression. I think it is applicable to my situation (although I cannot condition on baseline troponin since it was only collected post-surgery).

          I did a brief search for quantile regression used in RCTs for skewed variables. The literature is sparse, although this seems like a pretty solid strategy.

          Of note, the P values obtained from quantile regression were almost exactly the same as those obtained by my method above. I would still like to know if what I wrote above makes sense.

          Thanks to all who have contributed so far!

          Phil



          • #6
            Just to wrap up, I took Roman's suggestion and used quantile (median) regression on my data. I did it conditioning *only* on group.

            For those who may want references to support this use in their studies, I include the following:

            1. Beyerlein A. Quantile regression-opportunities and challenges from a user's perspective. Am J Epidemiol. 2014;180(3):330-1
            2. McGreevy KM, Lipsitz SR, Linder JA, Rimm E, Hoel DG. Using median regression to obtain adjusted estimates of central tendency for skewed laboratory and epidemiologic data. Clin Chem. 2009;55(1):165-9

            Phil



            • #7
              Good close-out of the thread, Philip. Thanks.



              • #8
                Dear Phil,
                thanks for the informative close-out of the thread and for the interesting references.
                Kind regards,
                Carlo
                (Stata 18.0 SE)



                • #9
                  I am having almost the same issue, but with mean differences and testing if the confidence interval in fact includes 0. I'm using skewed cost data so the interest is in making comparisons on arithmetic means and the difference in these means for hypothesis testing.

                  Using Philip's method of proportion of differences < 0, I get a p-value of 0.0497. I'm encouraged by his report that qreg gave him nearly identical numbers; however, I was hoping to garner further insight.

                  Any ideas or developments since this thread?



                  • #10
                    Sebby:
                    doesn't reply #2 to this thread help you out?
                    Kind regards,
                    Carlo
                    (Stata 18.0 SE)



                    • #11
                      Several thoughts.

                      1. The proportion of differences < 0 observed with bootstrapping is not the p-value for a null hypothesis significance test (NHST). The p-value for an NHST is based on comparing an observed statistic to its sampling distribution under the null hypothesis. Bootstrap resampling does produce a sampling distribution, but it does not sample under the null hypothesis.

                      2. If your sample is sufficiently large, the central limit theorem tells us that the sample means in your two groups will have approximately normal sampling distributions, so it will be OK to use the usual normal-theory based inference for comparing group means.

                      3. If your sample is not sufficiently large for that, then you have a much bigger problem than the issue of normal-theory vs non-parametric inference. Given the highly skewed nature of cost data in most settings, if your sample is not large enough, you will not have adequately sampled the upper tail of the distribution, which means that your sample means will have a high probability of being far from the population means.

                      So, I wouldn't look for some exotic way to do inference here: if your sample isn't big enough to justify normal-theory inference about the means, it isn't even close to large enough to provide useful mean estimates in the first place.
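To make points 1 and 2 concrete, here is a small sketch on Stata's shipped auto dataset, since no cost data were posted. As stand-ins, price plays the role of cost and foreign the role of the 0/1 group indicator. The first command is the usual normal-theory comparison of means. The second is a permutation test, shown only to illustrate what resampling under the null hypothesis looks like, not as a recommendation from this post. The names p_welch and diff, the seed and reps(1000) are illustrative assumptions.

```stata
* Stand-in data: price plays the role of cost, foreign the 0/1 group indicator.
sysuse auto, clear

* Point 2: normal-theory comparison of group means, allowing unequal variances.
ttest price, by(foreign) unequal
scalar p_welch = r(p)
display "Two-sided p-value, unequal-variance t test: " %6.4f p_welch

* Point 1: a permutation test shuffles the group labels, so the difference in
* means is recomputed under the null hypothesis of no group difference.
* The seed and the number of permutations are illustrative assumptions.
permute foreign diff = (r(mu_2) - r(mu_1)), reps(1000) seed(12345) nodots: ttest price, by(foreign)
```

The contrast with the bootstrap is that permute breaks the link between group and outcome before recomputing the statistic, whereas the bootstrap resamples with that link left intact.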

