  • Using bootstrap on performance calculations (sens, spec, PPV, NPV) to obtain CI and standard error

    Hi everyone,

    I have spent the weekend browsing this excellent forum and the web for a solution to my problem, but unfortunately without succeeding. Any help would be very much appreciated.

    In Stata 16.1, I have a dataset of 18 patients, of whom 8 have viable disease, and we are testing a biomarker. Four tests are positive, and three of those are true positives. The data are added below; the variable PADRPLND indicates viable disease and the variable POSMIR a positive marker.

    I have done performance calculations to obtain sensitivity, specificity, PPV, and NPV, either with the diagt command or with logistic regression followed by estat classification.

    I now need to add a bootstrap with 1,000 replications to obtain the 95% CI and the standard error for each of these four measures.

    The main problem seems to be that I can't store the previous estimates for the bootstrap procedure to use. I have therefore tried to write a program with the calculations for the bootstrap to loop over, and I seem to have success with sensitivity and specificity but not with PPV and NPV.

    I also need an upper bound so that the CIs don't surpass 100% in any of the calculations.

    Thank you for your help and I hope this is somewhat clear, since it's my first post.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int Id byte(PADRPLND POSMIR)
    7019 0 0
    7020 1 0
    7036 0 0
    7038 0 0
    7056 0 1
    7067 0 0
    7071 0 0
    7074 1 0
    7075 0 0
    7078 1 0
    7081 0 0
    7084 0 0
    7506 1 1
    7507 1 1
    2045 1 0
    2057 1 1
    2220 0 0
    2236 1 0
    end
    
    
    logit PADRPLND POSMIR
    estat classification

    scalar sensitivity = r(Sensitivity)
    scalar specificity = r(Specificity)
    scalar ppv = r(Positive predictive value)
    scalar npv = r(Negative predictive value)

    // Bootstrap sensitivity
    bootstrap, reps(1000) seed(12345):
        // Calculate sensitivity
        r(sensitivity) = sens_est

    // Bootstrap specificity
    bootstrap, reps(1000) seed(12345):
        // Calculate specificity
        r(specificity) = spec_est

    // Bootstrap PPV
    bootstrap, reps(1000) seed(12345):
        // Calculate PPV
        r(ppv) = ppv_est

    // Bootstrap NPV
    bootstrap, reps(1000) seed(12345):
        // Calculate NPV
        r(npv) = npv_est
    With this, Stata replies "last estimates not found".

    Also if I use,

    Code:
    
    di "Sensitivity: " sensitivity
    di "Specificity: " specificity
    di "PPV: " ppv
    di "NPV: " npv
    I only get dots, which I think means that the estimates are not stored.

    The other approach I tried uses this code I found on the forum:

    Code:
    capture program drop operating_characteristics
    program define operating_characteristics, rclass
        summ POSMIR if PADRPLND, meanonly
        local sensitivity = r(mean)
        summ POSMIR if !PADRPLND, meanonly
        local specificity = 1 - r(mean)
        return scalar sensitivity = `sensitivity'
        return scalar specificity = `specificity'
        return scalar likelihood_ratio = `sensitivity'/(1-`specificity')
        exit
    end
    
    bootstrap sens = r(sensitivity) spec = r(specificity) lr = r(likelihood_ratio), ///
        reps(500): operating_characteristics
    These are the results

    Code:
    Bootstrap results                               Number of obs     =         18
                                                    Replications      =        325
    
          command:  operating_characteristics
             sens:  r(sensitivity)
             spec:  r(specificity)
               lr:  r(likelihood_ratio)
    
    ------------------------------------------------------------------------------
                 |   Observed   Bootstrap                         Normal-based
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            sens |       .375   .1804689     2.08   0.038     .0212874    .7287126
            spec |         .9    .075812    11.87   0.000     .7514113    1.048589
              lr |       3.75   2.094575     1.79   0.073    -.3552913    7.855291
    ------------------------------------------------------------------------------
    Note: One or more parameters could not be estimated in 175 bootstrap
          replicates; standard-error estimates include only complete replications.
    The problem here is that the upper limit of the interval for spec exceeds 100%.

    If anyone could please help me expand the code for PPV and NPV, my problem would be solved. Thank you!

  • #2
    As far as I can see, the observation that the upper limit of the CI exceeds 1 (or 100%), and it could equally happen that the lower limit falls below 0, is due to the fact that the CIs calculated here are normal-based. Proportions, however, especially those close to 0 or 1, should have asymmetric CIs. See, for example, Brown et al. (2001) [the reference is given below].
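
    As an aside, exact or Wilson confidence intervals for these proportions can also be obtained directly from the 2x2 counts in the classification table shown further below; a minimal sketch (not a bootstrap, shown only for comparison):
    Code:
    * Clopper-Pearson (exact) CIs for the four proportions; add the
    * -wilson- option to obtain Wilson intervals instead
    cii proportions  8 3     // sensitivity: 3 of 8 with disease test positive
    cii proportions 10 9     // specificity: 9 of 10 without disease test negative
    cii proportions  4 3     // PPV: 3 of 4 positive tests are true positives
    cii proportions 14 9     // NPV: 9 of 14 negative tests are true negatives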

    At the risk of committing a serious blunder (I don't feel that I am an expert here, so others should correct me!): if you are looking for 95% CIs of PPV and NPV, you need no bootstrap procedure but can calculate these values directly from the logistic regression results. To obtain asymmetric confidence intervals you can use margins to predict logits and subsequently transform the logits using transform_margins (see here and in the syntax):
    Code:
    . input int Id byte(PADRPLND POSMIR)
    
               Id  PADRPLND    POSMIR
      1. 7019 0 0
      2. 7020 1 0
      3. 7036 0 0
      4. 7038 0 0
      5. 7056 0 1
      6. 7067 0 0
      7. 7071 0 0
      8. 7074 1 0
      9. 7075 0 0
     10. 7078 1 0
     11. 7081 0 0
     12. 7084 0 0
     13. 7506 1 1
     14. 7507 1 1
     15. 2045 1 0
     16. 2057 1 1
     17. 2220 0 0
     18. 2236 1 0
     19. end
    
    . // Install transform_margins if necessary:
    . cap which transform_margins
    
    . if _rc net install transform_margins, from(http://www.stata.com/users/jpitblado)
    
    . logit PADRPLND i.POSMIR
    
    Iteration 0:  Log likelihood = -12.365308  
    Iteration 1:  Log likelihood = -11.376829  
    Iteration 2:  Log likelihood = -11.373933  
    Iteration 3:  Log likelihood = -11.373932  
    
    Logistic regression                                     Number of obs =     18
                                                            LR chi2(1)    =   1.98
                                                            Prob > chi2   = 0.1591
    Log likelihood = -11.373932                             Pseudo R2     = 0.0802
    
    ------------------------------------------------------------------------------
        PADRPLND | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
        1.POSMIR |   1.686398   1.282359     1.32   0.188    -.8269787    4.199775
           _cons |  -.5877867   .5577734    -1.05   0.292    -1.681002     .505429
    ------------------------------------------------------------------------------
    
    . estat classification
    
    Logistic model for PADRPLND
    
                  -------- True --------
    Classified |         D            ~D  |      Total
    -----------+--------------------------+-----------
         +     |         3             1  |          4
         -     |         5             9  |         14
    -----------+--------------------------+-----------
       Total   |         8            10  |         18
    
    Classified + if predicted Pr(D) >= .5
    True D defined as PADRPLND != 0
    --------------------------------------------------
    Sensitivity                     Pr( +| D)   37.50%
    Specificity                     Pr( -|~D)   90.00%
    Positive predictive value       Pr( D| +)   75.00%
    Negative predictive value       Pr(~D| -)   64.29%
    --------------------------------------------------
    False + rate for true ~D        Pr( +|~D)   10.00%
    False - rate for true D         Pr( -| D)   62.50%
    False + rate for classified +   Pr(~D| +)   25.00%
    False - rate for classified -   Pr( D| -)   35.71%
    --------------------------------------------------
    Correctly classified                        66.67%
    --------------------------------------------------
    
    . // The PPV corresponds to the predicted proportion for POSMIR = 1
    . // With transform_margins of the predicted logits you get the logit CI:
    . qui margins POSMIR, predict(xb)
    
    . transform_margins invlogit(@)
    ----------------------------------------------
                 |         b         ll         ul
    -------------+--------------------------------
          POSMIR |
              0  |  .3571429   .1569628   .6237343
              1  |  .7499999   .2378398   .9664886
    ----------------------------------------------
    
    . // The NPV corresponds to the predicted proportion for POSMIR = 0
    . // if PADRPLND is reversed:
    . // With transform_margins of the predicted logits you get the logit CI:
    . recode PADRPLND (1=0) (0=1), gen(PADRPLND_r)
    (18 differences between PADRPLND and PADRPLND_r)
    
    . logit PADRPLND_r i.POSMIR
    
    Iteration 0:  Log likelihood = -12.365308  
    Iteration 1:  Log likelihood = -11.376829  
    Iteration 2:  Log likelihood = -11.373933  
    Iteration 3:  Log likelihood = -11.373932  
    
    Logistic regression                                     Number of obs =     18
                                                            LR chi2(1)    =   1.98
                                                            Prob > chi2   = 0.1591
    Log likelihood = -11.373932                             Pseudo R2     = 0.0802
    
    ------------------------------------------------------------------------------
      PADRPLND_r | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
        1.POSMIR |  -1.686398   1.282359    -1.32   0.188    -4.199775    .8269787
           _cons |   .5877867   .5577734     1.05   0.292     -.505429    1.681002
    ------------------------------------------------------------------------------
    
    . qui margins POSMIR, predict(xb)
    
    . transform_margins invlogit(@)
    ----------------------------------------------
                 |         b         ll         ul
    -------------+--------------------------------
          POSMIR |
              0  |  .6428571   .3762657   .8430372
              1  |  .2500001   .0335114   .7621602
    ----------------------------------------------
    Reference:
    • Brown, L. D., Cai, T. T., & DasGupta, A. (2001). Interval estimation for a binomial proportion. Statistical Science, 16, 101–133.
    Last edited by Dirk Enzmann; 12 May 2024, 15:07.



    • #3
      Thank you so much, Prof. Enzmann, for the great reply and code. However, the request to bootstrap my performance calculations was made by the chief statistician at a medical journal, so I have to use the bootstrap in my analysis.



      • #4
        Originally posted by Anna Thor View Post
        . . . the request to bootstrap my performance calculations was made by the chief statistician at a medical journal, so I have to use the bootstrap in my analysis.
        Just for my edification, I'd like to confirm the following.

        You presented the medical journal this table:

        .
        . version 18.0

        .
        . clear *

        .
        . quietly input int Id byte(PADRPLND POSMIR)

        .
        . label define Findings 0 Negative 1 Positive

        . label values POSMIR Findings

        . label define Truth 0 Nodisease 1 Disease

        . label values PADRPLND Truth

        . tabulate POSMIR PADRPLND

                   |       PADRPLND
            POSMIR | Nodisease    Disease |     Total
        -----------+----------------------+----------
          Negative |         9          5 |        14
          Positive |         1          3 |         4
        -----------+----------------------+----------
             Total |        10          8 |        18

        .
        . exit

        end of do-file


        .


        And the journal's chief statistician said that you must bootstrap that?



        • #5
          Correct.
          I have two additional groups of patients; one of them, for instance, has 24 patients, 23 with viable disease and 17 with a positive test. Very simple statistics, but the bootstrap is requested for the paper to be considered for acceptance.
          Last edited by Anna Thor; 13 May 2024, 04:56.



          • #6
            If they request it, why don't you then simply report the confidence intervals obtained with the bootstrap? Perhaps you should use bias-corrected and accelerated (bca) CIs.
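
            A minimal sketch of how that might look, assuming the operating_characteristics program from the first post is defined in memory; the bca option must be given on the bootstrap prefix itself, and the percentile, bias-corrected, and BCa intervals are computed from the replicates, so they cannot exceed the [0, 1] bounds:
            Code:
            * bootstrap with BCa support (assumes -operating_characteristics-
            * from the first post is already defined)
            bootstrap sens = r(sensitivity) spec = r(specificity), ///
                reps(1000) seed(12345) bca: operating_characteristics

            * report percentile, bias-corrected, and BCa intervals alongside
            * the normal-based one
            estat bootstrap, all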



            • #7
              Originally posted by Dirk Enzmann View Post
              If they request it, why then don't you simply report the confidence intervals obtained with bootstrap? Perhaps you should use bias-corrected and accelerated (bca) CIs.
              I am sorry if I was unclear; I am able to do the bootstrap for sensitivity and specificity, but I can't figure out how to extend the bootstrap code to PPV and NPV. Below is the code I found on the forum and used.

              So I am kindly requesting help to add the bootstrap for PPV and NPV, either by the commands below or any other strategy.

              Code:
                
              capture program drop operating_characteristics
              program define operating_characteristics, rclass
                  summ POSMIR if PADRPLND, meanonly
                  local sensitivity = r(mean)
                  summ POSMIR if !PADRPLND, meanonly
                  local specificity = 1 - r(mean)
                  return scalar sensitivity = `sensitivity'
                  return scalar specificity = `specificity'
                  return scalar likelihood_ratio = `sensitivity'/(1-`specificity')
                  exit
              end

              bootstrap sens = r(sensitivity) spec = r(specificity) lr = r(likelihood_ratio), ///
                  reps(500): operating_characteristics
              You can see the results of this at the end of my first post; I was not able to copy them into the forum in a format with a reasonable appearance.

              And thank you!
              Last edited by Anna Thor; 13 May 2024, 10:17.



              • #8
                What about
                Code:
                clear
                input int Id byte(PADRPLND POSMIR)
                7019 0 0
                7020 1 0
                7036 0 0
                7038 0 0
                7056 0 1
                7067 0 0
                7071 0 0
                7074 1 0
                7075 0 0
                7078 1 0
                7081 0 0
                7084 0 0
                7506 1 1
                7507 1 1
                2045 1 0
                2057 1 1
                2220 0 0
                2236 1 0
                end
                
                logit PADRPLND POSMIR
                estat classification
                
                capture program drop operating_characteristics
                program define operating_characteristics, rclass    
                   summ POSMIR if PADRPLND, meanonly    
                   local sensitivity = r(mean)    
                   summ POSMIR if !PADRPLND, meanonly    
                   local specificity = 1 - r(mean)
                   summ PADRPLND if POSMIR, meanonly
                   local PPV = r(mean)
                   summ PADRPLND if !POSMIR, meanonly
                   local NPV = 1 - r(mean)
                   return scalar sensitivity = `sensitivity'    
                   return scalar specificity = `specificity'
                   return scalar PPV = `PPV'
                   return scalar NPV = `NPV'
                   return scalar likelihood_ratio = `sensitivity'/(1-`specificity')    
                end  
                
                bootstrap sens = r(sensitivity) spec = r(specificity) lr = r(likelihood_ratio) ///
                          PPV = r(PPV) NPV = r(NPV), ///    
                          reps(1700): operating_characteristics
                If you wonder why, try to make sense of the symbols in Pr(...) in the following output table:
                Code:
                --------------------------------------------------
                Sensitivity                     Pr( +| D)   37.50%
                Specificity                     Pr( -|~D)   90.00%
                Positive predictive value       Pr( D| +)   75.00%
                Negative predictive value       Pr(~D| -)   64.29%
                --------------------------------------------------
                By the way: you should run at least 1,000 bootstrap replications, but because so many replications fail here, I set reps() to 1,700. The failures are most likely due to the small sample size (and/or small cell frequencies), which makes the analysis a questionable undertaking.



                • #9
                  Thank you so much! I will try this. So grateful for your help.



                  • #10
                    To be able to reproduce the results you should use the seed() option, e.g. seed(12345).
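
                    For instance, post #8's call with the seed added would look like this (a sketch; the program and data are as in post #8):
                    Code:
                    bootstrap sens = r(sensitivity) spec = r(specificity) lr = r(likelihood_ratio) ///
                              PPV = r(PPV) NPV = r(NPV), ///
                              reps(1700) seed(12345): operating_characteristics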



                    • #11
                      Dear prof Enzmann and the rest of the users of Statalist,

                      It worked fine and I am so grateful for your help.

                      However, I was not able to use the same code for my next group of patients, because in this group the predictor variable perfectly predicts the outcome variable with 100% specificity, and Stata therefore drops the 17 patients with viable disease and positive tests from the logistic regression.

                      I have therefore used firthlogit instead. However, the estat classification command that I previously used is not available after firthlogit or exlogistic. I have made a program called my_stats to calculate sens, spec, PPV, and NPV. However, when I add the bootstrap, Stata replies "'r(sensitivity)' evaluated to missing in full sample". Is this the same issue repeating...?


                      Code:
                      clear
                      input int Id byte(PADRPLND POSMIR)
                      7009 1 1
                      7013 1 1
                      7014 1 1
                      7026 1 0
                      7027 1 0
                      7028 1 1
                      7030 1 1
                      7037 1 1
                      7040 1 1
                      7046 1 1
                      7049 0 0
                      7052 1 0
                      7053 1 1
                      7055 1 0
                      7062 1 1
                      7079 1 1
                      7085 1 0
                      7502 1 1
                      7504 1 1
                      2070 1 1
                      2076 1 1
                      2094 1 0
                      2184 1 1
                      2192 1 1
                      end


                      Code:
                      . diagt PADRPLND POSMIR
                      
                                 |        POSMIR
                        PADRPLND |      Pos.       Neg. |     Total
                      -----------+----------------------+----------
                        Abnormal |        17          6 |        23
                          Normal |         0          1 |         1
                      -----------+----------------------+----------
                           Total |        17          7 |        24
                      True abnormal diagnosis defined as PADRPLND = 1
                      
                      
                                                                        [95% Confidence Interval]
                      ---------------------------------------------------------------------------
                      Prevalence                         Pr(A)     95.8%     78.9%      99.9%
                      ---------------------------------------------------------------------------
                      Sensitivity                      Pr(+|A)     73.9%     51.6%     89.8%
                      Specificity                      Pr(-|N)    100.0%      2.5%    100.0%
                      ROC area               (Sens. + Spec.)/2      0.87         .      1.00
                      ---------------------------------------------------------------------------
                      Likelihood ratio (+)     Pr(+|A)/Pr(+|N)         .         .         .
                      Likelihood ratio (-)     Pr(-|A)/Pr(-|N)      0.26      0.13      0.52
                      Odds ratio                   LR(+)/LR(-)         .      0.00         .
                      Positive predictive value        Pr(A|+)    100.0%     80.5%    100.0%
                      Negative predictive value        Pr(N|-)     14.3%      0.4%     57.9%
                      ---------------------------------------------------------------------------
                      
                        Missing values or confidence intervals may be estimated
                        using the -sf- or -sf0- options.

                      Code:
                      
                      program define my_stats, rclass
                          firthlogit PADRPLND POSMIR
                          predict prob

                          gen predicted_class = (prob > 0.5)

                          tabulate PADRPLND predicted_class, matcell(confusion_matrix)

                          local TP = confusion_matrix[2,2]  // True Positives
                          local TN = confusion_matrix[1,1]  // True Negatives
                          local FP = confusion_matrix[1,2]  // False Positives
                          local FN = confusion_matrix[2,1]  // False Negatives

                          local sensitivity = `TP' / (`TP' + `FN')
                          local specificity = `TN' / (`TN' + `FP')
                          local PPV = `TP' / (`TP' + `FP')
                          local NPV = `TN' / (`TN' + `FN')

                          return scalar sensitivity = `sensitivity'
                          return scalar specificity = `specificity'
                          return scalar PPV = `PPV'
                          return scalar NPV = `NPV'
                      end
                      Code:
                      
                      . my_stats
                      
                      initial:       penalized log likelihood = -4.9879767
                      rescale:       penalized log likelihood = -4.9879767
                      Iteration 0:   penalized log likelihood = -4.9879767  
                      Iteration 1:   penalized log likelihood = -3.7872439  
                      Iteration 2:   penalized log likelihood = -3.7558953  
                      Iteration 3:   penalized log likelihood = -3.7558086  
                      Iteration 4:   penalized log likelihood = -3.7558086  
                      
                                                                      Number of obs     =         24
                                                                      Wald chi2(1)      =       1.52
                      Penalized log likelihood = -3.7558086           Prob > chi2       =     0.2181
                      
                      ------------------------------------------------------------------------------
                          PADRPLND |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                            POSMIR |   2.089011   1.696366     1.23   0.218    -1.235804    5.413826
                             _cons |   1.466337   .9058217     1.62   0.105    -.3090408    3.241715
                      ------------------------------------------------------------------------------
                      
                                 | predicted_
                                 |   class
                        PADRPLND |         1 |     Total
                      -----------+-----------+----------
                               0 |         1 |         1
                               1 |        23 |        23
                      -----------+-----------+----------
                           Total |        24 |        24
                      Now the last step, which does not work:

                      Code:
                      
                      . bootstrap r(sensitivity) r(specificity) r(PPV) r(NPV), reps(2000) seed(12345): my_stats
                      (running my_stats on estimation sample)
                      'r(sensitivity)' evaluated to missing in full sample
                      Any help would be very much appreciated. Thank you in advance.
                      Last edited by Anna Thor; 15 May 2024, 13:38.



                      • #12
                        But what else would you expect from a sample size of 1 in the group defined by PADRPLND = 0? You simply cannot do things that are impossible.
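
                        To make this concrete with a quick back-of-the-envelope check of my own: with only a single PADRPLND = 0 patient among the 24, roughly 36% of bootstrap resamples contain no PADRPLND = 0 observation at all, and in every such replicate specificity and NPV are undefined.
                        Code:
                        * probability that a resample of size 24 omits the single
                        * PADRPLND = 0 patient: (23/24)^24, about 0.36
                        di (23/24)^24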



                        • #13
                          Thank you again, Prof. Enzmann, for your help. That is what we thought too; however, this has been a specific request made by the chief statistician, yes, twice. I have now sent this issue to the chief statistician at our department for review.

