  • heckroc: New Stata command for creating ROC curves from selected samples

    The Stata command heckroc is now available on SSC (ssc install heckroc). It implements a procedure for constructing ROC curves from selected samples.

    Receiver operating characteristic (ROC) curves are widely used in many fields to measure the performance of ratings. An advantage of ROC curves over metrics like accuracy (the proportion of cases correctly predicted) is that ROC curves show the full range of trade-offs between true positives and false positives. Despite their widespread use, the effects of sample selection on ROC curves were not explored until recently.

    Sample selection is common in many areas. Consider a medical test that is administered only to patients who are referred by their physicians. We want to know how well the test diagnoses illness, but we observe test results only for referred patients. A different, but related, problem arises in commercial banking. The Basel Accords require banks to estimate the probability of default for their loans. To assess the predictive performance of their probability of default models, banks could construct an ROC curve, but only with the sample of loan applicants who were granted loans, because default is never observed for rejected applicants.
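
    To make the selection problem concrete, here is a minimal simulation sketch using only built-in Stata commands. It is not taken from Cook (2017) or from the heckroc help file; the variable names (x, z, y, s), the correlation of 0.5, and the sample size are all illustrative assumptions. It compares roctab on the full simulated sample with roctab on the selected cases only.

```stata
* Simulation sketch: all variable names and parameter values are illustrative
clear
set seed 2017
set obs 5000

* Correlated unobservables in the outcome (e) and selection (u) equations
matrix C = (1, 0.5 \ 0.5, 1)
drawnorm e u, corr(C)

generate x = rnormal()              // rating (e.g. a test score or credit score)
generate z = rnormal()              // affects selection only
generate byte y = (x + e > 0)       // true outcome (e.g. illness or default)
generate byte s = (x + z + u > 0)   // 1 if the case is selected (referred / granted a loan)

* ROC analysis on the full sample, which is not available in practice
roctab y x

* Naive ROC analysis restricted to the selected cases
roctab y x if s == 1
```

    In a setup like this, the two reported areas under the curve will generally differ, because the selected cases are not a random draw from the population. That gap is the problem a selection-corrected ROC curve is meant to address.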

    Hand and Adams (2014) appear to have been the first to discuss selection bias for ROC curves. Cook (2017) presents a procedure for plotting an ROC curve that consistently estimates the ROC curve that would be obtained from a random sample. The heckroc command implements Cook's procedure and provides confidence intervals for the area under the curve and confidence bands for the ROC curve.

    Stata already has several commands for plotting ROC curves, including roctab and roccomp, but none of them corrects for the effects of sample selection. The syntax of heckroc was kept close to that of existing Stata commands for sample-selection problems, i.e., heckman, heckprobit, and heckoprobit. Like heckprobit and heckoprobit, heckroc is based on assumptions similar to those of Heckman (1976). The output from heckroc was designed to be similar to that of Stata's built-in commands for ROC curves.
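
    For readers who have not used the built-in sample-selection commands, the sketch below shows the general pattern of heckprobit, whose syntax is heckprobit depvar indepvars, select(depvar_s = varlist_s). It runs on simulated data with illustrative variable names and parameter values (all assumptions, not from the original post). It shows heckprobit only; the exact syntax and options of heckroc are not reproduced here and should be taken from its help file.

```stata
* Simulated data: names and parameter values are illustrative assumptions
clear
set seed 2017
set obs 5000
matrix C = (1, 0.5 \ 0.5, 1)
drawnorm e u, corr(C)               // correlated errors, as in a Heckman-type model
generate x = rnormal()
generate z = rnormal()              // enters the selection equation only
generate byte s = (x + z + u > 0)   // selection indicator
generate byte y_obs = (x + e > 0) if s == 1   // outcome observed only when selected

* Built-in probit model with sample selection:
* outcome equation before the comma, selection equation in select()
heckprobit y_obs x, select(s = x z)
```

    The variable z appears only in the selection equation, which is the usual exclusion restriction in this family of models.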


    See the help file for syntax and examples:
    Code:
    help heckroc
    References:
    Cook, J. A. (2017). ROC curves and nonrandom data. Pattern Recognition Letters, 85, 35-41. http://dx.doi.org/10.1016/j.patrec.2016.11.015

    Hand, D. J., & Adams, N. M. (2014). Selection bias in credit scorecard evaluation. Journal of the Operational Research Society, 65(3), 408-415.

    Heckman, J. J. (1976). The common structure of statistical models of truncation, sample selection and limited dependent variables and a simple estimator for such models. Annals of Economic and Social Measurement, 5(4), 475-492.