  • find out the cutoff to maximize correct classification based on estat classification

    I am working on a logit model that predicts companies with material weaknesses in their systems. The dependent variable is mw, which equals 1 for companies with a material weakness and 0 for those without. The output reports the rate of correct classification; in the table below (produced by "estat classification, cutoff(0.5)"), this rate is 93.10%. When I tried different cutoffs via the cutoff() option, the correctly classified rate changed. I wonder if there is a way to find the cutoff that maximizes the rate of correct classification. Thanks!

                    -------- True --------
    Classified |         D            ~D  |      Total
    -----------+--------------------------+-----------
         +     |       421           101  |        522
         -     |       229          4029  |       4258
    -----------+--------------------------+-----------
       Total   |       650          4130  |       4780

    Classified + if predicted Pr(D) >= .5
    True D defined as mw != 0
    --------------------------------------------------
    Sensitivity                     Pr( +| D)   64.77%
    Specificity                     Pr( -|~D)   97.55%
    Positive predictive value       Pr( D| +)   80.65%
    Negative predictive value       Pr(~D| -)   94.62%
    --------------------------------------------------
    False + rate for true ~D        Pr( +|~D)    2.45%
    False - rate for true D         Pr( -| D)   35.23%
    False + rate for classified +   Pr(~D| +)   19.35%
    False - rate for classified -   Pr( D| -)    5.38%
    --------------------------------------------------
    Correctly classified                        93.10%
    --------------------------------------------------
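
    For reference, the rates above follow directly from the cell counts; for example:

        display 421/650              // sensitivity:  .6477, i.e. 64.77%
        display 4029/4130            // specificity:  .9755, i.e. 97.55%
        display (421 + 4029)/4780    // correctly classified:  .9310, i.e. 93.10%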



  • #2
    As a practical matter, you have to do this by trial and error. The sensitivity and specificity of your test are not simple parametric functions of the cutoff, so there is no closed-form solution to this kind of maximization problem.
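
    A minimal sketch of that trial-and-error search, assuming the logit model has just been fit (the grid of cutoffs is illustrative; you can confirm the stored-result name with return list after estat classification):

        forvalues i = 1/99 {
            local c = `i'/100
            quietly estat classification, cutoff(`c')
            display "cutoff = " %4.2f `c' "   correctly classified = " %6.2f r(P_corr) "%"
        }

    You can then read off the cutoff with the largest percentage, though for the reasons below that is rarely the right quantity to maximize.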

    That said, why do you want to do this? The correctly classified statistic is pretty much worthless as a measure of test accuracy. First, and most importantly, it fails to distinguish between the two types of misclassification: misses and false hits. Because of that, it does not take into account that one type of error may matter far more than an error in the other direction. You don't say what kind of material you are measuring weaknesses in. But if it is, say, a material that plays a weight-bearing role in the construction of buildings or machinery, a missed weakness can lead to a building collapse and loss of lives, whereas a false hit means that a chunk of material is, at worst, discarded, and perhaps just repurposed, recycled, or salvaged in some other way. Why would you be interested in a measure that counts both of those errors equally? One can also construct scenarios where the false hit is catastrophic and the miss has minimal consequence, and every scenario in between is possible, so it depends entirely on the context. While it is possible that treating both kinds of misclassification equally is appropriate to your situation, it is not a given. Indeed, in most real-life circumstances it is not, and often the disparity of consequences between the two types of error is great. So you need to think that through.

    The second problem with this measure is that, in addition to varying with the cutoff chosen, it also depends critically on the actual prevalence of weaknesses in the specimens to be appraised. Your particular test has low sensitivity and good specificity. If you apply it to a population of specimens in which weaknesses are rare, even non-existent, the test will look very good: there will be few or no weaknesses to be found, so hardly any will be missed, and the non-weakness specimens, which make up almost the entire batch, will be correctly identified as such. So you will have close to 100% correct classification.

    But now apply the same test to a batch in which half the specimens actually contain a weakness. Say there are 1,000 specimens, 500 of which contain weaknesses. Of those 500, the test, with its sensitivity of 64.77%, will pick up about 324 and miss the remaining 176. Of the 500 that do not contain weaknesses, with specificity 97.55%, about 12 will be misclassified. So a total of about 188 out of the thousand will be misclassified, for a correctly classified rate of only about 81%.
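
    In Stata terms, that arithmetic is just:

        display 500*(1 - .6477)            // missed weaknesses: about 176
        display 500*(1 - .9755)            // false hits: about 12
        display (1000 - 176 - 12)/1000     // correctly classified: .812, about 81%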

    Now, depending on how your sample was chosen, it may be that you can count on it being representative of batches to which this test will be applied in the future. In that case, you can base your calculations on the prevalence found in the sample.

    But the better way to choose a cutoff for a test like this is not to maximize the percent correctly classified, but to minimize a loss function that accounts for the adverse consequences of misclassifications of each type as well as their anticipated frequencies.



    • #3
      Thank you very much for your detailed explanations, Clyde! I was replicating a paper, but that paper did not provide much information about how they chose the cutoff to maximize the rate of correct classification. All they stated was "... identifying a cutoff that maximizes the rate of correct classification." This is new to me, and I could not figure out whether there is any way, or any Stata command, to find it. In my data, the occurrence of weaknesses is low (about 10%). Given that low occurrence, should I focus more on sensitivity and specificity, that is, on missed cases and false-positive classifications, rather than on the rate of correct classification? If so, are there any good ways to find a cutoff that minimizes missed cases and false-positive classifications?

      Thank you very much!



      • #4
        While the low occurrence of weaknesses is one factor in deciding what to minimize or maximize, the more important factor is the one you don't mention: the severity of the consequences of the two different types of misclassification. That, in turn, depends on how this material is used, what can happen when it fails as a result of an undetected weakness, and what is done with specimens classified as having a weakness even though they really don't. These are not statistical issues; they are real-world considerations.

        Regardless of how all of the above plays out, there is no simple command to identify the optimal cutoff for defining a positive test. It is a matter of repeated trials of different thresholds. You can automate this by writing a loop that calculates the sensitivity and specificity associated with cutoffs set at 0%, 10%, 20%, ..., 100%. From those you can calculate the loss function, Loss = prevalence of weakness * (1 - sensitivity) * harm of miss + (1 - prevalence of weakness) * (1 - specificity) * harm of false hit, and see which of those cutoffs produces the lowest loss. Then, if you need greater precision, pick the two cutoffs bracketing that one and try a more closely spaced series of cutoffs between them. Rinse and repeat until you have the level of precision you want.
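
        A rough sketch of that loop, run right after fitting the logit model; the prevalence and the two harm weights are placeholders to replace with your own values (check return list after estat classification for the exact stored-result names):

            local p = 0.10                // anticipated prevalence of weaknesses
            local harm_miss  = 10         // placeholder: harm of a missed weakness
            local harm_false = 1          // placeholder: harm of a false hit

            forvalues i = 0/10 {
                local c = `i'/10
                quietly estat classification, cutoff(`c')
                local sens = r(P_p1)/100                  // sensitivity, as a proportion
                local spec = r(P_n0)/100                  // specificity, as a proportion
                local loss = `p'*(1 - `sens')*`harm_miss' + (1 - `p')*(1 - `spec')*`harm_false'
                display "cutoff = " %4.2f `c' "   loss = " %8.4f `loss'
            }

        The cutoff with the smallest loss is the one to carry forward into the finer-grained search described above.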

