  • Loops







    I have two questions regarding two problems that I have been facing so far.

    First, after running a probit model I want to calculate the percentage of correctly called events using different cutoffs. I am trying to do the following:

    first set the cutoffs: y=.09 y=.1 y=.11 y=.12

    for each y{

    estat classification, cutoff(`y')
    }

    Here I want to test different cutoffs for the predicted probability in a loop. Should I use foreach or forvalue command in this case and how to do that?


    Second, I want to calculate the sum of wrongly called event and don't know how to do that by using Stata command.

    After command
    estat classification, cutoff(.12)


    [Attached image: ttttttt.PNG]


    I want to calculate the sum of the two false rates (24.54 + 33.75) in a loop. Is there any way to do that in Stata?


    Thanks in advance














  • #2
    Should I use foreach or forvalue command in this case and how to do that?


    You can do this three different ways:

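    [code]
    // Option 1: foreach over a numlist
    foreach y of numlist .09 .10 .11 .12 {
        estat classification, cutoff(`y')
    }

    // Option 2: foreach over a generic list
    foreach y in .09 .10 .11 .12 {
        estat classification, cutoff(`y')
    }

    // Option 3: forvalues with an integer iterator, rescaled inside the loop
    forvalues yy = 9(1)12 {
        local y = `yy'/100
        estat classification, cutoff(`y')
    }
    [/code]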

    In theory, the -forvalues- approach would be fastest, but in a small loop like this the difference will not be noticeable unless this is buried inside another loop that runs billions of times.

    Note that in the -forvalues- loop I do not use the numbers .09 through .12 directly. That's because floating point arithmetic operations, particularly when repeated, are subject to rounding errors and truncation errors that can lead to serious problems. So the workaround is to use an integer valued iterator and then calculate the corresponding floating point numbers each time through the loop. Now, I strongly suspect that if you did -forvalues yy = .09(.01).12 {- you would get the same results. But in other situations where the number of iterations was larger, or where the increment was much smaller than the starting value, you could get errors. So you may as well form the better habit and iterate with integers whenever possible.
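
    As a small illustration of the rounding issue described above, the sketch below adds .01 a thousand times and compares the result with the value 10 that exact arithmetic would give. The scalar name x is made up for the example, and the exact digits displayed may vary, so no output is shown.

```stata
* Repeatedly adding a floating point increment accumulates rounding error.
scalar x = 0
forvalues i = 1/1000 {
    scalar x = x + .01
}

* Show the accumulated value next to the directly computed one.
display %21.18f x
display %21.18f 1000/100

* Equality test: displays 1 if the two agree exactly, 0 otherwise.
display (x == 10)
```

    With only four iterations, as in the original question, a discrepancy of this kind is unlikely to affect the cutoff, which is why the integer iterator is offered as a habit rather than a necessity here.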

    Second, I want to calculate the sum of wrongly called event and don't know how to do that by using Stata command.

    You can find all of the outputs of -estat classification- stored in r(). The particular two that you have circled here are:
    r(P_n1) and r(P_p0).
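
    Putting the two pieces together, a minimal sketch of the loop might look like this. It assumes the probit model has just been fit, so that -estat classification- can be run; the local macro name false_sum is made up for the example.

```stata
* Loop over cutoffs .09 through .12 using an integer iterator.
forvalues yy = 9/12 {
    local y = `yy'/100

    * Run quietly; the results are still left behind in r().
    quietly estat classification, cutoff(`y')

    * Sum the two false rates returned by -estat classification-.
    local false_sum = r(P_n1) + r(P_p0)
    display "cutoff = " %4.2f `y' "   sum of false rates = " %6.2f `false_sum'
}
```

    Typing -return list- right after -estat classification- shows everything else that is stored in r().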



    • #3
      Thanks a lot Clyde.



      • #4
        Dear Clyde,

        Does the lsens command calculate the same for each cutoff, and if so, how can I get the optimal cutoff from lsens? If not, is there any way to find the optimal cutoff which gives the lowest total misclassification error (missed events + false alarms)?



        • #5
          Does the lsens command calculate the same for each cutoff

          I don't understand what this means.

          If not, is there any way to find the optimal cutoff which gives the lowest total misclassification error (missed events + false alarms)
          You won't get that directly from -lsens-. In fact, I don't think -lsens- is really particularly helpful for this particular purpose. Rather you will have to loop over potential cutoffs and do counts of misclassifications, storing the results in a separate frame (or in a postfile if you are using a version earlier than 16), and then identify the lowest total number of misclassifications.
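
          A minimal sketch of that loop, assuming Stata 16 or later and that the probit model has just been fit. The names outcome (the dependent variable), phat, results, cutoff, and n_misclass are all hypothetical, and the grid of cutoffs from .01 to .99 is just one possible choice. The classification rule mirrors the one -estat classification- uses: an observation is called positive when its predicted probability is at least the cutoff.

```stata
* Predicted probabilities from the model just fit (pr is the default for probit).
predict double phat, pr

* Frame to hold one row per candidate cutoff.
frame create results double cutoff long n_misclass

forvalues c = 1/99 {
    local cut = `c'/100

    * Misclassified: called positive but truly negative, or the reverse.
    quietly count if e(sample) & ((phat >= `cut') != (outcome != 0))
    frame post results (`cut') (r(N))
}

* The first row after sorting has the fewest misclassifications.
frame results {
    sort n_misclass cutoff
    list in 1
}
```

          Ties are broken here in favor of the smaller cutoff; that choice is arbitrary.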

          Please do note, however, that I strongly recommend against this definition of optimal cutoff. It is only optimal in the unlikely event that:

          a) true positives and true negatives are equally likely in the population to which the test is to be applied, and
          b) false positives and false negatives are equally harmful.

          (Or the prevalence of true positive cases and the disutility ratio of false positives to false negatives happen to balance out exactly.)

          Those conditions are seldom true in the real world. If you are looking to optimize something then you have to assign utilities to these outcomes and minimize the total harm done, not the total number of misclassifications.
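
          To make the idea of minimizing harm concrete, here is a sketch that weights the two kinds of error differently. The costs of 5 per false negative and 1 per false positive are invented purely for illustration and would have to come from the actual application. The names outcome, phat2, and the local macros are hypothetical as well.

```stata
* Hypothetical disutilities; replace with values suited to the application.
local cost_fn = 5
local cost_fp = 1

predict double phat2, pr

local best_cut = .
local best_harm = .

forvalues c = 1/99 {
    local cut = `c'/100

    * False negatives: truly positive but predicted probability below the cutoff.
    quietly count if e(sample) & phat2 < `cut' & outcome != 0
    local n_fn = r(N)

    * False positives: truly negative but predicted probability at or above the cutoff.
    quietly count if e(sample) & phat2 >= `cut' & outcome == 0
    local n_fp = r(N)

    local harm = `cost_fn'*`n_fn' + `cost_fp'*`n_fp'

    * Missing (.) sorts above every number, so the first cutoff always replaces it.
    if `harm' < `best_harm' {
        local best_cut = `cut'
        local best_harm = `harm'
    }
}

display "cutoff with lowest total harm = " %4.2f `best_cut' "   harm = " `best_harm'
```

          Setting both costs to 1 reduces this to minimizing the raw count of misclassifications.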

