
  • Finding observations that are strictly dominated: how do I eliminate multiple loops

    I have a dataset that looks as follows. For each state, I know which individuals are accepted (acc = 1) or rejected (acc = 0) on the basis of two scores, x and y. I want to say that a rejection is bad if there is an individual in that state who is accepted and who has worse scores on both x and y. I can compute this by looping through states, accepted candidates, and rejected candidates; see below. However, I am looking for suggestions on how to make the program more efficient.

    gen obs = _n
    gen bad = 0 if acc == 0

    * states with at least one accepted individual
    glevelsof state if acc == 1
    local stloop "`r(levels)'"

    foreach s in `stloop' {

        * observation numbers of accepted and rejected individuals in this state
        glevelsof obs if state == `s' & acc == 1
        local accit "`r(levels)'"
        glevelsof obs if state == `s' & acc == 0
        local nsuit "`r(levels)'"

        foreach i in `nsuit' {
            foreach ac in `accit' {
                * rejected i has better scores than accepted ac on both x and y
                if x[`i'] > x[`ac'] & y[`i'] > y[`ac'] {
                    replace bad = 1 if obs == `i'
                    continue, break
                }
            }
        }

    }

  • #2
    This doesn't require any explicit loops in Stata.

    Code:
    preserve
    keep if acc == 0
    drop acc
    rename (obs x y) =_r
    tempfile rejected
    save `rejected'
    
    restore
    keep if acc == 1
    drop acc
    rename (obs x y) =_a
    joinby state using `rejected'
    
    by obs_r, sort: egen bad = max((y_a < y_r) & (x_a < x_r) & !missing(x_r, y_r))
    
    by obs_r: keep if _n == 1
    drop x* y* obs_a
    This code will leave in memory a data set containing the values of all rejected obs and an indication of whether they are "bad" or not.

    This would have been simpler if you had posted example data. Because you didn't, this code is untested and may contain errors, though I am confident that the gist of it is correct. You may need to tweak it to fix any typos or other things I've missed in it.
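
    For anyone who wants to try the code above without the original data, a small made-up dataset along the following lines could serve as a check. The values are invented purely for illustration; only the variable names (state, acc, x, y, obs) come from the original post, and state is assumed to be numeric.

    ```stata
    * Hypothetical toy data: two states, three accepted and four rejected individuals
    clear
    input state acc x y
    1 1 3 3
    1 1 8 2
    1 0 5 5
    1 0 2 9
    2 1 4 4
    2 0 4 6
    2 0 7 1
    end

    * Observation number, as in the original post
    gen obs = _n

    list, sepby(state)
    ```

    Working through these values by hand with strict inequalities, only observation 3 is dominated: the accepted individual with (x, y) = (3, 3) in state 1 is worse on both scores. Observation 6 ties an accepted individual on x, so it does not count as bad under the strict comparison.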

    In the future, always show example data when asking for help with code. And always use the -dataex- command to do that. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
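
    As a rough illustration of what that might look like for this question, assuming the variables are named as in the original post (the range of 20 observations is an arbitrary choice):

    ```stata
    * Show the first 20 observations of the relevant variables in a
    * form that others can copy and paste straight into Stata
    dataex state acc x y in 1/20
    ```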
    Last edited by Clyde Schechter; 09 Aug 2022, 16:01.


    • #3
      Thank you so much, and I will include the data next time.


      • #4
        I have never seen a command like "rename (obs x y) =_r".
        I tried to find some documentation online, but I cannot find the right keyword. It seems to be similar to egen .. = group(obs x y). Any suggestions on where to find this discussed?


        • #5
          h rename


          • #6
            Code:
            help rename group


            • #7
              I guess I did not realize that you could give a group of variables one name. I knew one could do that with local variables.


              • #8
                Originally posted by Richard Thomas Boylan
                I guess I did not realize that you could give a group of variables one name. I knew one could do that with local variables.
                -rename group- does not rename a group of variables to a single name, but rather to a group of names. The command is also relatively modern; it must have appeared in Stata 12 or thereabouts.


                • #9
                  I guess I am still confused, because in his program Clyde Schechter writes
                  rename (obs x y) =_a
                  So I see three variables on the LHS of the = sign and only one variable on the RHS of the = sign.


                  • #10
                    That's because you did not read the help file. Go to the help file shown in #6 above, look at point #11, and make sure you read the entire point.
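
                    A minimal sketch of what the command in #2 does, assuming a version of Stata that supports -rename group-. The variables here are throwaway placeholders created only for the demonstration. In the new-name pattern, = stands for each original name in turn, so the suffix is appended to every variable in the list:

                    ```stata
                    * Throwaway one-observation dataset with the three variable names
                    clear
                    set obs 1
                    gen obs = 1
                    gen x = 2
                    gen y = 3

                    * "=" stands for each old name, so this appends _r to all three
                    rename (obs x y) =_r
                    ds        // lists obs_r x_r y_r

                    * Undo, then do the same thing with the new names spelled out
                    rename (obs_r x_r y_r) (obs x y)
                    rename (obs x y) (obs_r x_r y_r)
                    ds        // again obs_r x_r y_r
                    ```

                    So there is not one variable on the right-hand side; =_r is a pattern that expands to one new name per old name.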


                    • #11
                      Thanks


                      • #12
                        I checked, and Clyde Schechter's code gives exactly the same answer as going through the loops. Thanks again!
