Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • For each observation find the variable with min value

    Hello everybody,

    I am working on a dataset in which there is a number of variables reporting to distance between the obs position and a specific party position on a given issue. Each column corresponds to a specific political party and in each cell I have the difference between the obs and the party. Now, I want to generate a variable recording the min value among those that are available in each row and a new variable in which I record the name of the party that is the closest one to the obs. I tried to run this command:

    egen min=rowmin(distance*) > gen min_var = "" > foreach v of varlist distance* { > replace min_var = "`v'" if `v' == min > } Indeed, it works, but it does not handle with ties. How to deal with the situation in which the min differences for one specific observation are the same? Thank you Davide

  • #2
    Please edit your code using CODE delimiters and line breaks. Just click on the A icon to get access to the # key that inserts delimiters.

    What do you want to happen when there are ties?

    Comment


    • #3
      These are the commands I used to get two variables (1- Min value; 2- Closest party)


      egen min=rowmin(distance*)
      gen min_var=""
      foreach v of varlist distance* {
      replace min_var = "`v'" if `v' == min
      }

      In case of ties, I would like to select the party having the highest value in another scale I already have in the dataset. Let's say that the min. distance between the obs and Party1 and Party2 is the same, I would like to select and record in another variable the party that has the highest score on a different scale. In my dataset this variable is called Entrepreneurship and I have one variable for each political party (Entrepreneurship1; Entrpreneurship2;...).

      Thank you for your help

      Comment


      • #4
        This may help with technique. We use the sort order to break ties. As you want minimum distance yet (in the event of ties) maximum entrepreneurship the sorting is on distance and negated entrepreneurship. Study the list results to see what happens at each stage. If puzzled, add yet more list commands.

        Code:
        clear
        input id d1 d2 e1 e2 
        1 2 2 1 2 
        2 3 4 1 2 
        end 
        
        reshape long d e, i(id)  
        list 
        
        gen nege = -e 
        bysort id (d nege): gen minvar = _j[1] 
        drop nege 
        
        reshape wide d e, i(id) j(_j) 
        list

        Comment


        • #5
          Thank you for your reply Mr. Cox. Unfortunately, your solution is still unclear to me. I try to be more precise. The rationale behind these operations in stata is that I want to know:

          1- The min difference on a given scale between each observation and a series of political parties on a given issue (left/right position);
          2- The name of the party that is consequentially the closest one to the observation.

          I would like to end up with two variables: the first one reporting for each observation the min distance (difference) from a given party; the second one reporting the name of the party that is the closest one to the obs.

          The structure of the dataset is the following: I have more than 20000 observations (respondents to a survey) and I have included a series of variables reporting the position of a number of political parties on a given issue. So, I have a column for each single party reporting the position of the party itself.

          I can calculate the distance of respondents from each political party on a given issue (say, left/right position). I did it and I got a number of variables that is equal to the overall number of parties for which I have the data. Each of these variables reports the distance between the obs and the party. Then I want to generate a variable reporting the min values among the differences I calculated before (the min value ofr each observation) and a variable reporting the name of the party for which the difference is the lowest.
          In case of two equal values, I want to report the name of the party that has the highest score on another scale (Entrepreneurship).
          I was wondering if there is a more intuitive way to get these results.

          Thank you again,
          Davide

          Comment


          • #6
            Sorry, but a long word-based explanation isn't more precise or clearer to me (and as no one else has jumped in others may agree). I think you would be better off giving a realistic example and showing numerically what you want.

            Comment


            • #7
              Sorry for bothering you, I understand it might be not really clear what I am looking for. I try to give a clear example.
              This is the kind of dataset I have. Let's take as an example Obs 1. The left/right position for Obs 1 is equal to 2. The difference between the position of Party 1 and Obs. 1 is 2;the difference between the position between Party 2 and Obs. 1 is 5. The difference between Party3 and Obs 1 is 2. Now, I want to report for Obs 1 the min value between its position and the position of the parties included in the dataset. The min value here is 2 and I want this value being reported in the column "Min value". Also, I want to report the name of the party that is the closest one to observation 1. In this example, there are two parties with the same distance from Obs 1: Party1 and Party3. All this given and considering only the two parties with the lowest distance from obs 1, I want to report the name of the party that has the highest value of Entrepreneurship. In this example Party 3 has the highest Entrepreneurship score. So, for Obs 1 I will have these results: Min value=2; Closest Party: Party3. Now, I would like to obtain these results with stata. Hope this might be more clear than my previous explanations. Thank you for your patience!
              Obs Left/Right position Obs (0-10 scale) Left/Right position Party 1 Left/Right position Party2 Left/Right position Party3 Entrep. Party 1 Entrep. Party 2 Entrep. Party 3 Min. Value Closest Party
              1 2 4 7 4 2 3 4
              2 4 4 7 4 2 3 4
              3 6 4 7 4 2 3 4
              4 5 4 7 4 2 3 4
              5 7 4 7 4 2 3 4

              Comment


              • #8
                Thanks for the example. The main trick here is exactly as in #4. Here's all my code first and then a version with output. Please note the use of dataex (SSC).

                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input byte(obs position Party1 Party2 Party3 Entrep1 Entrep2 Entrep3)
                1 2 4 7 4 2 3 4 
                2 4 4 7 4 2 3 4 
                3 6 4 7 4 2 3 4 
                4 5 4 7 4 2 3 4 
                5 7 4 7 4 2 3 4 
                end
                
                * sorting is much easier after a -reshape- 
                reshape long Party Entrep, i(obs) j(which) 
                
                * note the abs() as I presume that the difference between 2 and 7 is to be treated as equivalent to that between 7 and 2 
                gen Min = abs(pos - Party)
                
                * we'll want to sort the lowest Min and the highest Entrep to the same place; we do that by negating Entrep 
                gen NegEntrep = - Entrep 
                
                sort obs Min NegE 
                
                list , sepby(obs) 
                
                by obs : replace Min = Min[1] 
                by obs : gen Closest = which[1] 
                
                list, sepby(obs) 
                
                drop NegE 
                reshape wide Party Entrep , i(obs) j(which) 
                
                rename Party* P* 
                rename Entrep* E* 
                list obs pos P* E* Min Closest
                Code:
                . * Example generated by -dataex-. To install: ssc install dataex
                . clear
                
                . input byte(obs position Party1 Party2 Party3 Entrep1 Entrep2 Entrep3)
                
                          obs  position    Party1    Party2    Party3   Entrep1   Entrep2   Entrep3
                  1. 1 2 4 7 4 2 3 4 
                  2. 2 4 4 7 4 2 3 4 
                  3. 3 6 4 7 4 2 3 4 
                  4. 4 5 4 7 4 2 3 4 
                  5. 5 7 4 7 4 2 3 4 
                  6. end
                
                 
                . reshape long Party Entrep, i(obs) j(which) 
                (note: j = 1 2 3)
                
                Data                               wide   ->   long
                -----------------------------------------------------------------------------
                Number of obs.                        5   ->      15
                Number of variables                   8   ->       5
                j variable (3 values)                     ->   which
                xij variables:
                                   Party1 Party2 Party3   ->   Party
                                Entrep1 Entrep2 Entrep3   ->   Entrep
                -----------------------------------------------------------------------------
                
                 
                . gen Min = abs(pos - Party)
                
                . gen NegEntrep = - Entrep 
                
                
                . sort obs Min NegE 
                
                 
                . list , sepby(obs) 
                
                     +----------------------------------------------------------+
                     | obs   which   position   Party   Entrep   Min   NegEnt~p |
                     |----------------------------------------------------------|
                  1. |   1       3          2       4        4     2         -4 |
                  2. |   1       1          2       4        2     2         -2 |
                  3. |   1       2          2       7        3     5         -3 |
                     |----------------------------------------------------------|
                  4. |   2       3          4       4        4     0         -4 |
                  5. |   2       1          4       4        2     0         -2 |
                  6. |   2       2          4       7        3     3         -3 |
                     |----------------------------------------------------------|
                  7. |   3       2          6       7        3     1         -3 |
                  8. |   3       3          6       4        4     2         -4 |
                  9. |   3       1          6       4        2     2         -2 |
                     |----------------------------------------------------------|
                 10. |   4       3          5       4        4     1         -4 |
                 11. |   4       1          5       4        2     1         -2 |
                 12. |   4       2          5       7        3     2         -3 |
                     |----------------------------------------------------------|
                 13. |   5       2          7       7        3     0         -3 |
                 14. |   5       3          7       4        4     3         -4 |
                 15. |   5       1          7       4        2     3         -2 |
                     +----------------------------------------------------------+
                
                 
                . by obs : replace Min = Min[1] 
                (7 real changes made)
                
                . by obs : gen Closest = which[1] 
                
                 
                . list, sepby(obs) 
                
                     +--------------------------------------------------------------------+
                     | obs   which   position   Party   Entrep   Min   NegEnt~p   Closest |
                     |--------------------------------------------------------------------|
                  1. |   1       3          2       4        4     2         -4         3 |
                  2. |   1       1          2       4        2     2         -2         3 |
                  3. |   1       2          2       7        3     2         -3         3 |
                     |--------------------------------------------------------------------|
                  4. |   2       3          4       4        4     0         -4         3 |
                  5. |   2       1          4       4        2     0         -2         3 |
                  6. |   2       2          4       7        3     0         -3         3 |
                     |--------------------------------------------------------------------|
                  7. |   3       2          6       7        3     1         -3         2 |
                  8. |   3       3          6       4        4     1         -4         2 |
                  9. |   3       1          6       4        2     1         -2         2 |
                     |--------------------------------------------------------------------|
                 10. |   4       3          5       4        4     1         -4         3 |
                 11. |   4       1          5       4        2     1         -2         3 |
                 12. |   4       2          5       7        3     1         -3         3 |
                     |--------------------------------------------------------------------|
                 13. |   5       2          7       7        3     0         -3         2 |
                 14. |   5       3          7       4        4     0         -4         2 |
                 15. |   5       1          7       4        2     0         -2         2 |
                     +--------------------------------------------------------------------+
                
                 
                . drop NegE 
                
                . reshape wide Party Entrep , i(obs) j(which) 
                (note: j = 1 2 3)
                
                Data                               long   ->   wide
                -----------------------------------------------------------------------------
                Number of obs.                       15   ->       5
                Number of variables                   7   ->      10
                j variable (3 values)             which   ->   (dropped)
                xij variables:
                                                  Party   ->   Party1 Party2 Party3
                                                 Entrep   ->   Entrep1 Entrep2 Entrep3
                -----------------------------------------------------------------------------
                
                 
                . rename Party* P* 
                
                . rename Entrep* E* 
                
                . list obs pos P* E* Min Closest 
                
                     +--------------------------------------------------------------+
                     | obs   position   P1   P2   P3   E1   E2   E3   Min   Closest |
                     |--------------------------------------------------------------|
                  1. |   1          2    4    7    4    2    3    4     2         3 |
                  2. |   2          4    4    7    4    2    3    4     0         3 |
                  3. |   3          6    4    7    4    2    3    4     1         2 |
                  4. |   4          5    4    7    4    2    3    4     1         3 |
                  5. |   5          7    4    7    4    2    3    4     0         2 |
                     +--------------------------------------------------------------+

                Comment

                Working...
                X