  • Find the variable with the highest value and mark it as 1, while the other variables are marked 0

    Hello all,
    I have a data set that looks like this:

    Code:
    input int year float(A B C D E)
    1950   .3488717  .9319346  .9011049   .6964867   .3295547
    1951   .2668857  .4548882 .26436493   .9119344   .4144089
    1952   .1366463  .0674011  .8856509   .6795634 .036084738
    1953 .028556867  .3379889   .882112   .3549416  .08438109
    1954   .8689333  .9748848   .748933     .73897 .009876247
    1955   .3508549  .7264384  .9196262  .18740167   .3200437
    1956  .07110509 .04541512  .6934533   .3146128 .005196966
    1957  .32336795  .7459667  .2154026   .1375693  .22754347
    1958   .5551032  .4961259  .8285888   .6537739    .851468
    1959    .875991  .7167162 .04421536  .27013195   .9820066
    1960  .20470947   .859742  .8630378   .8998394 .032479186
    1961   .8927587 .13407555  .3526046   .5734232   .9874847
    1962   .5844658 .48844185  .7720399  .11147037    .894106
    1963   .3697791  .8712187  .5861199   .4145227   .9684734
    1964   .8506309  .7664683  .3227766 .003052204  .23922028
    1965   .3913819 .25125554 .17293066   .6659978   .6927336
    1966  .11966132 .16636477  .8053644   .3462876   .4884359
    1967   .7542434  .7437958  .3060019   .0780235   .4376452
    1968   .6950234  .9805113 .21909967  .12758136   .5858005
    1969   .6866152  .7295772   .724731   .2297006   .3787092
    end
    I want to find which variable has the highest value in each year and mark this as 1, while the other variables are marked 0.
    The data set should then look like this:

    Code:
    input int year float(A B C D E)
    1950   0 1 0 0 0 
    1951   0 0 0 1 0
    1952   0 0 1 0 0
    I've found some user-written commands online that can help me with this. The problem is that I have to write the code in a language that is based on Stata. That is, most of the usual commands work, but user-written commands do not.
    What I think could work is something like this: generate A_biggest == 1 if A > B;C;D;E
    But my lack of coding skills makes this a bit difficult for me.

    Does anyone have any suggestions for how I can manage this?

    Best regards
    Andreas Lille

  • #2
    Andreas:
    welcome to this forum.
    You may want to consider the following code:
    Code:
    foreach var of varlist A-E {
        gen wanted_`var' = .
    }

    foreach var of varlist A-E {
        replace wanted_`var' = 1 if `var' == max(A, B, C, D, E)
    }

    foreach var of varlist wanted_A-wanted_E {
        replace `var' = 0 if `var' == .
    }

    list wanted_A-wanted_E
    
         +------------------------------------------------------+
         | wanted_A   wanted_B   wanted_C   wanted_D   wanted_E |
         |------------------------------------------------------|
      1. |        0          1          0          0          0 |
      2. |        0          0          0          1          0 |
      3. |        0          0          1          0          0 |
      4. |        0          0          1          0          0 |
      5. |        0          1          0          0          0 |
         |------------------------------------------------------|
      6. |        0          0          1          0          0 |
      7. |        0          0          1          0          0 |
      8. |        0          1          0          0          0 |
      9. |        0          0          0          0          1 |
     10. |        0          0          0          0          1 |
         |------------------------------------------------------|
     11. |        0          0          0          1          0 |
     12. |        0          0          0          0          1 |
     13. |        0          0          0          0          1 |
     14. |        0          0          0          0          1 |
     15. |        1          0          0          0          0 |
         |------------------------------------------------------|
     16. |        0          0          0          0          1 |
     17. |        0          0          1          0          0 |
     18. |        1          0          0          0          0 |
     19. |        0          1          0          0          0 |
     20. |        0          1          0          0          0 |
         +------------------------------------------------------+
    
    *Caveat emptor: the code ignores missing values in variables A-E.*
    Kind regards,
    Carlo
    (Stata 19.0)
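
    The three loops above could also be collapsed into one, because a logical expression in Stata evaluates to 1 or 0 directly. The following is only a sketch meant to be run in place of those loops, not after them (the wanted_* names would otherwise already exist). The helper variable rowmax is an illustrative name that is not in the original post, and the three data rows are copied from the question so that the block runs on its own. The `if` qualifier is an added assumption about how to treat a year in which all five values are missing: the indicators are then left missing rather than all set to 1.

```stata
clear
input int year float(A B C D E)
1950 .3488717 .9319346 .9011049  .6964867 .3295547
1951 .2668857 .4548882 .26436493 .9119344 .4144089
1952 .1366463 .0674011 .8856509  .6795634 .036084738
end

* row-wise maximum; max() skips missing arguments unless all are missing
generate double rowmax = max(A, B, C, D, E)

foreach var of varlist A-E {
    * 1 if this variable attains the row maximum, 0 otherwise;
    * left missing when the whole row is missing (rowmax is missing)
    generate byte wanted_`var' = (`var' == rowmax) if !missing(rowmax)
}

list year wanted_A-wanted_E
```

    Only generate, foreach and list are used, so nothing here depends on user-written commands.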


    • #3
      Here's another way to do it:

      Code:
      clear 
      input int year float(A B C D E)
      1950   .3488717  .9319346  .9011049   .6964867   .3295547
      1951   .2668857  .4548882 .26436493   .9119344   .4144089
      1952   .1366463  .0674011  .8856509   .6795634 .036084738
      1953 .028556867  .3379889   .882112   .3549416  .08438109
      1954   .8689333  .9748848   .748933     .73897 .009876247
      1955   .3508549  .7264384  .9196262  .18740167   .3200437
      1956  .07110509 .04541512  .6934533   .3146128 .005196966
      1957  .32336795  .7459667  .2154026   .1375693  .22754347
      1958   .5551032  .4961259  .8285888   .6537739    .851468
      1959    .875991  .7167162 .04421536  .27013195   .9820066
      1960  .20470947   .859742  .8630378   .8998394 .032479186
      1961   .8927587 .13407555  .3526046   .5734232   .9874847
      1962   .5844658 .48844185  .7720399  .11147037    .894106
      1963   .3697791  .8712187  .5861199   .4145227   .9684734
      1964   .8506309  .7664683  .3227766 .003052204  .23922028
      1965   .3913819 .25125554 .17293066   .6659978   .6927336
      1966  .11966132 .16636477  .8053644   .3462876   .4884359
      1967   .7542434  .7437958  .3060019   .0780235   .4376452
      1968   .6950234  .9805113 .21909967  .12758136   .5858005
      1969   .6866152  .7295772   .724731   .2297006   .3787092
      end 
      
      // starting guess: A is the maximum 
      gen which = "A"
      gen max = A 
      
      * update maximum if needed 
      quietly foreach v in B C D E { 
          replace max = max(max, `v')
          replace which = "`v'" if max == `v'
      } 
      
      tab which, gen(is)
      rename (is?) (isA isB isC isD isE)
      l which is* 
      
           +-------------------------------------+
           | which   isA   isB   isC   isD   isE |
           |-------------------------------------|
        1. |     B     0     1     0     0     0 |
        2. |     D     0     0     0     1     0 |
        3. |     C     0     0     1     0     0 |
        4. |     C     0     0     1     0     0 |
        5. |     B     0     1     0     0     0 |
           |-------------------------------------|
        6. |     C     0     0     1     0     0 |
        7. |     C     0     0     1     0     0 |
        8. |     B     0     1     0     0     0 |
        9. |     E     0     0     0     0     1 |
       10. |     E     0     0     0     0     1 |
           |-------------------------------------|
       11. |     D     0     0     0     1     0 |
       12. |     E     0     0     0     0     1 |
       13. |     E     0     0     0     0     1 |
       14. |     E     0     0     0     0     1 |
       15. |     A     1     0     0     0     0 |
           |-------------------------------------|
       16. |     E     0     0     0     0     1 |
       17. |     C     0     0     1     0     0 |
       18. |     A     1     0     0     0     0 |
       19. |     B     0     1     0     0     0 |
       20. |     B     0     1     0     0     0 |
           +-------------------------------------+
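
      The two approaches in this thread differ when a year has tied maxima: the max() comparison in #2 flags every tied variable, whereas the loop above keeps only the last tied variable in the order B C D E. A sketch for checking whether ties occur before relying on either result follows. The names rowmax and nties are illustrative, not from the posts, and the three rows are copied from the example data so that the block runs on its own.

```stata
clear
input int year float(A B C D E)
1950 .3488717 .9319346 .9011049  .6964867 .3295547
1951 .2668857 .4548882 .26436493 .9119344 .4144089
1952 .1366463 .0674011 .8856509  .6795634 .036084738
end

* row-wise maximum of the five variables
generate double rowmax = max(A, B, C, D, E)

* count how many variables attain that maximum in each year
generate byte nties = 0
foreach v of varlist A-E {
    replace nties = nties + (`v' == rowmax)
}

* years listed here have more than one variable at the maximum
* and need an explicit tie-breaking rule
list year A-E if nties > 1
```

      In the example data every year has a single maximum, so nothing is listed and both approaches give the same indicators.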
