Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Create an aggregate variable for which its observation is all the variable names that had a "yes"

    I have data formatted as in the dataex below.
    What I want is to create one variable that will combine the A B C D E if they are "Yes", but I want the new variable to give the names of the combined variables.
    for example, the new variable for id 12 would have "A, C", id 14 would have "A, B, C", and id 30 would have "B, C, D"
    Is there a quick way of doing this, or do I have to write a lot of 'replace' code for the combinations?

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float id str3(A B C D E)
      1 "Yes" ""    ""    ""    ""   
      2 ""    ""    "Yes" ""    ""   
      3 ""    ""    "Yes" ""    ""   
      4 ""    ""    "Yes" ""    ""   
      5 ""    ""    "Yes" ""    ""   
      6 "Yes" ""    "Yes" ""    ""   
      7 ""    ""    "Yes" ""    ""   
      8 ""    ""    "Yes" ""    ""   
      9 ""    ""    "Yes" ""    ""   
     10 "Yes" ""    "Yes" ""    ""   
     11 ""    ""    "Yes" ""    ""   
     12 "Yes" ""    "Yes" ""    ""   
     13 "Yes" "Yes" "Yes" "Yes" ""   
     14 "Yes" "Yes" "Yes" ""    ""   
     15 "Yes" ""    "Yes" ""    ""   
     16 ""    ""    "Yes" ""    ""   
     17 ""    "Yes" "Yes" ""    ""   
     18 ""    ""    "Yes" ""    ""   
     19 ""    ""    "Yes" ""    ""   
     20 "Yes" "Yes" "Yes" "Yes" ""   
     21 "Yes" ""    "Yes" ""    ""   
     22 ""    ""    "Yes" ""    ""   
     23 ""    ""    "Yes" ""    ""   
     24 "Yes" "Yes" "Yes" "Yes" ""   
     25 ""    ""    "Yes" ""    ""   
     26 ""    ""    "Yes" ""    ""   
     27 ""    "Yes" "Yes" ""    ""   
     28 ""    "Yes" "Yes" ""    ""   
     29 ""    ""    "Yes" ""    ""   
     30 ""    "Yes" "Yes" "Yes" ""   
     31 "Yes" "Yes" "Yes" ""    ""   
     32 ""    ""    "Yes" ""    ""   
     33 ""    ""    "Yes" ""    ""   
     34 "Yes" "Yes" "Yes" ""    ""   
     35 ""    ""    ""    ""    ""   
     36 ""    ""    "Yes" ""    ""   
     37 "Yes" "Yes" "Yes" "Yes" ""   
     38 ""    ""    "Yes" ""    ""   
     39 "Yes" "Yes" "Yes" ""    ""   
     40 "Yes" "Yes" "Yes" ""    ""   
     41 "Yes" "Yes" "Yes" ""    ""   
     42 "Yes" ""    "Yes" ""    ""   
     43 "Yes" "Yes" "Yes" ""    ""   
     44 "Yes" "Yes" "Yes" ""    ""   
     45 "Yes" "Yes" "Yes" ""    ""   
     46 "Yes" "Yes" "Yes" "Yes" ""   
     47 "Yes" "Yes" "Yes" ""    ""   
     48 "Yes" "Yes" "Yes" ""    ""   
     49 "Yes" "Yes" "Yes" "Yes" ""   
     50 "Yes" "Yes" "Yes" ""    ""   
     51 "Yes" "Yes" "Yes" "Yes" ""   
     52 "Yes" "Yes" "Yes" ""    ""   
     53 "Yes" "Yes" "Yes" ""    ""   
     54 "Yes" "Yes" "Yes" ""    ""   
     55 ""    ""    "Yes" ""    ""   
     56 ""    ""    ""    ""    ""   
     57 "Yes" ""    "Yes" ""    ""   
     58 "Yes" "Yes" "Yes" ""    ""   
     59 "Yes" "Yes" "Yes" ""    ""   
     60 "Yes" "Yes" "Yes" ""    ""   
     61 "Yes" "Yes" "Yes" ""    ""   
     62 "Yes" "Yes" "Yes" ""    ""   
     63 "Yes" "Yes" "Yes" ""    ""   
     64 "Yes" ""    "Yes" ""    ""   
     65 "Yes" ""    "Yes" ""    ""   
     66 "Yes" "Yes" "Yes" "Yes" ""   
     67 "Yes" ""    "Yes" ""    ""   
     68 "Yes" "Yes" "Yes" ""    ""   
     69 "Yes" "Yes" "Yes" ""    ""   
     70 "Yes" "Yes" "Yes" ""    ""   
     71 "Yes" "Yes" "Yes" ""    ""   
     72 "Yes" "Yes" "Yes" ""    ""   
     73 "Yes" "Yes" "Yes" ""    ""   
     74 "Yes" "Yes" "Yes" ""    ""   
     75 "Yes" "Yes" "Yes" "Yes" "Yes"
     76 "Yes" "Yes" "Yes" ""    ""   
     77 "Yes" ""    "Yes" ""    ""   
     78 ""    ""    "Yes" ""    ""   
     79 "Yes" ""    "Yes" ""    ""   
     80 "Yes" "Yes" "Yes" ""    ""   
     81 "Yes" "Yes" "Yes" "Yes" ""   
     82 "Yes" "Yes" "Yes" ""    ""   
     83 "Yes" "Yes" "Yes" ""    ""   
     84 ""    "Yes" "Yes" ""    ""   
     85 ""    ""    ""    ""    ""   
     86 "Yes" "Yes" "Yes" ""    ""   
     87 "Yes" "Yes" "Yes" ""    ""   
     88 "Yes" ""    "Yes" ""    ""   
     89 "Yes" "Yes" "Yes" "Yes" ""   
     90 "Yes" "Yes" "Yes" "Yes" ""   
     91 "Yes" "Yes" "Yes" ""    ""   
     92 ""    ""    ""    ""    ""   
     93 ""    ""    ""    ""    ""   
     94 "Yes" ""    "Yes" ""    ""   
     95 "Yes" "Yes" "Yes" ""    ""   
     96 "Yes" ""    "Yes" ""    ""   
     97 "Yes" "Yes" "Yes" ""    ""   
     98 ""    ""    "Yes" ""    ""   
     99 "Yes" "Yes" "Yes" ""    ""   
    100 ""    ""    "Yes" ""    ""   
    end

  • #2
    Maybe try
    Code:
    quietly generate str new_variable = ""
    foreach var of varlist A-E {
        quietly replace new_variable = new_variable + "`var'" if strtrim(strlower(`var')) == "yes"
    }
    Strikes me as a dead-end task, though.

    Comment


    • #3
      Didn't notice that you wanted commas & spaces.
      Code:
      quietly generate str new_variable = ""
      foreach var of varlist A-E {
          quietly replace new_variable = new_variable + "`var' " if strtrim(strlower(`var')) == "yes"
      }
      quietly replace new_variable = subinstr(strtrim(new_variable), " ", ", ", .)

      Comment


      • #4
        Here's another way to do it.


        Code:
        gen wanted = "" 
        
        foreach v in A B C D E {
            replace wanted = cond(wanted  == "", "`v'", wanted + ", `v'") if `v' == "Yes"
        } 
        
        tab wanted 
        
               wanted |      Freq.     Percent        Cum.
        --------------+-----------------------------------
                    A |          1        1.05        1.05
              A, B, C |         38       40.00       41.05
           A, B, C, D |         11       11.58       52.63
        A, B, C, D, E |          1        1.05       53.68
                 A, C |         15       15.79       69.47
                 B, C |          4        4.21       73.68
              B, C, D |          1        1.05       74.74
                    C |         24       25.26      100.00
        --------------+-----------------------------------
                Total |         95      100.00
        as the logic can be reduced to

        1. if this variable is the first noticed, write its name

        2. otherwise if any variable has previously been noticed, write comma plus space plus its name.

        Comment


        • #5
          Thank you both, these worked perfectly

          Comment

          Working...
          X