  • Creating a new variable that combines two variables in Stata 17

    I have two variables, q5 and q6. I would like to keep all values from q5 and some values from q6 and create a new variable (q5new) that has all the values I want. For example, my new variable would have 31023 observations for q5new==2. So, I kept only the values I need in q6 like this:
    HTML Code:
    egen q6_new = anymatch(q6), values(2, 5, 8, 9, 12,16,17,18,19,20,21,26,27,29,30,32,36,38,39,45,46,47,50) 
    keep if q6_new
    HTML Code:
    . tab q5, nol
    
                |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              1 |     29,899       41.05       41.05
              2 |      2,356        3.24       44.29
              3 |      1,008        1.38       45.67
              4 |      5,274        7.24       52.92
              5 |      2,089        2.87       55.78
              6 |      1,221        1.68       57.46
              7 |      1,398        1.92       59.38
              8 |      1,820        2.50       61.88
              9 |      7,781       10.68       72.56
             10 |     18,882       25.93       98.49
             11 |      1,100        1.51      100.00
    ------------+-----------------------------------
          Total |     72,828      100.00
    
    
    . tab q6, nol
    
                |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              2 |      1,124       23.78       23.78
              5 |         47        0.99       24.78
              8 |         84        1.78       26.56
              9 |         81        1.71       28.27
             12 |        111        2.35       30.62
             16 |         50        1.06       31.68
             17 |        141        2.98       34.66
             18 |         62        1.31       35.97
             19 |        171        3.62       39.59
             20 |         86        1.82       41.41
             21 |        133        2.81       44.22
             26 |        655       13.86       58.08
             27 |         29        0.61       58.70
             29 |         80        1.69       60.39
             30 |         81        1.71       62.10
             32 |        168        3.55       65.66
             36 |         70        1.48       67.14
             38 |        100        2.12       69.26
             39 |        102        2.16       71.41
             45 |         80        1.69       73.11
             46 |        931       19.70       92.81
             47 |         81        1.71       94.52
             50 |        259        5.48      100.00
    ------------+-----------------------------------
          Total |      4,726      100.00
    
    Then I tried combining the two variables using the stack command:
    HTML Code:
    stack q5 q6_new, into(q5new)
    
    . tab q5new
    
                |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              1 |      4,726       50.00       50.00
             10 |      4,726       50.00      100.00
    ------------+-----------------------------------
          Total |      9,452      100.00
    But the variable it generated only has two values rather than the desired 34. How can I fix this?


  • #2
    I'm very confused. Can you please post your sample data, as well as what you expect to find?



    • #3
      I share Jared's confusion. You will need to provide a much clearer explanation of what you have and what you want to happen, along with some example data to work with. At the moment, it's as if we readers have stumbled into the middle of a discussion you are having with your data.



      • #4
        When you did
        Code:
        keep if q6_new
        only 4,726 observations remained, and all of them had q5 == 10 and q6_new == 1.

        Then you did
        Code:
        stack q5 q6_new, into(q5new)
        when you meant to do
        Code:
        stack q5 q6, into(q5new)
        and so you got values of q6_new (all of which are 1) rather than q6.
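
        To illustrate the point, here is a small sketch with made-up data (the values are hypothetical, not taken from your dataset). As far as I know, egen's anymatch() function creates a 0/1 indicator, so q6_new never holds the original q6 codes:

        ```stata
        * Toy data (hypothetical values, not the poster's dataset)
        clear
        input q5 q6
        1  2
        10 5
        10 3
        4  .
        end

        * anymatch() is 1 where q6 equals one of the listed values, 0 otherwise
        egen q6_new = anymatch(q6), values(2 5)
        list q5 q6 q6_new

        * q6_new only ever takes the values 0 and 1, never the q6 codes
        tabulate q6_new
        ```

        After keep if q6_new, only the rows with q6_new == 1 would survive, which is why stacking q5 with q6_new can return nothing but the surviving q5 values and the constant 1.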

        The problem is that keep and drop eliminate entire observations, so you don't get all the values of q5 that you want. This untested code might start you in a useful direction. It assumes that none of the values of q5 are missing.
        Code:
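        * set q6 to missing unless it is one of the wanted codes
        replace q6 = . if !inlist(q6, 2, 5, 8, 9, 12, 16, 17, 18, 19, 20, 21, 26, 27, 29, 30, 32, 36, 38, 39, 45, 46, 47, 50)
        * stack q5 and q6 into a single variable
        stack q5 q6, into(q5new)
        * drop the rows that came from missing q6 values
        drop if q5new == .
        tab q5new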
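
        For anyone who wants to try the idea before running it on real data, here is a minimal sketch on a toy dataset. The four observations and the shortened code list (2 and 5) are assumptions for illustration only. I have also added the clear option to stack, which, as I understand it, confirms that the data in memory may be replaced:

        ```stata
        * Toy data (hypothetical values, not the poster's dataset)
        clear
        input q5 q6
        1  2
        10 5
        10 3
        4  .
        end

        * Blank out unwanted q6 codes (3 here) without dropping the observation
        replace q6 = . if !inlist(q6, 2, 5)

        * stack replaces the data in memory with q5new and _stack;
        * clear says that replacing the data is intended
        stack q5 q6, into(q5new) clear

        * Remove the rows that came from missing q6 values
        drop if missing(q5new)
        tabulate q5new
        ```

        Because q6 is set to missing rather than filtered with keep, every value of q5 survives, and only the wanted q6 codes are appended to it. The _stack variable that stack creates records whether each row came from q5 (1) or q6 (2).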
