
  • Replacing observations from different variables at one time

    Hello everyone,

    I have data that looks like this. (This is not the actual data; I created it as an example to make the problem easier to understand.)

    . list name x1 x2 x3 x4 x5

         +---------------------------------+
         |   name   x1   x2   x3   x4   x5 |
         |---------------------------------|
      1. | Johnny   .8   .1   .6   .1   .2 |
      2. | Johnny   .7   .3    1   .5   .4 |
      3. | Johnny   .3    1   .2   .8   .5 |
      4. | Johnny   .7   .4   .9    1   .9 |
      5. | Johnny   .5   .8   .6   .3   .2 |
         +---------------------------------+



    There is a name variable and variables x1-x5. I want to replace values according to this rule:
    "Once an x takes the value 1, all of the later x's in the same observation must be set to zero."

    Simply put, I want my data to look like this:
    . list name x1 x2 x3 x4 x5

         +---------------------------------+
         |   name   x1   x2   x3   x4   x5 |
         |---------------------------------|
      1. | Johnny   .8   .1   .6   .1   .2 |
      2. | Johnny   .7   .3    1    0    0 |
      3. | Johnny   .3    1    0    0    0 |
      4. | Johnny   .7   .4   .9    1    0 |
      5. | Johnny   .5   .8   .6   .3   .2 |
         +---------------------------------+



    How can I get Stata to do that? My dataset is fairly large, so replacing the values one by one would be very time-consuming.


    Thank you in advance.

  • #2
    Please do read and follow the advice at https://www.statalist.org/forums/help#stata (section 12.2) and use dataex to give examples. Yours is clear enough, but it takes editing work to get it into a shape in which solutions can be shown. The first block of code below comes from dataex.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte id str6 name double(x1 x2 x3 x4 x5)
    1 "Johnny" .8 .1 .6 .1 .2
    2 "Johnny" .7 .3  1 .5 .4
    3 "Johnny" .3  1 .2 .8 .5
    4 "Johnny" .7 .4 .9  1 .9
    5 "Johnny" .5 .8 .6 .3 .2
    end
    
    gen is1 = 0 
    
    forval j = 1/5 { 
        replace x`j' = 0 if is1 
        replace is1 = 1 if x`j' == 1 
    }
    
    list 
    
    
         +--------------------------------------------+
         | id     name   x1   x2   x3   x4   x5   is1 |
         |--------------------------------------------|
      1. |  1   Johnny   .8   .1   .6   .1   .2     0 |
      2. |  2   Johnny   .7   .3    1    0    0     1 |
      3. |  3   Johnny   .3    1    0    0    0     1 |
      4. |  4   Johnny   .7   .4   .9    1    0     1 |
      5. |  5   Johnny   .5   .8   .6   .3   .2     0 |
         +--------------------------------------------+
    That said, it seems that you are using wide layout when a long layout is likely to make your Stata life easier. Are many later problems going to require a loop over variables?


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte id str6 name double(x1 x2 x3 x4 x5)
    1 "Johnny" .8 .1 .6 .1 .2
    2 "Johnny" .7 .3  1 .5 .4
    3 "Johnny" .3  1 .2 .8 .5
    4 "Johnny" .7 .4 .9  1 .9
    5 "Johnny" .5 .8 .6 .3 .2
    end
    
    reshape long x, i(id) j(j)
    
    gen is1 = x == 1 
    
    bysort name id (j) : replace is1 = 1 if is1[_n-1] == 1 
    
    by name id: replace x = 0 if is1[_n-1] == 1 
    
    list, sepby(name id)
    
         +----------------------------+
         | id   j     name    x   is1 |
         |----------------------------|
      1. |  1   1   Johnny   .8     0 |
      2. |  1   2   Johnny   .1     0 |
      3. |  1   3   Johnny   .6     0 |
      4. |  1   4   Johnny   .1     0 |
      5. |  1   5   Johnny   .2     0 |
         |----------------------------|
      6. |  2   1   Johnny   .7     0 |
      7. |  2   2   Johnny   .3     0 |
      8. |  2   3   Johnny    1     1 |
      9. |  2   4   Johnny    0     1 |
     10. |  2   5   Johnny    0     1 |
         |----------------------------|
     11. |  3   1   Johnny   .3     0 |
     12. |  3   2   Johnny    1     1 |
     13. |  3   3   Johnny    0     1 |
     14. |  3   4   Johnny    0     1 |
     15. |  3   5   Johnny    0     1 |
         |----------------------------|
     16. |  4   1   Johnny   .7     0 |
     17. |  4   2   Johnny   .4     0 |
     18. |  4   3   Johnny   .9     0 |
     19. |  4   4   Johnny    1     1 |
     20. |  4   5   Johnny    0     1 |
         |----------------------------|
     21. |  5   1   Johnny   .5     0 |
     22. |  5   2   Johnny   .8     0 |
     23. |  5   3   Johnny   .6     0 |
     24. |  5   4   Johnny   .3     0 |
     25. |  5   5   Johnny   .2     0 |
         +----------------------------+
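
As a possible variation on the long-layout code above, the flag can be built from a running sum instead of being propagated with `[_n-1]`. This is only a sketch under the same assumptions as the example (the value to detect is exactly 1 and `id` identifies the original rows); the variable name `seen` is made up for illustration.

```stata
* Example data from #2
clear
input byte id str6 name double(x1 x2 x3 x4 x5)
1 "Johnny" .8 .1 .6 .1 .2
2 "Johnny" .7 .3  1 .5 .4
3 "Johnny" .3  1 .2 .8 .5
4 "Johnny" .7 .4 .9  1 .9
5 "Johnny" .5 .8 .6 .3 .2
end

reshape long x, i(id) j(j)

* running count of 1s within each id, in order of j (seen is a hypothetical name)
bysort id (j): gen seen = sum(x == 1)

* seen - (x == 1) counts the 1s strictly before the current observation,
* so the first 1 itself is left alone and everything after it is zeroed
replace x = 0 if seen - (x == 1) > 0

list, sepby(id)
```

Because `sum()` treats missing values as zero and the comparison is made within each observation, no special handling of the first observation in each group is needed here.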



    • #3
      Dear Nick, Thanks a lot for your help.
      Unfortunately, I can't reshape the data into long layout because the actual data are pretty complex. Is there any way to do it without having to reshape first?
      Last edited by Rayinda Putri; 04 May 2022, 02:03.



      • #4
        "Is there any way to do it without having to reshape it first?"
        Yes; that is precisely what the first part of #2 gives you.



        • #5
          [EDITED]
          Dear Nick,
          You were right. However, I think I oversimplified my data. Here I have another piece of data (not the actual data but I hope this time it is not oversimplified),
          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input str6 name_1 double x1_1 float decisionx1_1 double x2_1 float decisionx2_1 str4 name_2 double x1_2 float decisionx1_2 double x2_2 float decisionx2_2 str4 name_3 double x1_3 float decisionx1_3 double x2_3 float decisionx2_3
          "Johnny" .8 0 .1 0 "Luke" .2 1 .3 0 "Phil" .2 0 .2 0
          "Johnny" .7 0 .3 0 "Luke" .2 0 .8 0 "Phil" .7 0 .2 1
          "Johnny" .3 0 .5 1 "Luke" .3 1 .1 0 "Phil" .4 0 .2 0
          "Johnny" .7 0 .4 0 "Luke" .4 0 .1 0 "Phil" .4 0 .5 0
          "Johnny" .5 0 .8 0 "Luke" .3 0 .8 1 "Phil" .3 0 .5 0
          end
          The rule is: once an x for one sample has a "yes" decision, the same x for the later samples must be 0.
          The goal is to make the data look like this:
          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input str6 name_1 double x1_1 float decisionx1_1 double x2_1 float decisionx2_1 str4 name_2 double x1_2 float decisionx1_2 double x2_2 float decisionx2_2 str4 name_3 double x1_3 float decisionx1_3 double x2_3 float decisionx2_3
          "Johnny" .8 0 .1 0 "Luke" .2 1 .3 0 "Phil"  0 0 .2 0
          "Johnny" .7 0 .3 0 "Luke" .2 0 .8 0 "Phil" .7 0 .2 1
          "Johnny" .3 0 .5 1 "Luke" .3 1  0 0 "Phil" .4 0  0 0
          "Johnny" .7 0 .4 0 "Luke" .4 0 .1 0 "Phil" .4 0 .5 0
          "Johnny" .5 0 .8 0 "Luke" .3 0 .8 1 "Phil" .3 0 .5 0
          end
          In the third row, the decision is "yes" for x2_1 (decisionx2_1 = 1), so x2_2 (Luke) and x2_3 (Phil) must be replaced by 0.
          In the first row, the decision is "yes" for x1_2 (decisionx1_2 = 1), so x1_3 (Phil) must be replaced by 0, and so on.

          Can you please let me know what Stata commands I can use to run these? Again, thank you very much for helping me.
          Last edited by Rayinda Putri; 04 May 2022, 03:45.



          • #6
            That sounds like a loop too, or perhaps a nested loop. I don't think I understand your rules well enough to get further involved.
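
A nested loop of the kind mentioned here might look like the sketch below. It assumes a literal reading of the rule in #5 (for each x, once a decision variable is 1 for some sample, that x is set to 0 for every later sample in the same row) and reuses the flag idea from #2. The data are the numeric variables from #5 with the name variables left out; `decided` is a made-up variable name.

```stata
* Numeric variables from the example in #5 (name variables omitted)
clear
input double x1_1 float decisionx1_1 double x2_1 float decisionx2_1 double x1_2 float decisionx1_2 double x2_2 float decisionx2_2 double x1_3 float decisionx1_3 double x2_3 float decisionx2_3
.8 0 .1 0 .2 1 .3 0 .2 0 .2 0
.7 0 .3 0 .2 0 .8 0 .7 0 .2 1
.3 0 .5 1 .3 1 .1 0 .4 0 .2 0
.7 0 .4 0 .4 0 .1 0 .4 0 .5 0
.5 0 .8 0 .3 0 .8 1 .3 0 .5 0
end

* outer loop over x1, x2; inner loop over samples 1, 2, 3
forval k = 1/2 {
    * decided (hypothetical name) flags rows where an earlier sample said "yes"
    gen byte decided = 0
    forval s = 1/3 {
        replace x`k'_`s' = 0 if decided
        replace decided = 1 if decisionx`k'_`s' == 1
    }
    drop decided
}

list x1_* x2_*
```

Applied literally, this also sets x1_3 to 0 in the third row and x2_3 to 0 in the fifth row, which the desired listing in #5 leaves unchanged, so the intended rule may be narrower than the one sketched here.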



            • #7
              Dear Nick,
              I am going to give you an example of observations that I want to replace in my actual data. I've been trying to create commands to execute these:

              if a7x2round_1 == 1 then
              p2_2 = 1 if index_2 == 7
              p2_3 = 1 if index_3 == 7
              ...
              p2_60 = 1 if index_60 == 7


              if a8x2round_2 == 1 then
              p2_3 = 1 if index_3 == 8
              p2_4 = 1 if index_4 == 8
              ...
              p2_60 = 1 if index_60 == 8


              if a9x2round_5 == 1 then
              p2_6 = 1 if index_6 == 9
              p2_7 = 1 if index_7 == 9
              ...
              p2_60 = 1 if index_60 == 9

              Notice that there is a sequence pattern. For instance, when a9x2round_5 == 1 it needs some replacements for other variables, that is, p2_6, p2_7, p2_8, ..., p2_60 must be equal to 1. (5 --> 6, 7, 8, ..., 60)

              I have tried to run the following commands but they didn't work (for a9x2round_`i' only):
              Code:
              local i = 1
              if a9x2round_`i' == 1 {
                  local j = 2
                  if `j' > `i' {
                      replace p2_`j' = 1 if index_`j' == 9
                      local j = `j'+1
                  }
                  local i = `i'+1
              }
              Thank you very much in advance. I will really appreciate your help.
              Last edited by Rayinda Putri; 04 May 2022, 12:53.
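
One likely reason the attempt above does nothing useful is that it contains no loop, and that the `if` programming command evaluates its expression once (for a variable, using the value in the first observation) rather than observation by observation. A sketch of the a9 case with `forvalues` and the `if` qualifier follows. It assumes that a9x2round_`i', index_`j' and p2_`j' exist for every round used; the toy data with 3 rounds and the local `last` are made up for illustration (the real data would use 60).

```stata
* Hypothetical toy data with 3 rounds instead of 60
clear
input byte(a9x2round_1 a9x2round_2 index_2 index_3 p2_2 p2_3)
1 0 9 9 0 0
0 1 9 9 0 0
0 0 9 9 0 0
1 0 7 9 0 0
end

local last = 3                 // 60 in the real data (assumption)
local lastm1 = `last' - 1

* for each round i, look at every later round j = i+1, ..., last
forval i = 1/`lastm1' {
    local first = `i' + 1
    forval j = `first'/`last' {
        * if qualifier: the condition is checked separately in every observation
        replace p2_`j' = 1 if a9x2round_`i' == 1 & index_`j' == 9
    }
}

list
```

The a7 and a8 cases follow the same pattern with 7 or 8 in place of 9, so the two loops could be wrapped in a further loop over those values if the corresponding variables exist.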

