Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Help requested: Problem with creating a new variable based on existent ones

    Hello all,
    This is the first time I post a question on this very helpful forum, so thanks in advance!

    I have a set of variables with procedures (from 1-10 mentioned as PR1, PR2, PR3...etc) and other variables with the day/timing of those procedures (from 1-10 mentioned as PR1-time, PR2-time, PR3-time...etc.). Each number in the procedure fields corresponds to the number in the day/time variable field.
    This stands true for each observation (each observation has x10 procedure variables and x10 procedure timing variables).
    If I want to create a new variable that identifies those who had procedure "X" (that can be in the variable field of PR3 in observation #1 while at PR7 in observation #8) that was done at day "Y" (that corresponds specifically with that observation's "X" procedure field, ie. PR3-time for observation #1 but PR7-time for observation #8...etc), what would be the code.

    Of note, the procedure variable is a string variable, procedure timing is a continuous variable

  • #2
    Can you give us an example of what your data looks like? See help dataex and the FAQ (black bar near the top of this page).

    The first thing people see of your post is its title. That determines if someone going to open that message. For example, I look for topics that I am good at, because those are the topics that I am most likely to do any good. You started your message with "help requested". That does not tell the reader much: if you asked a question on a forum like this then of course you request help. The fact that you added to your title, does not add any new information, and more importantly reduced the space for helpful information. So better leave that out next time.
    Last edited by Maarten Buis; 12 Dec 2022, 04:42.
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------

    Comment


    • #3
      I believe this is the data example. It is a confidential data so this is a fake dummy.
      The procedure name I am looking at is "X1" string variable

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input str3(Procedure1 Procedure2 Procedure3 Procedure4 Procedure5 Procedure6) str2 Procedure7 str3 Procedure8 byte(Procedure1Time Procedure2Time Procedure3Time Procedure4Time Procedure5Time Procedure6Time Procedure7Time Procedure8Time)
      "X1"  "X23" "B2"  "T8"  ""    ""    ""   ""    2 2 2 2 . . . .
      "T8"  "X1"  "D9"  "X23" "B2"  "D9"  "P0" ""    0 0 0 0 0 0 0 .
      "P0"  "S1"  "X1"  "T8"  "J56" "F76" "D9" "C1"  0 0 0 0 0 0 0 2
      "P0"  "D9"  ""    ""    ""    ""    ""   ""    0 2 . . . . . .
      "X23" "X1"  "P0"  "B2"  "T8"  "S1"  "D9" "J56" 1 0 2 1 1 1 1 1
      "D9"  "T8"  ""    ""    ""    ""    ""   ""    0 0 . . . . . .
      "T8"  "B2"  ""    ""    ""    ""    ""   ""    0 0 . . . . . .
      "X1"  "T8"  "X23" "D9"  "X1"  "P0"  "B2" "S1"  0 0 0 1 2 2 3 0
      "D9"  "X1"  "T8"  ""    ""    ""    ""   ""    0 0 0 . . . . .
      "T8"  "B2"  "D9"  "X23" "P0"  "X23" "X1" "S1"  0 0 0 0 0 0 0 0
      end

      Comment


      • #4
        Code:
        . clear
        
        . input str3(Procedure1 Procedure2 Procedure3 Procedure4 Procedure5 Procedure6) str2 Procedure7 str3 Procedure8 byte(Proced
        > ure1Time Procedure2Time Procedure3Time Procedure4Time Procedure5Time Procedure6Time Procedure7Time Procedure8Time)
        
             Procedu~1  Procedu~2  Procedu~3  Procedu~4  Procedu~5  Procedu~6  Procedu~7  Procedu~8  Pr~1Time  Pr~2Time  Pr~3Time  
        > Pr~4Time  Pr~5Time  Pr~6Time  Pr~7Time  Pr~8Time
          1. "X1"  "X23" "B2"  "T8"  ""    ""    ""   ""    2 2 2 2 . . . .
          2. "T8"  "X1"  "D9"  "X23" "B2"  "D9"  "P0" ""    0 0 0 0 0 0 0 .
          3. "P0"  "S1"  "X1"  "T8"  "J56" "F76" "D9" "C1"  0 0 0 0 0 0 0 2
          4. "P0"  "D9"  ""    ""    ""    ""    ""   ""    0 2 . . . . . .
          5. "X23" "X1"  "P0"  "B2"  "T8"  "S1"  "D9" "J56" 1 0 2 1 1 1 1 1
          6. "D9"  "T8"  ""    ""    ""    ""    ""   ""    0 0 . . . . . .
          7. "T8"  "B2"  ""    ""    ""    ""    ""   ""    0 0 . . . . . .
          8. "X1"  "T8"  "X23" "D9"  "X1"  "P0"  "B2" "S1"  0 0 0 1 2 2 3 0
          9. "D9"  "X1"  "T8"  ""    ""    ""    ""   ""    0 0 0 . . . . .
         10. "T8"  "B2"  "D9"  "X23" "P0"  "X23" "X1" "S1"  0 0 0 0 0 0 0 0
         11. end   
        
        .
        .
        . // everything is easier in long format
        . gen id = _n
        
        . reshape long Procedure Procedure@Time, i(id) j(j)
        (j = 1 2 3 4 5 6 7 8)
        
        Data                               Wide   ->   Long
        -----------------------------------------------------------------------------
        Number of observations               10   ->   80          
        Number of variables                  17   ->   4           
        j variable (8 values)                     ->   j
        xij variables:
           Procedure1 Procedure2 ... Procedure8   ->   Procedure
        Procedure1Time Procedure2Time ... Procedure8Time->ProcedureTime
        -----------------------------------------------------------------------------
        
        .
        . // create an indicator variable variable for having procedur X1 at time 2
        . gen x1at2 = (Procedure == "X1" & ProcedureTime == 2) if !missing(Procedure, ProcedureTime)
        (28 missing values generated)
        
        . list, sepby(id)
        
             +--------------------------------------+
             | id   j   Proce~re   Proce~me   x1at2 |
             |--------------------------------------|
          1. |  1   1         X1          2       1 |
          2. |  1   2        X23          2       0 |
          3. |  1   3         B2          2       0 |
          4. |  1   4         T8          2       0 |
          5. |  1   5                     .       . |
          6. |  1   6                     .       . |
          7. |  1   7                     .       . |
          8. |  1   8                     .       . |
             |--------------------------------------|
          9. |  2   1         T8          0       0 |
         10. |  2   2         X1          0       0 |
         11. |  2   3         D9          0       0 |
         12. |  2   4        X23          0       0 |
         13. |  2   5         B2          0       0 |
         14. |  2   6         D9          0       0 |
         15. |  2   7         P0          0       0 |
         16. |  2   8                     .       . |
             |--------------------------------------|
         17. |  3   1         P0          0       0 |
         18. |  3   2         S1          0       0 |
         19. |  3   3         X1          0       0 |
         20. |  3   4         T8          0       0 |
         21. |  3   5        J56          0       0 |
         22. |  3   6        F76          0       0 |
         23. |  3   7         D9          0       0 |
         24. |  3   8         C1          2       0 |
             |--------------------------------------|
         25. |  4   1         P0          0       0 |
         26. |  4   2         D9          2       0 |
         27. |  4   3                     .       . |
         28. |  4   4                     .       . |
         29. |  4   5                     .       . |
         30. |  4   6                     .       . |
         31. |  4   7                     .       . |
         32. |  4   8                     .       . |
             |--------------------------------------|
         33. |  5   1        X23          1       0 |
         34. |  5   2         X1          0       0 |
         35. |  5   3         P0          2       0 |
         36. |  5   4         B2          1       0 |
         37. |  5   5         T8          1       0 |
         38. |  5   6         S1          1       0 |
         39. |  5   7         D9          1       0 |
         40. |  5   8        J56          1       0 |
             |--------------------------------------|
         41. |  6   1         D9          0       0 |
         42. |  6   2         T8          0       0 |
         43. |  6   3                     .       . |
         44. |  6   4                     .       . |
         45. |  6   5                     .       . |
         46. |  6   6                     .       . |
         47. |  6   7                     .       . |
         48. |  6   8                     .       . |
             |--------------------------------------|
         49. |  7   1         T8          0       0 |
         50. |  7   2         B2          0       0 |
         51. |  7   3                     .       . |
         52. |  7   4                     .       . |
         53. |  7   5                     .       . |
         54. |  7   6                     .       . |
         55. |  7   7                     .       . |
         56. |  7   8                     .       . |
             |--------------------------------------|
         57. |  8   1         X1          0       0 |
         58. |  8   2         T8          0       0 |
         59. |  8   3        X23          0       0 |
         60. |  8   4         D9          1       0 |
         61. |  8   5         X1          2       1 |
         62. |  8   6         P0          2       0 |
         63. |  8   7         B2          3       0 |
         64. |  8   8         S1          0       0 |
             |--------------------------------------|
         65. |  9   1         D9          0       0 |
         66. |  9   2         X1          0       0 |
         67. |  9   3         T8          0       0 |
         68. |  9   4                     .       . |
         69. |  9   5                     .       . |
         70. |  9   6                     .       . |
         71. |  9   7                     .       . |
         72. |  9   8                     .       . |
             |--------------------------------------|
         73. | 10   1         T8          0       0 |
         74. | 10   2         B2          0       0 |
         75. | 10   3         D9          0       0 |
         76. | 10   4        X23          0       0 |
         77. | 10   5         P0          0       0 |
         78. | 10   6        X23          0       0 |
         79. | 10   7         X1          0       0 |
         80. | 10   8         S1          0       0 |
             +--------------------------------------+
        
        .
        . // create an indicator variable variable for having procedur X1 at any time
        . gen x1 = (Procedure == "X1") if !missing(Procedure, ProcedureTime)
        (28 missing values generated)
        
        . list, sepby(id)
        
             +-------------------------------------------+
             | id   j   Proce~re   Proce~me   x1at2   x1 |
             |-------------------------------------------|
          1. |  1   1         X1          2       1    1 |
          2. |  1   2        X23          2       0    0 |
          3. |  1   3         B2          2       0    0 |
          4. |  1   4         T8          2       0    0 |
          5. |  1   5                     .       .    . |
          6. |  1   6                     .       .    . |
          7. |  1   7                     .       .    . |
          8. |  1   8                     .       .    . |
             |-------------------------------------------|
          9. |  2   1         T8          0       0    0 |
         10. |  2   2         X1          0       0    1 |
         11. |  2   3         D9          0       0    0 |
         12. |  2   4        X23          0       0    0 |
         13. |  2   5         B2          0       0    0 |
         14. |  2   6         D9          0       0    0 |
         15. |  2   7         P0          0       0    0 |
         16. |  2   8                     .       .    . |
             |-------------------------------------------|
         17. |  3   1         P0          0       0    0 |
         18. |  3   2         S1          0       0    0 |
         19. |  3   3         X1          0       0    1 |
         20. |  3   4         T8          0       0    0 |
         21. |  3   5        J56          0       0    0 |
         22. |  3   6        F76          0       0    0 |
         23. |  3   7         D9          0       0    0 |
         24. |  3   8         C1          2       0    0 |
             |-------------------------------------------|
         25. |  4   1         P0          0       0    0 |
         26. |  4   2         D9          2       0    0 |
         27. |  4   3                     .       .    . |
         28. |  4   4                     .       .    . |
         29. |  4   5                     .       .    . |
         30. |  4   6                     .       .    . |
         31. |  4   7                     .       .    . |
         32. |  4   8                     .       .    . |
             |-------------------------------------------|
         33. |  5   1        X23          1       0    0 |
         34. |  5   2         X1          0       0    1 |
         35. |  5   3         P0          2       0    0 |
         36. |  5   4         B2          1       0    0 |
         37. |  5   5         T8          1       0    0 |
         38. |  5   6         S1          1       0    0 |
         39. |  5   7         D9          1       0    0 |
         40. |  5   8        J56          1       0    0 |
             |-------------------------------------------|
         41. |  6   1         D9          0       0    0 |
         42. |  6   2         T8          0       0    0 |
         43. |  6   3                     .       .    . |
         44. |  6   4                     .       .    . |
         45. |  6   5                     .       .    . |
         46. |  6   6                     .       .    . |
         47. |  6   7                     .       .    . |
         48. |  6   8                     .       .    . |
             |-------------------------------------------|
         49. |  7   1         T8          0       0    0 |
         50. |  7   2         B2          0       0    0 |
         51. |  7   3                     .       .    . |
         52. |  7   4                     .       .    . |
         53. |  7   5                     .       .    . |
         54. |  7   6                     .       .    . |
         55. |  7   7                     .       .    . |
         56. |  7   8                     .       .    . |
             |-------------------------------------------|
         57. |  8   1         X1          0       0    1 |
         58. |  8   2         T8          0       0    0 |
         59. |  8   3        X23          0       0    0 |
         60. |  8   4         D9          1       0    0 |
         61. |  8   5         X1          2       1    1 |
         62. |  8   6         P0          2       0    0 |
         63. |  8   7         B2          3       0    0 |
         64. |  8   8         S1          0       0    0 |
             |-------------------------------------------|
         65. |  9   1         D9          0       0    0 |
         66. |  9   2         X1          0       0    1 |
         67. |  9   3         T8          0       0    0 |
         68. |  9   4                     .       .    . |
         69. |  9   5                     .       .    . |
         70. |  9   6                     .       .    . |
         71. |  9   7                     .       .    . |
         72. |  9   8                     .       .    . |
             |-------------------------------------------|
         73. | 10   1         T8          0       0    0 |
         74. | 10   2         B2          0       0    0 |
         75. | 10   3         D9          0       0    0 |
         76. | 10   4        X23          0       0    0 |
         77. | 10   5         P0          0       0    0 |
         78. | 10   6        X23          0       0    0 |
         79. | 10   7         X1          0       0    1 |
         80. | 10   8         S1          0       0    0 |
             +-------------------------------------------+
        
        .
        . // If you just want one row per observation:
        . collapse (sum) x1at2 x1, by(id)
        
        . list
        
             +-----------------+
             | id   x1at2   x1 |
             |-----------------|
          1. |  1       1    1 |
          2. |  2       0    1 |
          3. |  3       0    1 |
          4. |  4       0    0 |
          5. |  5       0    1 |
             |-----------------|
          6. |  6       0    0 |
          7. |  7       0    0 |
          8. |  8       1    2 |
          9. |  9       0    1 |
         10. | 10       0    1 |
             +-----------------+
        ---------------------------------
        Maarten L. Buis
        University of Konstanz
        Department of history and sociology
        box 40
        78457 Konstanz
        Germany
        http://www.maartenbuis.nl
        ---------------------------------

        Comment


        • #5
          Thanks a lot Buis but I tried your code and it still didn't give me what I wanted.
          The list eventually, as shown above, gives a new variable called X1 and its new values which represent "how many X1 procedures were done in each observation".. Though what I wanted was: if X1 in observation 1 was at Procedure1 field, what is the time that it was done in (which is represented by Procedure1Time variable). Same goes for observation 2 which has X1 at procedure 2, what was the timing of it that was mentioned in Procedure2Time.

          Below is an example of how I want it to look like. I filled the X1Time variable manually, but of course, this will be tedious work to be done for millions of observations.

          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input str3(Procedure1 Procedure2 Procedure3 Procedure4 Procedure5 Procedure6) str2 Procedure7 str3 Procedure8 byte(Procedure1Time Procedure2Time Procedure3Time Procedure4Time Procedure5Time Procedure6Time Procedure7Time Procedure8Time) float X1Time
          "X1"  "X23" "B2"  "T8"  ""    ""    ""   ""    2 2 2 2 . . . . 2
          "T8"  "X1"  "D9"  "X23" "B2"  "D9"  "P0" ""    0 0 0 0 0 0 0 . 0
          "P0"  "S1"  "X1"  "T8"  "J56" "F76" "D9" "C1"  0 0 0 0 0 0 0 2 0
          "P0"  "D9"  ""    ""    ""    ""    ""   ""    0 2 . . . . . . .
          "X23" "X1"  "P0"  "B2"  "T8"  "S1"  "D9" "J56" 1 0 2 1 1 1 1 1 0
          "D9"  "T8"  ""    ""    ""    ""    ""   ""    0 0 . . . . . . .
          "T8"  "B2"  ""    ""    ""    ""    ""   ""    0 0 . . . . . . .
          "S1"  "T8"  "X23" "D9"  "X1"  "P0"  "B2" "S1"  0 0 0 1 2 2 3 0 2
          "D9"  "X1"  "T8"  ""    ""    ""    ""   ""    0 0 0 . . . . . 0
          "T8"  "B2"  "D9"  "X23" "P0"  "X23" "X1" "S1"  0 0 0 0 0 0 0 0 0
          end

          Comment


          • #6
            What should happen when X1 appears multiple times for a single observation?
            ---------------------------------
            Maarten L. Buis
            University of Konstanz
            Department of history and sociology
            box 40
            78457 Konstanz
            Germany
            http://www.maartenbuis.nl
            ---------------------------------

            Comment


            • #7
              I believe that will pose an issue and hence I deleted the duplicates in my new example. In my actual data, I was thinking about the deletion of duplicates and taking it from there. It could be a limitation, but it will ease doing it for a million observations in one go

              Comment


              • #8
                But which one do you delete? Random? That is a horrible idea
                ---------------------------------
                Maarten L. Buis
                University of Konstanz
                Department of history and sociology
                box 40
                78457 Konstanz
                Germany
                http://www.maartenbuis.nl
                ---------------------------------

                Comment


                • #9
                  No, I am deleting the whole observation. The idea is I am looking at a certain procedure that is done in a hospital, and usually is done once. If done twice during the same admission, that is an exclusion criterion to my study be default cuz either the first was a failed attempt or a complicated case. Either way, my deleted observations from the whole sample were less than 0.01%. Still, the real issue for me is figuring out a code that helps sort out all the timings in one go of the remaining observations

                  Comment

                  Working...
                  X