  • Data manipulation




    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(PatientID Visit Day) str18 EQ5D_D byte(EQ5D_L F G H)
    1 2 3 "Anxiety/depression" 3 . . .
    1 2 3 "Mobility"           1 . . .
    1 2 3 "Pain/discomfort"    2 . . .
    1 2 3 "Selfcare"           2 . . .
    1 2 3 "Usual activities"   1 . . .
    2 4 5 "Anxiety/depression" 2 . . .
    2 4 5 "Mobility"           3 . . .
    2 4 5 "Pain/discomfort"    1 . . .
    2 4 5 "Selfcare"           3 . . .
    2 4 5 "Usual activities"   3 . . .
    3 6 7 "Anxiety/depression" 2 . . .
    3 6 7 "Mobility"           2 . . .
    3 6 7 "Pain/discomfort"    2 . . .
    3 6 7 "Selfcare"           1 . . .
    3 6 7 "Usual activities"   1 . . .
    . . . ""                   . . . .
    . . . ""                   . . . .
    . . . ""                   . . . .
    . . . ""                   . . . .
    . . . ""                   . . . .
    end
    Hello, I need help reshaping the data above so that each row represents one patient. I would like it to look like this:

    Patient ID Visit Day Anxiety/depression Mobility Pain/discomfort Selfcare Usual activities
    1 2 3 3 1 2 2 1
    2 4 5 2 3 1 3 3


    Any help would be appreciated. Thank you.

  • #2
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(PatientID Visit Day) str18 EQ5D_D byte EQ5D_L
    1 2 3 "Anxiety/depression" 3
    1 2 3 "Mobility"           1
    1 2 3 "Pain/discomfort"    2
    1 2 3 "Selfcare"           2
    1 2 3 "Usual activities"   1
    2 4 5 "Anxiety/depression" 2
    2 4 5 "Mobility"           3
    2 4 5 "Pain/discomfort"    1
    2 4 5 "Selfcare"           3
    2 4 5 "Usual activities"   3
    3 6 7 "Anxiety/depression" 2
    3 6 7 "Mobility"           2
    3 6 7 "Pain/discomfort"    2
    3 6 7 "Selfcare"           1
    3 6 7 "Usual activities"   1
    end
    
    replace EQ5D_D= strtoname(EQ5D_D)
    reshape wide EQ5D_L,  i(PatientID Visit Day) j(EQ5D_D) string
    rename EQ5D_L* *
    Res.:

    Code:
    . l
    
         +-------------------------------------------------------------------------------+
         | Patien~D   Visit   Day   Anxiet~n   Mobility   Pain_d~t   Selfcare   Usual_~s |
         |-------------------------------------------------------------------------------|
      1. |        1       2     3          3          1          2          2          1 |
      2. |        2       4     5          2          3          1          3          3 |
      3. |        3       6     7          2          2          2          1          1 |
         +-------------------------------------------------------------------------------+
    
    
    .
    Last edited by Andrew Musau; 17 Oct 2023, 08:55.



    • #3
      Andrew Musau Thanks for your reply. I tried the code and it doesn't work for me. I also don't have any variable called EQ5D_L*.

      Can you please help?



      • #4
        Andrew Musau's code works fine with your data example. Note that EQ5D_L* is not the name of a single variable, but a wildcard standing for a bundle of variable names starting with EQ5D_L -- which do exist after the reshape.
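
To see what the wildcard matches, here is a minimal sketch on a cut-down version of the example data (two dimensions only, chosen for brevity). After the reshape, each new variable is named EQ5D_L followed by a value of EQ5D_D, and `rename EQ5D_L* *` strips that common prefix.

```stata
clear
input byte(PatientID Visit Day) str18 EQ5D_D byte EQ5D_L
1 2 3 "Mobility" 1
1 2 3 "Selfcare" 2
2 4 5 "Mobility" 3
2 4 5 "Selfcare" 3
end

reshape wide EQ5D_L, i(PatientID Visit Day) j(EQ5D_D) string

* The wildcard now matches EQ5D_LMobility and EQ5D_LSelfcare
describe EQ5D_L*

* Strip the common prefix from every matched variable name
rename EQ5D_L* *
describe
```

Before the reshape, `describe EQ5D_L*` would match only EQ5D_L itself, which is why the rename has to come after it.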
        Last edited by Nick Cox; 17 Oct 2023, 09:53.



        • #5
          Nick Cox Thanks for the explanation. It works now; not sure what happened there. Thank you.



          • #6
            Andrew Musau The code works fine. I'm not sure what happened there; perhaps I copied it wrongly. Thanks for helping.



            • #7
              Originally posted by Grace Mongo
              Nick Cox Thanks for the explanation. It works now; not sure what happened there. Thank you.
              I don't know if this is the reason it didn't work at first, but if you used Andrew Musau's code on the example dataset you first provided, then it's because that dataset has missing data for EQ5D_D in a few records. EQ5D_D is your "j" (subobservation) variable in the reshape wide command, and you can't have missing values there. Just an FYI based on my many, many mistakes made with reshape.
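
A minimal sketch of how one might check for this before reshaping, using a cut-down version of the data in #1. It assumes the records with an empty EQ5D_D carry no information at all, as in that example; inspect them with the `list` line before dropping anything.

```stata
clear
input byte(PatientID Visit Day) str18 EQ5D_D byte EQ5D_L
1 2 3 "Mobility" 1
1 2 3 "Selfcare" 2
. . . ""         .
. . . ""         .
end

* How many observations have an empty j variable, and what do they contain?
count if missing(EQ5D_D)
list if missing(EQ5D_D)

* Assumption: these records are entirely empty, so dropping them loses nothing
drop if missing(EQ5D_D)

reshape wide EQ5D_L, i(PatientID Visit Day) j(EQ5D_D) string
rename EQ5D_L* *
list
```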



              • #8
                I had actually copied the data wrongly. Here is the right data. I don't think I can have one row per patient, as there are multiple visits per patient. Can you help, please?

                input byte(PatientID Visit Day) str18 EQ5D_D byte EQ5D_L
                1 2 3 "Anxiety_depression" 3
                1 2 3 "Mobility" 1
                1 2 3 "Pain_discomfort" 2
                1 2 3 "Selfcare" 2
                1 2 3 "Usual_activities" 1
                1 4 5 "Anxiety_depression" 2
                1 4 5 "Mobility" 3
                1 4 5 "Pain_discomfort" 1
                1 4 5 "Selfcare" 3
                1 4 5 "Usual_activities" 3
                3 6 7 "Anxiety_depression" 2
                3 6 7 "Mobility" 2
                3 6 7 "Pain_discomfort" 2
                3 6 7 "Selfcare" 1
                3 6 7 "Usual_activities" 1
                end

                I would like the data in this format

                PatientID Visit Day Anxiety_depression Mobility Pain_discomfort Selfcare Usual_activities
                1 2 3 3 1 2 2 1
                1 4 5 2 3 1 3 3
                3 6 7 2 2 2 1 1



                • #9
                  You get the output in #8 with the code in #2.

                  Code:
                  clear
                  input byte(PatientID Visit Day) str18 EQ5D_D byte EQ5D_L
                  1 2 3 "Anxiety_depression" 3
                  1 2 3 "Mobility" 1
                  1 2 3 "Pain_discomfort" 2
                  1 2 3 "Selfcare" 2
                  1 2 3 "Usual_activities" 1
                  1 4 5 "Anxiety_depression" 2
                  1 4 5 "Mobility" 3
                  1 4 5 "Pain_discomfort" 1
                  1 4 5 "Selfcare" 3
                  1 4 5 "Usual_activities" 3
                  3 6 7 "Anxiety_depression" 2
                  3 6 7 "Mobility" 2
                  3 6 7 "Pain_discomfort" 2
                  3 6 7 "Selfcare" 1
                  3 6 7 "Usual_activities" 1
                  end
                  
                  reshape wide EQ5D_L,  i(PatientID Visit Day) j(EQ5D_D) string
                  rename EQ5D_L* *
                  Res.:

                  Code:
                  . l
                  
                       +-------------------------------------------------------------------------------+
                       | Patien~D   Visit   Day   Anxiet~n   Mobility   Pain_d~t   Selfcare   Usual_~s |
                       |-------------------------------------------------------------------------------|
                    1. |        1       2     3          3          1          2          2          1 |
                    2. |        1       4     5          2          3          1          3          3 |
                    3. |        3       6     7          2          2          2          1          1 |
                       +-------------------------------------------------------------------------------+



                  • #10
                    This is still not working for me. I get an error saying that values of variable EQ5D_D are not unique within PatientID Visit Day. I am not sure how to proceed.
                    Last edited by Grace Mongo; 19 Oct 2023, 13:09.



                    • #11
                      It could be an issue of duplicates (in the example below, the repeated row 1 2 3 "Pain_discomfort" 2) or of multiple observations of EQ5D_L per PatientID Visit Day EQ5D_D combination (below, the two 1 4 5 "Pain_discomfort" rows, with values 1 and 2). Duplicates are uninformative, so you can delete them, unless they really do represent two separate measurements. With multiple observations, you can generate a variable that differentiates them (named "which" below).

                      Code:
                      clear
                      input byte(PatientID Visit Day) str18 EQ5D_D byte EQ5D_L
                      1 2 3 "Anxiety_depression" 3
                      1 2 3 "Mobility" 1
                      1 2 3 "Pain_discomfort" 2
                      1 2 3 "Pain_discomfort" 2
                      1 2 3 "Selfcare" 2
                      1 2 3 "Usual_activities" 1
                      1 4 5 "Anxiety_depression" 2
                      1 4 5 "Mobility" 3
                      1 4 5 "Pain_discomfort" 1
                      1 4 5 "Pain_discomfort" 2
                      1 4 5 "Selfcare" 3
                      1 4 5 "Usual_activities" 3
                      3 6 7 "Anxiety_depression" 2
                      3 6 7 "Mobility" 2
                      3 6 7 "Pain_discomfort" 2
                      3 6 7 "Selfcare" 1
                      3 6 7 "Usual_activities" 1
                      end
                      
                      *DELETE THE LINE BELOW IF DUPLICATE OBSERVATIONS ARE INFORMATIVE
                      duplicates drop *, force
                      bys PatientID Visit Day EQ5D_D (EQ5D_L): gen which=_n
                      reshape wide EQ5D_L,  i(PatientID Visit Day which) j(EQ5D_D) string
                      rename EQ5D_L* *
                      Res.:

                      Code:
                      . l
                      
                           +---------------------------------------------------------------------------------------+
                           | Patien~D   Visit   Day   which   Anxiet~n   Mobility   Pain_d~t   Selfcare   Usual_~s |
                           |---------------------------------------------------------------------------------------|
                        1. |        1       2     3       1          3          1          2          2          1 |
                        2. |        1       4     5       1          2          3          1          3          3 |
                        3. |        1       4     5       2          .          .          2          .          . |
                        4. |        3       6     7       1          2          2          2          1          1 |
                           +---------------------------------------------------------------------------------------+
                      
                      .
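
To find out which of the two situations applies in your own data before changing anything, a sketch along these lines may help. It uses a cut-down version of the example above; the variable name ndup is only an illustrative choice.

```stata
clear
input byte(PatientID Visit Day) str18 EQ5D_D byte EQ5D_L
1 2 3 "Mobility"        1
1 2 3 "Pain_discomfort" 2
1 2 3 "Pain_discomfort" 2
1 4 5 "Mobility"        3
1 4 5 "Pain_discomfort" 1
1 4 5 "Pain_discomfort" 2
end

* Summarise how often each PatientID Visit Day EQ5D_D combination occurs
duplicates report PatientID Visit Day EQ5D_D

* Flag combinations that occur more than once (ndup = number of extra copies)
duplicates tag PatientID Visit Day EQ5D_D, generate(ndup)

* Identical EQ5D_L values within a group are exact duplicates;
* differing values are multiple measurements
sort PatientID Visit Day EQ5D_D EQ5D_L
list if ndup > 0, sepby(PatientID Visit Day EQ5D_D)
```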



                      • #12
                        When I run the code, it says 0 observations are duplicates and "n not found". I am stuck.



                        • #13
                          It is _n, not n:

                          bys PatientID Visit Day EQ5D_D (EQ5D_L): gen which=_n
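
As a small illustration on made-up rows (one patient, not the real data): _n is Stata's built-in observation number, and under a by-prefix it counts within each group. A bare n is read as a variable name, hence "n not found" when no such variable exists.

```stata
clear
input byte PatientID str18 EQ5D_D byte EQ5D_L
1 "Mobility"        3
1 "Pain_discomfort" 2
1 "Pain_discomfort" 1
end

* Groups are defined by PatientID and EQ5D_D; the variable in parentheses
* only sets the sort order within each group
bysort PatientID EQ5D_D (EQ5D_L): generate which = _n

list, sepby(PatientID EQ5D_D)
```

Here the two Pain_discomfort rows get which = 1 and which = 2, in order of EQ5D_L, while the single Mobility row gets which = 1.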



                          • #14
                            Yay, it worked. Many thanks
