  • Put two variables next to each other

    Hello everyone,


    My problem is that I can't seem to put two variables from the same date next to each other.

    Short background: I had a variable (labcode) in which the values 1744 and 1740 correspond to systolic and diastolic blood pressure. I coded these values into new variables, but they don't appear next to each other (on the same date); see the picture for an example.


    I would like to fix this so that I can later keep only one outcome (the latest, based on date).

    I also have to admit that I am very new to Stata and am learning through trial and error. Any help would be appreciated, and please be patient with me. Thank you so much in advance!



    [Attached image: example.png]

  • #2
    What you want to do is referred to as reshaping wide. See

    Code:
    help reshape
    The following is not tested due to lack of a usable data example.

    Code:
    drop SysBD DiasBD
    reshape wide waarde2, i(Pat_id labdatum) j(labcode)
    rename (waarde21744 waarde21740) (SysBD DiasBD)
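
    As a rough illustration of what reshape wide does here, the following sketch uses invented toy data (the values, dates and the helper variable datestr are made up for this example; only the variable names follow the code above). It is not the poster's dataset.

    ```stata
    * Toy data, invented for illustration only
    clear
    input Pat_id str10 datestr labcode waarde2
    1 "2022-01-10" 1744 130
    1 "2022-01-10" 1740 85
    1 "2022-03-02" 1744 128
    1 "2022-03-02" 1740 82
    2 "2022-02-15" 1744 140
    end

    * Convert the string date to a Stata daily date
    gen labdatum = daily(datestr, "YMD")
    format labdatum %td
    drop datestr

    * One row per patient and date, one column per lab code
    reshape wide waarde2, i(Pat_id labdatum) j(labcode)
    rename (waarde21744 waarde21740) (SysBD DiasBD)
    list, sepby(Pat_id)
    ```

    In this toy case patient 2 has no 1740 record, so DiasBD is missing (.) on that row after the reshape.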

    For your future posts, read FAQ Advice #12 on how to provide data examples using the dataex command. Also, see our strong preference for full real names. Click on “Contact us” at the bottom right-hand corner of the page and request that your name be changed.
    Last edited by Andrew Musau; 30 May 2022, 15:09.


    • #3
      From your data example, collapse (firstnm) is likely to work, but see its help.


      • #4
        Dear Andrew and Nick,

        Thanks for your help! I really appreciate it.


        Andrew Musau: Thanks for the information. I have tried to read through the rules about posting, but this is my first post, so my apologies. I will try to get in touch about a name change (my first name is correct, however).


        About your code: I tried it, but Stata gives the following error:


        ". reshape wide waarde, i(patient_ozps labdatum) j(labcode)
        (note: j = 1740 1744)

        values of variable labcode not unique within patient_ozps labdatum
        Your data are currently long. You are performing a reshape wide. You
        specified i(patient_ozps labdatum) and j(labcode). There are
        observations within i(patient_ozps labdatum) with the same value of
        j(labcode). In the long data, variables i() and j() together must
        uniquely identify the observations.

             long                                wide
            +---------------+                   +------------------+
            | i   j   a   b |                   | i   a1 a2  b1 b2 |
            |---------------| <--- reshape ---> |------------------|
            | 1   1   1   2 |                   | 1   1   3   2  4 |
            | 1   2   3   4 |                   | 2   5   7   6  8 |
            | 2   1   5   6 |                   +------------------+
            | 2   2   7   8 |
            +---------------+
        Type reshape error for a list of the problem variables.

        r(9); "

        I don't understand this error, as the values of labcode are indeed unique within patient_ozps (formerly set as Patient_ID). There may be some missing values, but I thought Stata automatically replaces these with a dot (.)? Is it possible that the different values of "labdatum" cause this error? Thanks again for your help!


        @Nick: I don't understand your suggestion, as I thought the collapse command converts data to medians, means, maxima and minima? Or am I missing something?


        • #5
          The error message implies that you have duplicates of the variables patient_ozps, labdatum and labcode. Perhaps the same patient got two or more BP readings on the same day.

          Code:
          bys patient_ozps  labdatum labcode: gen tag=_N>=2
          list if tag, sepby(patient_ozps  labdatum labcode)
          If that is the case, you need to create a third variable that identifies which reading it is within the day. If on the other hand these are duplicated values and contain no additional information, you can


          Code:
          duplicates drop patient_ozps  labdatum labcode, force
          before proceeding to the code in #2. For Nick's suggestion, given your data structure in #1:

          Code:
          collapse (firstnm) SysBD DiasBD, by(patient_ozps  labdatum)
          See the firstnm (first nonmissing) statistic of collapse in

          Code:
          help collapse
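
          For the first case above, where the same-day repeats are genuine separate readings, a minimal sketch of the "third variable" approach might look like this. It has not been run on your data. The name reading is hypothetical, and the sketch assumes the values are stored in waarde2 as in #2. Without a time-of-day variable, the order of readings within a day is arbitrary, so pairing the first systolic reading with the first diastolic reading is itself an assumption.

          ```stata
          * Remove the earlier recoded variables if they exist
          capture drop SysBD DiasBD

          * Number the readings within patient, date and lab code
          * (reading is a hypothetical name; order within ties is arbitrary)
          bysort patient_ozps labdatum labcode: gen reading = _n

          * i() now includes the reading number, so i() and j() identify observations
          reshape wide waarde2, i(patient_ozps labdatum reading) j(labcode)
          rename (waarde21744 waarde21740) (SysBD DiasBD)
          ```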


          • #6
            The hint was in (firstnm) in collapse (firstnm) -- as Andrew Musau helpfully explained.
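
            To make that concrete, here is a small sketch on invented toy data (the values and dates are made up; only the variable names follow #5). With (firstnm), collapse keeps the first nonmissing value of each variable within each group instead of computing a mean.

            ```stata
            * Toy data in the layout of #1: each row holds only one of the two readings
            clear
            input patient_ozps labdatum SysBD DiasBD
            1 22660 130   .
            1 22660   .  85
            1 22706 128   .
            1 22706   .  82
            end
            format labdatum %td

            * Keep the first nonmissing value of each variable per patient and date
            collapse (firstnm) SysBD DiasBD, by(patient_ozps labdatum)
            list
            ```

            The four toy rows become two, each with both SysBD and DiasBD filled in.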

            The modesty here does you credit, but calling yourself Dim member may create an expectation that you will ask a slightly dopey question. Use a real name and no one will really notice. Slightly dopey questions are what every learner -- which means all of us -- may have.

            I have been using Stata for >30 years and still cultivate large areas of ignorance about major parts of the software.


            • #7
              You are right Andrew, thank you so much. I thought the values were already means/averages of multiple measurements on the same day, but as you can see, this is not always the case.

              The next step would be to take the average of these multiple measurements on the same day, so that only one measurement per day remains.


              • #8
                Originally posted by Nick Cox
                The hint was in (firstnm) in collapse (firstnm) -- as Andrew Musau helpfully explained.

                The modesty here does you credit, but calling yourself Dim member may create an expectation that you will ask a slightly dopey question. Use a real name and no one will really notice. Slightly dopey questions are what every learner -- which means all of us -- may have.

                I have been using Stata for >30 years and still cultivate large areas of ignorance about major parts of the software.
                Thank you Nick, I really appreciate your and Andrew's help a lot! It was not my intention to create confusion with my username (Dim is my real first name; "member" of course is not), and I have already contacted the moderators about a change.

                I will now read about the collapse command. I was confused because I used collapse in the past for something else and, as I said before, I am very new to Stata. It also doesn't help that my background is in healthcare rather than coding, so I am still learning by trial and error. I hope you and Andrew know how grateful I am for your help; I am certainly not trying to waste your time.


                • #9
                  Originally posted by Dim member
                  The next step would be to take the average of these multiple measurements on the same day and have only one measurement.
                  So

                  Code:
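                  * remove the earlier recoded variables
                  drop SysBD DiasBD
                  * collapse defaults to the mean: average same-day readings per lab code
                  collapse waarde2, by(patient_ozps labdatum labcode)
                  * one row per patient and date, one column per lab code
                  reshape wide waarde2, i(patient_ozps labdatum) j(labcode)
                  * give the new columns readable names
                  rename (waarde21744 waarde21740) (SysBD DiasBD)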


                  • #10
                    #8 OK -- in addition, using your full name may make it clear that your first language is not English, which can be helpful in various subtle ways.


                    • #11
                      Hey Andrew and Nick,

                      I got a lot further with the help of both of you. I cannot express how grateful I am!

                      Now that I have my variables defined as SysBD and DiasBD and placed under the correct observation date, I can keep the most recent observation. Thanks to another topic here, I have found the right code to do so.

                      I did it as follows:

                      Code:
                      drop if SysBD==.
                      drop if DiasBD==.
                      This deletes the observations where Stata created missing values because one of SysBD or DiasBD was not measured on a specific date. It also served as an extra check (and I was right: about 500 observations were actually deleted).

                      Code:
                      by patient_ozps (labdatum), sort: keep if _n == _N
                      Because I deleted the missing values first, I now have the latest values that are complete.
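
                      The same two steps can also be written a little more compactly. This is only a sketch of an alternative, not a correction: missing() treats all of Stata's missing codes (., .a, ..., .z) as missing, and the isid line at the end simply checks that exactly one observation per patient is left (it stops with an error otherwise).

                      ```stata
                      * Drop any row where either reading is missing
                      drop if missing(SysBD, DiasBD)

                      * Within each patient, sort by date and keep the last (most recent) row
                      bysort patient_ozps (labdatum): keep if _n == _N

                      * Sanity check: patient_ozps should now uniquely identify observations
                      isid patient_ozps
                      ```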


                      It feels so good when the code works and does what is needed. Once again, I cannot express how grateful I am to you gentlemen. My next adventure begins tomorrow, when I try to find out how to merge this dataset with another one I have been working on!
