Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Multiple lines of observations for one identification number

    I have a dataset (example attached) in which the same identification number occurs on multiple lines and has multiple observations. In the example I have three important variables- LOS, culture, and organism. For this example data set patient number 23 started in the emergency department and then was eventually admitted, making them count as an "admitted patient". I need to count how many admitted patients had a blood or sputum culture performed. Unfortunately, patient number 23 had the cultures performed in the emergency department and thus the positive value for "culture" does not occur on the same observation line as LOS (indicating that they were admitted). Is there any way to collapse the variables in my dataset and only keep the ones that at some point had an positive admission as well as a blood or sputum culture?
    Attached Files

  • #2
    Hello,

    You could use:

    Code:
    sort ID LOS
    quielty by ID: gen dup=cond(_N==1,0,_n)
    and after drop the dpulicates that you want.

    Comment


    • #3
      I am currently using STATA 13 and the "quietly" command is unrecognized. Is there another option to try?

      Comment


      • #4
        Here are a couple of ways:

        Code:
        egen ntests = total(trim(Culture) =="Blood" | trim(Culture) == "Sputum"), by(ID)
        egen ntests2 = total(trim(Culture) != "-"), by(ID)
        You can keep cases where ntests > 0.

        Note that the coding of Culture is inconsistent in your example, e.g. you have both "Blood" and "Blood ". That is why I used the trim function. But if there are other inconsistencies (e.g. "blood" "BLOOD") you have to code for them. Perhaps my 2nd coding will always work and be safer. Or maybe you want to use numeric codes for culture, so you don't have to worry about different spellings.
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        Stata Version: 17.0 MP (2 processor)

        EMAIL: [email protected]
        WWW: https://www3.nd.edu/~rwilliam

        Comment


        • #5
          The first one worked well, thank you! I did forget to mention there are other types of tests (urine etc) that are also options for the "culture" variable, however I am only interested in blood and sputum, is there any way to get rid of these observations and only look at patients with blood and sputum cultures at any point during the hospital stay?

          Comment


          • #6
            Following Richard's example:

            Code:
            keep if ntests > 0
            That will keep the observations where ntests has a positive number. The observations for patients who didn't have any blood or sputum cultures will be dropped.
            Alfonso Sanchez-Penalver

            Comment


            • #7
              Just modify the code to

              Code:
              egen ntests = total(trim(Culture) =="Blood" | trim(Culture) == "Sputum"), by(ID)
              keep if ntests > 0
              Again, watch out for spelling inconsistencies for Blood and Sputum.

              Incidentally your example is very helpful. It would be even better if it included records for a case that should be dropped, so you could test to make sure such cases do get excluded.

              Another variation:

              Code:
              gen hadtest = trim(Culture) =="Blood" | trim(Culture) == "Sputum"
              egen ntests = total(hadtest), by(ID)
              keep if ntests > 0
              Listing your data,

              Code:
              . list
              
                   +-------------------------------------------------------------------------------------+
                   | ID    Date          Service      Unit   LOS   Culture   Organism   hadtest   ntests |
                   |-------------------------------------------------------------------------------------|
                1. | 23   1-Jan   Emergency Dept         -     -     Blood         SP         1        3 |
                2. | 23   1-Jan   Emergency Dept         -     -    Blood          SP         1        3 |
                3. | 23   1-Jan   Emergency Dept         -     -    Sputum         SP         1        3 |
                4. | 23   1-Jan   Emergency Dept         -     -         -          -         0        3 |
                5. | 23   1-Jan   Emergency Dept         -     -         -          -         0        3 |
                   |-------------------------------------------------------------------------------------|
                6. | 23   1-Jan         Hospital   General     1         -          -         0        3 |
                7. | 23   1-Jan         Hospital   General     1         -          -         0        3 |
                   +-------------------------------------------------------------------------------------+
              Either way should work, but this might make it easier for you to make sure that individual records are getting coded correctly. I wouldn't drop any cases before I was sure that my coding was working correctly, e.g. don't accidentally drop "blood".
              Last edited by Richard Williams; 30 Jul 2014, 08:07.
              -------------------------------------------
              Richard Williams, Notre Dame Dept of Sociology
              Stata Version: 17.0 MP (2 processor)

              EMAIL: [email protected]
              WWW: https://www3.nd.edu/~rwilliam

              Comment


              • #8
                Great- one final question. For visits that occurred on the same date, is there any way to make a "visit number" variable? For example although a patient might have six lines of observations they would have just one visit, and then maybe came back three months later and had another visit? Within that "visit" variable I need to determine who had a blood or sputum culture and who at any point was admitted. I foresee three variables "visit number", "culture type", and "admission" (yes/no) for each unique identification number.

                Comment


                • #9
                  Great- one final question. For visits that occurred on the same date, is there any way to make a "visit number" variable? For example although a patient might have six lines of observations they would have just one visit, and then maybe came back three months later and had another visit? Within that "visit" variable I need to determine who had a blood or sputum culture and who at any point was admitted. I foresee three variables "visit number", "culture type", and "admission" (yes/no) for each unique identification number.

                  Comment


                  • #10
                    What counts as "admitted?" LOS = 1? Or Service = "Hospital"? Or Unit = "General" ? Explain how to code admitted for an individual record.
                    -------------------------------------------
                    Richard Williams, Notre Dame Dept of Sociology
                    Stata Version: 17.0 MP (2 processor)

                    EMAIL: [email protected]
                    WWW: https://www3.nd.edu/~rwilliam

                    Comment


                    • #11
                      Originally posted by MeganD View Post
                      I am currently using STATA 13 and the "quietly" command is unrecognized. Is there another option to try?
                      MeganD, the original code of Pierre Elias had a typo where the word quietly was misspelled. It would be very strange if Stata refused to perform this command, since it is built-in:

                      Code:
                      . which qui
                      built-in command:  quietly
                      If you are spelling it correctly and Stata refuses to perform it, use the Installation Qualification Tool from StataCorp to see if there are any problems with your installation.

                      For your question also see if standard duplicates command can be useful.

                      But I'd follow your original description of what needs to be done quite literally.

                      I need to count how many admitted patients ....
                      So create a dataset of IDs of admitted patients in the first step (by dropping non-admitted patients and saving to a tempfile).

                      ...patients had a blood or sputum culture performed.
                      So keep in the original file only the test results you are interested in. Then merge with the result of admission, drop all non-merged (not admitted), and you should arrive to your destination.

                      Yes, this method involves tempfile, merges etc. But it can be easier to explain, debug, and it follows your thinking. Plus you may need the list of admitted patients for other tasks anyway.

                      Best, Sergiy Radyakin.

                      Comment


                      • #12
                        Am LOS (length of stay) >= 1 constitutes as admitted

                        Comment


                        • #13
                          I can't make a file of only admitted patients because often the observation row that denotes a patient as being admitted (LOS>=1) does not contain information about the type of culture they received. Basically I need to know if any row pertaining to the same identification of an admitted person number at any point references a blood or sputum culture most likely in a previous row. The total () function is not working.

                          Comment


                          • #14
                            Disregard that last post. I now have values for hadtest and ntest, but I still need to figure out how to find the people who "hadtest" were admitted?

                            Comment


                            • #15
                              I think this works but for God's sake test it. Your use of string variables makes things a little clunky and increases the potential for coding errors.

                              Code:
                              * Preserve original order of records
                              gen nrec = _n
                              label variable nrec "Original record # before any sorting"
                              * Generate vars that will ID records with desired characteristics
                              * Make sure these codes are correct!!!
                              gen hadtest = trim(Culture) =="Blood" | trim(Culture) == "Sputum"
                              generate LOSnumber = real(LOS)
                              gen admit = LOSnumber >= 1 & !missing(LOSnumber)
                              label variable hadtest "Dir R have a blood or sputum test?"
                              label variable admit "Was R admitted?"
                              * Generate visit Number. Coding is clunky but functional
                              egen visit = group(ID Date)
                              tempvar xvisit
                              bysort ID (Date): gen `xvisit' = visit[1]
                              bysort ID (Date): replace visit = visit - `xvisit' + 1
                              label variable visit "Visit #"
                              * Add to every record for a case indicators of 
                              * whether any of the tests of interest were done
                              * and if the respondent was admitted
                              egen ntests = total(hadtest), by(ID visit)
                              label variable ntests "Number of tests R had on this visit"
                              egen admitted = max(admit), by(ID visit)
                              label variable admitted "Was R admitted on this visit?"
                              sort ID visit Date
                              -------------------------------------------
                              Richard Williams, Notre Dame Dept of Sociology
                              Stata Version: 17.0 MP (2 processor)

                              EMAIL: [email protected]
                              WWW: https://www3.nd.edu/~rwilliam

                              Comment

                              Working...
                              X