  • How do I drop all dates except the earliest date for specific ID variables

    I have a dataset in which each patient has a specific ID number. The same ID is recorded again every time the patient has been admitted to the hospital, so there are several rows with the same ID but different dates. I only need the first admission, so I need to figure out how to drop all variables with the same ID except the one with the first admission. I hope that makes sense and that you can help. Thank you.

    Example:
    ID: 0000012354 Admissiondate: 04 mar 13
    ID: 0000012354 Admissiondate: 07 jun 14
    ID: 0000012354 Admissiondate: 12 jan 16

    All the best
    Guest
    Last edited by sladmin; 28 Jan 2019, 15:08. Reason: anonymize original poster

  • #2
    When you say variables, do you mean observations? At a guess that would be

    Code:
    bysort ID (Admissiondate) : keep if _n == 1
    but that's destructive, so make sure you have a copy of the data saved before you try that. I am assuming your date variable is a Stata numeric date variable, which isn't explicit.
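
    A slightly more defensive version of the same idea, as a minimal sketch. It assumes the variable names ID and Admissiondate from the example, and that a string date looks like "04 mar 13" with every year in the 2000s; the name admit is made up for illustration.

    ```stata
    * Sketch only: ID and Admissiondate are the names used in the example above.
    * Work on a copy so the original observations can be recovered.
    preserve

    * If the date was read in as text such as "04 mar 13", build a numeric daily date.
    * The "DM20Y" mask assumes two-digit years all belong to 20xx.
    capture confirm string variable Admissiondate
    if _rc == 0 {
        generate admit = date(Admissiondate, "DM20Y")
        format admit %td
    }
    else {
        clonevar admit = Admissiondate
    }

    * Stop if any date is missing, since missing dates sort last, not first.
    assert !missing(admit)

    * Within each ID, sort by date and keep the first observation.
    bysort ID (admit) : keep if _n == 1

    * ... inspect or save the result here, then return to the full data.
    restore
    ```

    If one ID has two admissions on the same earliest date, which of the two is kept is arbitrary, so add a tie-breaking variable inside the parentheses if that matters.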



    • #3
      Spot on! Thank you so much sir



      • #4
        I have another question that I hope you can help me with, sir.
        I now have a dataset with almost 3 million patient admissions over a timespan of several years. Every ID again has several observations, because every patient may have had several admissions or registrations at various departments. In some department registrations they have the diagnosis that I am interested in, and in some they do not. Some patients do not have any registered diagnosis at all, but I still need them to contribute to the analysis as not having the outcome. Again, one patient ID appears many times because of registration at various departments. I therefore need to drop all observations for an ID except one (the earliest) if the patient has no diagnosis, and to keep only the observation(s) with a diagnosis if the patient has one or several diagnoses.

        Example:
        ID: 0000012354 Diagnosis Code: D862 Admissiondate: 04 mar 13
        ID: 0000012354 Diagnosis Code: . Admissiondate: 07 jun 14
        ID: 0000012354 Diagnosis Code: . Admissiondate: 08 aug 14
        ID: 0000012354 Diagnosis Code: C425 Admissiondate: 21 dec 14
        ID: 0000043567 Diagnosis Code: . Admissiondate: 03 jan 13
        ID: 0000043567 Diagnosis Code: G700 Admissiondate: 16 dec 14
        ID: 0000043567 Diagnosis Code: G700 Admissiondate: 21 dec 14
        ID: 0000093231 Diagnosis Code: C243 Admissiondate: 01 may 16
        ID: 0000074333 Diagnosis Code: . Admissiondate: 01 may 16
        ID: 0000074333 Diagnosis Code: . Admissiondate: 01 may 17
        ID: 0000074333 Diagnosis Code: . Admissiondate: 01 may 18
        ID: 0000074333 Diagnosis Code: . Admissiondate: 07 may 18

        Hope you can help again, thank you
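
        Written out as code, the rule I am describing might look like the sketch below. It is only a guess at the logic, not a tested solution. It assumes Diagnosis is a string variable in which a missing code is stored as an empty string or as ".", and that Admissiondate is a numeric Stata daily date; hasdx and anydx are made-up helper names.

        ```stata
        * Sketch only: variable names follow the example listing above.
        * Flag observations that carry a diagnosis code (assumes a string variable).
        generate byte hasdx = !inlist(trim(Diagnosis), "", ".")

        * Flag patients who have at least one diagnosed observation.
        bysort ID (Admissiondate) : egen byte anydx = max(hasdx)

        * Diagnosed patients: keep every observation with a diagnosis.
        * Undiagnosed patients: keep only the earliest observation.
        bysort ID (Admissiondate) : keep if (anydx & hasdx) | (!anydx & _n == 1)
        ```

        On the example data this would keep the D862 and C425 rows for 0000012354, both G700 rows for 0000043567, the single row for 0000093231, and only the 01 may 16 row for 0000074333.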



        • #5
          I recommend starting a new thread on this topic.

          Please make sure you have read the FAQ and applied the recommendations on how to share data.
          Best regards,

          Marcos
