Hello everyone,
I have a fairly large dataset (~35000 observations) that is arranged as such:
Patient ID | Date | Variable X |
1 | 1/12/2000 00:00:00 | 100 |
1 | 1/10/1992 00:00:00 | 120 |
2 | 2/15/2003 00:00:00 | 98 |
2 | 1/8/2005 00:00:00 | 79 |
My question is: if I want to keep only the row with the EARLIEST date for each patient and drop all rows with later dates, how can I do that efficiently?
Thank you,
MQR
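
One possible approach, sketched in Python because the post does not say which software is in use. It assumes the dates are month/day/year (the value 2/15/2003 suggests this) and that the columns are patient ID, date, and Variable X as in the table above. The names `rows`, `DATE_FORMAT`, and `keep_earliest` are made up for illustration. The idea is a single pass that remembers the earliest row seen so far for each patient, which is linear in the number of observations and should be quick at ~35000 rows.

```python
from datetime import datetime

# Sample rows copied from the table in the post: (patient_id, date, variable_x).
rows = [
    (1, "1/12/2000 00:00:00", 100),
    (1, "1/10/1992 00:00:00", 120),
    (2, "2/15/2003 00:00:00", 98),
    (2, "1/8/2005 00:00:00", 79),
]

# Assumption: dates are month/day/year followed by a 24-hour time.
DATE_FORMAT = "%m/%d/%Y %H:%M:%S"


def keep_earliest(rows):
    """Return one row per patient: the row with the earliest date."""
    earliest = {}  # patient_id -> (parsed date, original date text, value)
    for patient_id, date_text, value in rows:
        when = datetime.strptime(date_text, DATE_FORMAT)
        # Keep this row only if it is the first for the patient or strictly earlier.
        if patient_id not in earliest or when < earliest[patient_id][0]:
            earliest[patient_id] = (when, date_text, value)
    # Return rows ordered by patient ID, with the date in its original text form.
    return [(pid, text, value) for pid, (_, text, value) in sorted(earliest.items())]


result = keep_earliest(rows)
for row in result:
    print(row)
```

If a patient has two rows with the same earliest date, this sketch keeps whichever appears first in the data. Most statistical packages offer an equivalent built from "sort by patient and date, then keep the first row per patient".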