Hi,
Could I possibly have some help identifying observations that lie closest in time to another observation for the same individual?
I have a large dataset in long format. Within this I have repeated observations for individuals (individual identified by "patid"). One observation for each individual refers to the date of an MI. The others are all kidney function readings on different dates. What I would like to do, is to a) identify the most recent kidney function reading for each individual, prior to the date of their MI, and b) to determine how many days prior to the MI this reading was taken.
I was wondering whether this might be better in wide format, but some individuals have >150 kidney function readings so STATA rather gives up on this idea.
(Apologies if this has been discussed elsewhere - I don't know what command(s) I'm looking for, so rather hard to search for advice on it!)
Dataex example:
input double patid float(dup obsdate currentMI CKD)
2920018 1 19652 1 0
2920018 2 22102 0 0
14820018 1 19428 0 1
14820018 2 19444 0 1
14820018 3 19514 1 0
14820018 4 19523 0 1
14820018 5 19795 0 1
14820018 6 19795 0 0
14820018 7 19816 0 1
14820018 8 19844 0 1
14820018 9 19845 0 1
14820018 10 20166 0 1
14820018 11 20255 0 1
14820018 12 20527 0 1
14820018 13 20769 0 1
14820018 14 20886 0 1
14820018 15 20958 0 1
14820018 16 21083 0 1
14820018 17 21111 0 1
14820018 18 21175 0 1
14820018 19 21256 0 1
14820018 20 21532 0 1
14820018 21 21672 0 1
14820018 22 22042 0 1
14820018 23 22043 0 1
14820018 24 22069 0 1
For example, in the dataex above, the first patid has only two observation dates (one for the date of the MI and the other for the date of the kidney function reading (CKD)), whereas the 2nd observation has 23 kidney function readings.
Thanks in advance for any help received.
JEmima
Could I possibly have some help identifying observations that lie closest in time to another observation for the same individual?
I have a large dataset in long format. Within this I have repeated observations for individuals (individual identified by "patid"). One observation for each individual refers to the date of an MI. The others are all kidney function readings on different dates. What I would like to do, is to a) identify the most recent kidney function reading for each individual, prior to the date of their MI, and b) to determine how many days prior to the MI this reading was taken.
I was wondering whether this might be better in wide format, but some individuals have >150 kidney function readings so STATA rather gives up on this idea.
(Apologies if this has been discussed elsewhere - I don't know what command(s) I'm looking for, so rather hard to search for advice on it!)
Dataex example:
input double patid float(dup obsdate currentMI CKD)
2920018 1 19652 1 0
2920018 2 22102 0 0
14820018 1 19428 0 1
14820018 2 19444 0 1
14820018 3 19514 1 0
14820018 4 19523 0 1
14820018 5 19795 0 1
14820018 6 19795 0 0
14820018 7 19816 0 1
14820018 8 19844 0 1
14820018 9 19845 0 1
14820018 10 20166 0 1
14820018 11 20255 0 1
14820018 12 20527 0 1
14820018 13 20769 0 1
14820018 14 20886 0 1
14820018 15 20958 0 1
14820018 16 21083 0 1
14820018 17 21111 0 1
14820018 18 21175 0 1
14820018 19 21256 0 1
14820018 20 21532 0 1
14820018 21 21672 0 1
14820018 22 22042 0 1
14820018 23 22043 0 1
14820018 24 22069 0 1
For example, in the dataex above, the first patid has only two observation dates (one for the date of the MI and the other for the date of the kidney function reading (CKD)), whereas the 2nd observation has 23 kidney function readings.
Thanks in advance for any help received.
JEmima
Comment