Hi all, is there a way to merge / match by the ID variable and the closest date?
I have data from two forms - an antenatal and last desk form. Participants in this study have multiple visits, and normally complete the antenatal form and the last desk form on the same day, making it easy to merge by the id variable plus the visit date. (The ID variable is rida on the antenatal form and ridl on the last desk form- they are the same for a given person). However, sometimes the last desk form is completed on a later date and when I try to merge the two forms together, the data for these visits ends up on two lines (i.e. does not merge) - see the final four lines in the example below for two people who completed their antenatal and last desk forms completed on different days. My goal is to have all the variables for that visit date combo appear in one line (not two).
visdta = visit date for antenatal form; visdtl = visit date for the last desk form
* Example generated by -dataex-. To install: ssc install dataex
clear
input long rida int visdta double bpavgsysa long ridl int visdtl byte highbpdl
910384 22064 102.5 910384 22064 2
910384 22078 91.5 910384 22078 2
910385 21994 97.5 910385 21994 2
910385 22025 98.5 910385 22025 2
910385 22053 115.5 910385 22053 2
910385 22078 106 910385 22078 2
910386 21997 103.5 910386 21997 2
910386 22025 105 910386 22025 2
910386 22054 109.5 910386 22054 2
910386 22083 105.5 910386 22083 2
910384 22013 98.5 . . .
. . . 910384 22017 2
910385 22119 105.5 . . .
. . . 910385 22124 2
end
format %tddd-Mon-YY visdta
format %tddd-Mon-YY visdtl
[/CODE]
I tried the anymatch command (after copying the ridl value over to the rida slot and the visdtl over to the visdta slot):
replace ride = ridl if ride==.
replace visdta = visdtl if visdta==.
anymatch visdta, id(rida) metric(visdta) near(1)
but got the error "observations not uniquely identified by "rida" r(198);
Any ideas for how to do this, either with anymatch or another command?
Thanks,
~Cristina
I have data from two forms - an antenatal and last desk form. Participants in this study have multiple visits, and normally complete the antenatal form and the last desk form on the same day, making it easy to merge by the id variable plus the visit date. (The ID variable is rida on the antenatal form and ridl on the last desk form- they are the same for a given person). However, sometimes the last desk form is completed on a later date and when I try to merge the two forms together, the data for these visits ends up on two lines (i.e. does not merge) - see the final four lines in the example below for two people who completed their antenatal and last desk forms completed on different days. My goal is to have all the variables for that visit date combo appear in one line (not two).
visdta = visit date for antenatal form; visdtl = visit date for the last desk form
* Example generated by -dataex-. To install: ssc install dataex
clear
input long rida int visdta double bpavgsysa long ridl int visdtl byte highbpdl
910384 22064 102.5 910384 22064 2
910384 22078 91.5 910384 22078 2
910385 21994 97.5 910385 21994 2
910385 22025 98.5 910385 22025 2
910385 22053 115.5 910385 22053 2
910385 22078 106 910385 22078 2
910386 21997 103.5 910386 21997 2
910386 22025 105 910386 22025 2
910386 22054 109.5 910386 22054 2
910386 22083 105.5 910386 22083 2
910384 22013 98.5 . . .
. . . 910384 22017 2
910385 22119 105.5 . . .
. . . 910385 22124 2
end
format %tddd-Mon-YY visdta
format %tddd-Mon-YY visdtl
[/CODE]
I tried the anymatch command (after copying the ridl value over to the rida slot and the visdtl over to the visdta slot):
replace ride = ridl if ride==.
replace visdta = visdtl if visdta==.
anymatch visdta, id(rida) metric(visdta) near(1)
but got the error "observations not uniquely identified by "rida" r(198);
Any ideas for how to do this, either with anymatch or another command?
Thanks,
~Cristina
Comment