I am combining multiple administrative data sets and would be happy to receive input on a sensible approach. I cannot share a data example because the data are sensitive, so I made up two data sets to illustrate the issue.
Say we have two individual-level data sets with data registered monthly:
- Prescription drugs with values A, B, C. n = 10000.
- Emergency room contacts with contact reasons. n = 2000.
There are
- Multiple entries per ID, possibly with multiple entries in the same month.
- Some IDs may be registered in the prescription database but not in the emergency room database, and vice versa.
-merge 1:1-, -merge 1:m-, and -merge m:1- on ID all return
Code:
variable ID does not uniquely identify observations in the using data r(459)
Merging on ID YM results in the same error and also makes no sense, as it would match prescription dates with emergency room contact dates.
Judging from the Stata documentation and earlier posts, -merge m:m- does not seem advisable. I am currently looking at -joinby- as an alternative to -merge-, but I am not certain whether it is the right tool either.
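To make the -joinby- idea concrete, here is a minimal sketch on toy data. The variable names (drug, reason, rx_YM, er_YM) and all values are made up for illustration and do not come from the real registers. It assumes the goal is every prescription paired with every emergency room contact for the same ID, which may not be what is wanted.

```stata
* Toy data only: names and values are hypothetical, not the real registers
clear
input ID YM str1 drug
1 720 "A"
1 720 "B"
1 721 "C"
2 720 "A"
end
format YM %tm
rename YM rx_YM              // keep the prescription month distinct
tempfile rx
save `rx'

clear
input ID YM str10 reason
1 720 "injury"
1 722 "chest pain"
3 721 "fever"
end
format YM %tm
rename YM er_YM              // keep the contact month distinct
tempfile er
save `er'

* ID repeats in both files, which is why merge 1:1, 1:m and m:1 fail
duplicates report ID

* joinby forms all pairwise combinations within ID;
* unmatched(both) keeps IDs that appear in only one of the files
use `rx', clear
joinby ID using `er', unmatched(both)
tabulate _merge
list, sepby(ID)
```

In this toy example ID 1 has 3 prescriptions and 2 contacts, so -joinby- produces 3 x 2 = 6 rows for that ID. With the real data the number of rows could grow quickly, so it may be worth restricting the pairs afterwards, for example by comparing rx_YM and er_YM.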