merging 1:1 using unique identifier for multiple admissions

Muhammad Rashid

Join Date: Aug 2018

Posts: 38
#1

merging 1:1 using unique identifier for multiple admissions

06 Jun 2020, 13:36

Hi All

I am looking to merge two datasets. Master dataset has hospital admission information of patients - all can be identified using unique patient id (PatID). the using dataset has procedure information of the patients which again all can be identified by using the same PatID. However, as patients can appear multiple times during the year in both datasets, the observations are not uniqely identified by PatID anymore. I have tried a 1:1 merge using PatID year and month which gives me approx 80% match across both datasets. I am missing some patients as some patients in master dataset admitted at the end of the month may appear in the procedure dataset in the following month. For example. Patient 1 admitted on 29th January in master dataset had a procedure on the 3rd of Feb in the using dataset.
I am interested to keep all observations as I want to study the patients who may be admitted multiple times during the years and have multiple procedures, Hence dont want to remove all duplicates
Any help Please?
below is my command and my data example
clear
input str11 PatID float(year month)

"65203" 2017 9
"72071" 2019 2
"46611" 2018 9
"48441" 2017 11
"44841" 2017 9
"47041" 2018 1
"45093" 2019 7
"46401" 2017 7
"47046" 2018 3
"45022" 2019 3
"45022" 2019 9

Sort PatID year month
merge 1:1 PatID year month using "test.dta"
Hi All

I am looking to merge two datasets. Master dataset has hospital admission information of patients - all can be identified using unique patient id (PatID). the using dataset has procedure information of the patients which again all can be identified by using the same PatID. However, as patients can appear multiple times during the year in both datasets, the observations are not uniqely identified by PatID anymore. I have tried a 1:1 merge using PatID year and month which gives me approx 80% match across both datasets. I am missing some patients as some patients in master dataset admitted at the end of the month may appear in the procedure dataset in the following month. For example. Patient 1 admitted on 29th January in master dataset had a procedure on the 3rd of Feb in the using dataset.
I am interested to keep all observations as I want to study the patients who may be admitted multiple times during the years and have multiple procedures, Hence dont want to remove all duplicates
Any help Please?
below is my command and my data example
clear
input str11 PatID float(year month)

"65203" 2017 9
"72071" 2019 2
"46611" 2018 9
"48441" 2017 11
"44841" 2017 9
"47041" 2018 1
"45093" 2019 7
"46401" 2017 7
"47046" 2018 3
"45022" 2019 3
"45022" 2019 9

Sort PatID year month
merge 1:1 PatID year month using "test.dta"
Tags: None
Rich Goldstein

Join Date: Mar 2014

Posts: 4546
#2

06 Jun 2020, 19:35

I am not completely clear on what you want but first merge, with the master in memory, using something like -merge 1:m ... -; if you, for some reason, need to then get down to one observation per id, you probably want the -collapse- command; see

Code:

help merge help collapse
Comment
Muhammad Rashid

Join Date: Aug 2018

Posts: 38
#3

08 Jun 2020, 13:12

I cannot use collapse as I am interested in merging repeated admissions across both datasets. For example, Person A appears in the master dataset twice in the master data in year 2019 and also appears twice in second data twice in 2019. I want to link both episodes across both datasets in time order.
Comment
Rich Goldstein

Join Date: Mar 2014

Posts: 4546
#4

08 Jun 2020, 13:24

then use merge 1:m ... as I suggested in #2 and do not -collapse-; merge 1:1 will NOT do what you want
Comment

Announcement

merging 1:1 using unique identifier for multiple admissions

Comment

Comment

Comment