Hello,
I have a dataset containg hospital billing data. 257000 observations to be exact. I have a variable called, ID. This variable uniquely identifies patients. Each row is a billed charge so there are mutliple rows per ID. I have another variable called, Department. This variable contains a string value denoting the type of service the patient received and was charged for.
Example data:
ID Department
1 MRI
1 PT/OT
1 Surgery
2 Office
2 PTOT
2 MRI
3 PTOT
3 PTOT
4 Surgery
5 MRI
5 PTOT
5 PTOT
I need to subset my dataset to all patients that have received an MRI and analyze the other types of services (denoted in Department) that they have received. I would like to keep all same IDs if one observation is Department=="MRI". So, for example, In the example dataset, I would keep all of the observatrions for IDs: 1, 2, & 5. This would include keeping the observations where ID 1=="PT/OT", as example.
My elementary approach has been:
gen tagger="1" if Department=="MRI"
sort ID
replace tagger="1" if ID==ID[_n-1] & tag=="1"[_n-1]
I usually repeat this until all IDs have been tagged. I know this is not the most efficient way to do this. Please advise.
I have a dataset containg hospital billing data. 257000 observations to be exact. I have a variable called, ID. This variable uniquely identifies patients. Each row is a billed charge so there are mutliple rows per ID. I have another variable called, Department. This variable contains a string value denoting the type of service the patient received and was charged for.
Example data:
ID Department
1 MRI
1 PT/OT
1 Surgery
2 Office
2 PTOT
2 MRI
3 PTOT
3 PTOT
4 Surgery
5 MRI
5 PTOT
5 PTOT
I need to subset my dataset to all patients that have received an MRI and analyze the other types of services (denoted in Department) that they have received. I would like to keep all same IDs if one observation is Department=="MRI". So, for example, In the example dataset, I would keep all of the observatrions for IDs: 1, 2, & 5. This would include keeping the observations where ID 1=="PT/OT", as example.
My elementary approach has been:
gen tagger="1" if Department=="MRI"
sort ID
replace tagger="1" if ID==ID[_n-1] & tag=="1"[_n-1]
I usually repeat this until all IDs have been tagged. I know this is not the most efficient way to do this. Please advise.
Comment