Hi all, first post. I'm using Stata 15.1 on a Mac. I'm working with health claims data, and my goal is to produce each patient's multimorbidity count using one or two generated variables. I've included toy data below to help.
claim = claim number
id = patient id
dx1 - dx4 = medical diagnosis
For each patient id, I want Stata to scan variables dx1 dx2 dx3 dx4 across all of the patient's claims and return a count of the number of specific diagnoses that appear at least once.
For example, let's say diagnoses A, B, and C are of interest. Patient 1 has been given an A, B, and C diagnosis at least once, so I would want the command to return the value of "3" for patient 1. Patient 2 has been given the diagnosis of A and B at least once, but not C, so I would want the command to return the value of "2".
I've looked through a number of egen possibilities, including tag, but none of them seem to help. Any suggestions?
Code:
clear
input str2 claim float id str1(dx1 dx2 dx3 dx4)
"1"  1 "C" "T" "E" "B"
"2"  1 "C" "U" "E" "B"
"3"  1 "C" "F" "E" ""
"4"  1 "B" "A" "E" ""
"5"  1 "B" "A" ""  ""
"6"  1 "C" "E" ""  ""
"7"  1 "D" ""  ""  ""
"8"  1 "D" ""  ""  ""
"9"  2 "D" "B" "T" "E"
"10" 2 "D" "B" "T" "E"
"11" 2 "D" "T" "E" ""
"12" 2 "A" "T" ""  ""
"13" 2 "B" "T" ""  ""
"14" 2 "B" ""  ""  ""
"15" 2 "B" ""  ""  ""
end
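To make the target concrete, here is a minimal sketch of one way the count might be built. It assumes the toy data above is in memory and that A, B, and C are the diagnoses of interest. The names n_dx, has, ever, and first are hypothetical, chosen only for illustration, and this is not necessarily the best approach for large claims files.

```stata
* Diagnoses of interest (assumption: A, B, and C, as in the example)
local targets A B C

* n_dx is a hypothetical name for the per-patient count
generate byte n_dx = 0

foreach t of local targets {
    * 1 on any claim where this diagnosis appears in dx1-dx4
    generate byte has = inlist("`t'", dx1, dx2, dx3, dx4)
    * 1 on every claim of a patient who has it on at least one claim
    egen byte ever = max(has), by(id)
    replace n_dx = n_dx + ever
    drop has ever
}

* Show one row per patient
egen byte first = tag(id)
list id n_dx if first, noobs
```

With the toy data, this should give n_dx = 3 for patient 1 and n_dx = 2 for patient 2, matching the example described above. The count is repeated on every claim of a patient, so the tag() step is only there to display one row per id.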