Hi everyone
I have a dataset which contains 25 icd-10 diagnostic codes per observation. Some people have multiple visits, thus id appears multiple times.
I want to first count the number of unique icd codes per patient but somehow my tag double counts codes which appear more than once for the same id.
Secondly I need to sum each icd code (condition e.g. diabetes) for the whole same. Please see below sample of data
. dataex
----------------------- copy starting from the next line -----------------------
------------------ copy up to and including the previous line ------------------
Listed 12 out of 12 observations
I have tried a few examples from the help file but still did not get any to work. Please advice
gen long id = _n
keep id Dx* SubjectId
reshape long Dx, i(id) j(_j)
egen diabetes = max(Dx == "E119"), by(id)
Also tried
gen long order = _n
by Dx1 (order), sort: gen uniquecodes = _n==1
list SubjectId Dx1 _n uniquecodes
sort order
replace uniquecodes = sum (uniquecodes)
the second option does count uniquecodes correctly and doesn't double count within the same id. But once I use a loop to apply the same method to D1-Dx5 counts are wrong.
Your assistance will be highly appreciated.
I have a dataset which contains 25 icd-10 diagnostic codes per observation. Some people have multiple visits, thus id appears multiple times.
I want to first count the number of unique icd codes per patient but somehow my tag double counts codes which appear more than once for the same id.
Secondly I need to sum each icd code (condition e.g. diabetes) for the whole same. Please see below sample of data
. dataex
----------------------- copy starting from the next line -----------------------
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input double SubjectId str7(Dx1 Dx2 Dx3 Dx4) str5 Dx5 100 "K528" "E119" " " " " " " 101 "K228" "K20" "J960" " " "I251" 101 "I251" "I10" "E780" "E119" "N998" 101 "J960" "J440" "E112" "I10" "E780" 101 "K228" "K20" "K259" "B968" " " 101 "B964" "E119" "E039" " " " " 101 "J440" "B956" "B964" "T814" "D649" 101 "T857" "I251" "E119" "I10" "E780" 103 "G410" "F29" "I698" "M625" " " 104 "C73" "C770" "J441" " " " " 105 "J154" " " " " " " " " 105 "J188" " " " " "J188" " " end
Listed 12 out of 12 observations
I have tried a few examples from the help file but still did not get any to work. Please advice
gen long id = _n
keep id Dx* SubjectId
reshape long Dx, i(id) j(_j)
egen diabetes = max(Dx == "E119"), by(id)
Also tried
gen long order = _n
by Dx1 (order), sort: gen uniquecodes = _n==1
list SubjectId Dx1 _n uniquecodes
sort order
replace uniquecodes = sum (uniquecodes)
the second option does count uniquecodes correctly and doesn't double count within the same id. But once I use a loop to apply the same method to D1-Dx5 counts are wrong.
Your assistance will be highly appreciated.
Comment