Dear Stata Experts,
I have lab data with two string variables, test name and test result. Each variable holds a comma-separated list, and the tests appear in no fixed order from one observation to the next. For example, test name may contain (al khurma pcr,chikungunya pcr,dengue igg,dengue igm,dengue ns1,dengue pcr,rift valley fever PCR) and test result may contain (not required,not required,not done,positive,not done,detected,not required).
I am interested in three tests (dengue igm, dengue ns1, dengue pcr) and in whether their results are positive or detected.
I therefore used the commands below:
replace testresult = lower(testresult) // convert everything to lowercase to be safe
replace testname = lower(testname)
gen check = ustrregexm(testresult, "positive|detected") // no spaces around |, since items follow the comma directly
gen new_test = ustrregexm(testname, "pcr|ns1|igm")
******************
gen text = ustrregexs(0) if ustrregexm(testname, "dengue igm|dengue ns1|dengue pcr")
The problem is that I can't pair each test name (dengue igm, dengue ns1, dengue pcr) with its test result, because the tests appear at different positions within the test name variable.
I hope I have explained my issue clearly. Thank you for your assistance.
Meshal
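
One possible way to pair names with results, offered as a sketch rather than a definitive solution: split both variables on the comma, reshape to one row per test, and then match by position. This assumes that the nth item in test name corresponds to the nth item in test result, as it appears to in the example above. The variables id, pos, name, result and positive are hypothetical names introduced only for this illustration.

```stata
* Sketch only: assumes testname and testresult list their items in the
* same order, so item n of one belongs to item n of the other.
clear
input str200 testname str200 testresult
"al khurma pcr,chikungunya pcr,dengue igg,dengue igm,dengue ns1,dengue pcr,rift valley fever PCR" "not required,not required,not done,positive,not done,detected,not required"
end

gen long id = _n                          // hypothetical row identifier
replace testname   = lower(testname)
replace testresult = lower(testresult)

* One new variable per comma-separated item: name1, name2, ... and result1, result2, ...
split testname,   parse(",") gen(name)
split testresult, parse(",") gen(result)
drop testname testresult

* One row per test, so each name sits next to its own result
reshape long name result, i(id) j(pos)
replace name   = strtrim(name)
replace result = strtrim(result)
drop if missing(name)

* Keep the three tests of interest and flag positive or detected results
keep if inlist(name, "dengue igm", "dengue ns1", "dengue pcr")
gen byte positive = inlist(result, "positive", "detected")
list id pos name result positive, sepby(id)
```

For the example row, this pairs dengue igm with positive, dengue ns1 with not done, and dengue pcr with detected. Comparing whole items with inlist() instead of a regular expression also avoids counting a value such as "not detected" as a match for "detected". If the two lists can differ in length on some rows, the positional pairing would need to be checked before relying on it.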