Hello again, Stata members,
I used the code below on lab data for two scenarios (Step 1 and Step 2). The problem is in the last line: when I merged the two datasets, no observations matched, even though both steps use the same code apart from the reoccurrence coding.
I would appreciate your assistance.
Code:
// Step 1: Clean Data
clear
cd "/Users/meshalalqhtani/Dropbox/Annual Epi report_Updated/Dengue/new data extraction_2023"
import excel "lab_2023_DHF.xlsx", sheet("Sheet1") firstrow case(lower)

// Manage duplicates and text
duplicates report candidatenid
rename candidatenid id
duplicates tag id, gen(dup)
tab dup
replace testresult = lower(testresult)
replace teststatus = lower(teststatus)
replace testname = lower(testname)

// Parse and filter dates
split resultdate, parse(,) limit(3)
gen r_date=date(resultdate1,"YMD")
format %td r_date
keep if r_date > dmy(01,01,2023) & r_date < dmy(31,12,2023)
keep if dup == 0

// Identify "dengue"
split testname, p(,)
split testresult, p(,)
rename (testname testresult) orig_=
reshape long testname testresult, i(id) j(testnum)
drop if testname == ""
replace testname = lower(trim(testname))
replace testresult = lower(trim(testresult))
gen byte is_dengue = inlist(testname, "dengue igm", "dengue ns1", "dengue pcr")
gen byte is_positive = inlist(testresult, "positive", "detected")
gen byte is_dengue_positive = is_dengue * is_positive
egen wanted2 = max(is_dengue_positive), by(id)
drop testname testresult is_* testnum
duplicates drop
rename orig_* *

// Keep dengue records and save
keep if wanted2 == 1
save "dengue_test.dta", replace

// Step 2: Flagging Reoccurring Dengue Cases
clear
import excel "lab_2023_DHF.xlsx", sheet("Sheet1") firstrow case(lower)

// Manage text and dates
replace testresult = lower(testresult)
replace teststatus = lower(teststatus)
replace testname = lower(testname)
split resultdate, parse(,) limit(3)
gen r_date=date(resultdate1,"YMD")
format %td r_date
keep if r_date > dmy(01,01,2023) & r_date < dmy(31,12,2023)
rename r_date date_stata
ren candidatenid id
sort id date_stata

// Calculate date differences
by id: gen lag_date = date_stata[_n-1]
gen date_diff = date_stata - lag_date
gen flag_reoccurrence = (date_diff >= 14 & !missing(date_diff))
by id: egen id_flag_reoccurrence = max(flag_reoccurrence)

// Keep flagged reoccurrences
keep if flag_reoccurrence == 1
bys id: gen int seq = _n
tab seq
keep if seq == 1

// Identify "dengue" again
split testname, p(,)
split testresult, p(,)
rename (testname testresult) orig_=
reshape long testname testresult, i(id) j(testnum)
drop if testname == ""
replace testname = lower(trim(testname))
replace testresult = lower(trim(testresult))
gen byte is_dengue = inlist(testname, "dengue igm", "dengue ns1", "dengue pcr")
gen byte is_positive = inlist(testresult, "positive", "detected")
gen byte is_dengue_positive = is_dengue * is_positive
egen wanted2 = max(is_dengue_positive), by(id)
drop testname testresult is_* testnum
duplicates drop
rename orig_* *

// Keep dengue records and merge with cleaned data
keep if wanted2 == 1
merge m:1 id using "dengue_test.dta"

// Check merge results
tab _merge