Hi,
I have to calculate the one-year incidence of a disease, and for that I have two databases (a laboratory database and a hospital database). For example, some people had one test recorded in the laboratory database and later another test in the hospital database, and the results of those two tests may differ (one positive and the other negative). I merged the two databases and then used these commands to identify the duplicates:
"sort NAME YEAR
quietly by NAME YEAR: gen dup = cond(_N==1,0,_n)
tab dup"
If the two records of a duplicate pair have different RESULT values (one negative and one positive), I would like to delete the record where RESULT is coded negative and keep the one where it is coded positive. For duplicates where RESULT is the same (negative and negative, or positive and positive), I just want to delete one of the records, without any condition.
Does anyone know how I can code this?