Hello All,
I have what I assume is a simple problem, but I cannot work out what is going wrong or find a helpful answer on the forum.
I am trying to match a cohort of individuals who were arrested to a file of individuals who were prosecuted. There is a standard number that should identify each case (status_no), but it is often duplicated. The best way to match is on status_no and last name, but because these are two different data sets, the spellings of last names often differ, so an exact merge on both variables is not reliable.
I am trying to use the required() option of reclink to force an exact match on status_no while allowing a fuzzy match on last name.
reclink lastname status_no using "prosecution.dta", idmaster(id_arrest) idusing(id_case) required(status_no) _merge(testmergge) gen(testoutput) minscore(.85)
240578 perfect matches found
Added: id= identifier from prosecution.dta testoutput = matching score
Observations: Master N = 241113 prosecution.dta N= 243817
Unique Master Cases: matched = 241113 (exact = 240578), unmatched = 0)
However, when I check the result by comparing status_no from the master file with Ustatus_no (the copy that reclink brings in from the using file), they are completely different. I am clearly misunderstanding something about how the required() option works; any help would be much appreciated!
And NB: I did try using status_no both as a string and as a numeric variable.
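The check described above can be sketched as follows. This is a minimal sketch under stated assumptions, not necessarily the exact commands used: it assumes the reclink command above has just been run, that the using file's copies of the matching variables are stored as Ustatus_no and Ulastname, that testmergge equals 3 for linked pairs, and that status_no and Ustatus_no have the same storage type (both string or both numeric).

```stata
* Count linked pairs whose status_no differs between the master and using copies
count if testmergge == 3 & status_no != Ustatus_no

* Among the first 100 observations, list the disagreeing pairs with their match scores
list status_no Ustatus_no lastname Ulastname testoutput if testmergge == 3 & status_no != Ustatus_no in 1/100
```

If required(status_no) were enforcing an exact match on status_no, the count would be expected to be 0.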
Thank you!