Dear Statalist,
Below is an email I sent to Michael Blasnik, author of -reclink-, about a problem I am having. Would you please share any advice you might have regarding my issue?
Thanks,
Dom
I am working on a project at the Minnesota Population Center which is attempting to link various people to their information in the 1940 U.S. census. I am attempting to do this using -reclink-. I have first and last names, parents' first and last names, as well as state of birth, which I am trying to match on.
I am using match weights and non-match weights (though for some reason I cannot use both weights at the same time). I am or-blocking on the first letter of the last name, and I am and-blocking (using the -required- option) on state of birth.
The strange thing, though, is that people are getting linked to completely wrong observations. Even though I or-block on the first letter of the last name, people whose last names start with different letters are getting linked, and the same happens with and-blocking on state of birth. In fact, everyone is getting linked to a person in the State of Alabama, which is the first state in sort order in the census (the using dataset). I am guessing that the algorithm goes line by line through the using dataset and, since the census has millions of observations, it is stopping right in Alabama.
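To make clear what I expect the blocking to do, here is a minimal sketch in Python. It is not -reclink-'s actual code, only my understanding of the intended behavior, and the field names (`last_initial`, `birth_state`) and the sample records are made up for illustration. A using record should be a candidate only if it agrees on every required field and on at least one or-block field.

```python
def candidate_pairs(master, using,
                    required=("birth_state",),
                    orblock=("last_initial",)):
    """Yield (master_row, using_row) pairs that pass both blocking rules."""
    for m in master:
        for u in using:
            # and-blocking: every required field must agree exactly
            if any(m[f] != u[f] for f in required):
                continue
            # or-blocking: at least one or-block field must agree
            if orblock and not any(m[f] == u[f] for f in orblock):
                continue
            yield m, u


# Hypothetical records, for illustration only
master = [{"id": 1, "last_initial": "S", "birth_state": "Minnesota"}]
using = [
    {"id": 10, "last_initial": "S", "birth_state": "Alabama"},    # wrong state
    {"id": 11, "last_initial": "T", "birth_state": "Minnesota"},  # wrong initial
    {"id": 12, "last_initial": "S", "birth_state": "Minnesota"},  # passes both
]

pairs = [(m["id"], u["id"]) for m, u in candidate_pairs(master, using)]
print(pairs)
```

Under these rules, a master record born in Minnesota should never be compared with, let alone linked to, a using record from Alabama, which is why the results I am getting look wrong to me.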
Any advice on what I could do to obtain successful matches?