Dear Stata users,
I have a database about patents. In this database the individual’s address is written in the following format:
address1, address2, address3 and address4 (e.g., Euclid’s Road 15258; P. O. Box 15269; Boston, MA; MA). address3 and address4 may be missing.
I want to create a variable called state that holds the name of the state, inferred from address2, address3 and address4.
My idea is to tell Stata to fill in the variable whenever specific words are found in address2, address3 or address4. For example: set state to Massachusetts if the word Boston or MA (or both) appears in at least one of address2, address3 and address4. Since the entries are not identical (i.e., "Boston, MA" != "Boston" != "MA"), a simple if condition that tests for equality does not seem to be the right way to attack this problem.
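To make the intent concrete, here is a rough sketch of what I am imagining. It assumes address2 to address4 are string variables, and it uses strpos() and regexm() to look for the words rather than testing for equality. The pattern for MA is only my guess at how to match the abbreviation as a whole word; it is not tested against the real data.

```stata
* Sketch only: assumes address2-address4 are string variables
generate str20 state = ""

foreach v of varlist address2 address3 address4 {
    * "Boston" anywhere in the entry, or "MA" not surrounded by other letters
    replace state = "Massachusetts" if state == "" & ///
        (strpos(`v', "Boston") > 0 | ///
         regexm(`v', "(^|[^A-Za-z])MA([^A-Za-z]|$)"))
}
```

Doing this for every state would need one such rule per state, so a lookup of city names and abbreviations would probably be needed rather than hard-coded words.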
Is there a way to do this in Stata?
Thank you for your time!