Dear Statalist members,
I'm using Stata 14. I have a string variable that contains addresses in the form "Neighborhood Municipality".
I need to extract the municipality from the string.
The problem is that both the municipality name and the neighborhood name may consist of more than one word, and no delimiter separates the two parts, so the split is not straightforward.
Data looks something like this:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str56 origin
"neighborhoodA municipalityA"
"neighborhoodBA neighborhoodBB municipalityA"
"neighborhoodA municipalityBA municipalityBB"
"neighborhoodBA neighborhoodBB municipalityBA municipalityBB"
end
I have the full list of municipalities, so I'm using it to identify the municipality name in each string.
So far I've identified the municipalities with one-word names. Now I want to move on to municipalities with two-word names, and so on.
So I think what I need is to check the last word of the string, then the last two words, and so on, and see whether any of them matches an entry in my list of municipalities.
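A minimal sketch of that idea, assuming the municipality list is short enough to hold in a local macro. The names munis, clean, municipality and neighborhood are hypothetical, and the two entries in munis are just the ones from the example data. Rather than testing the last one, two, three words in turn, it tests whether the string ends with each known municipality name at a word boundary and keeps the longest match, which should give the same result:

```stata
* Run from a do-file (the /// line continuations do not work interactively)
clear
input str56 origin
"neighborhoodA municipalityA"
"neighborhoodBA neighborhoodBB municipalityA"
"neighborhoodA municipalityBA municipalityBB"
"neighborhoodBA neighborhoodBB municipalityBA municipalityBB"
end

* Hypothetical municipality list; replace with the real one
local munis `" "municipalityA" "municipalityBA municipalityBB" "'

* Normalise spacing so word boundaries are single blanks
gen str56 clean = strtrim(stritrim(origin))
gen str56 municipality = ""

foreach m of local munis {
    local len = strlen("`m'")
    * Match when the string ends with this name, preceded by a blank
    * (or is the whole string), and the name beats any earlier match in length
    replace municipality = "`m'" ///
        if substr(clean, -`len', .) == "`m'" ///
        & (strlen(clean) == `len' | substr(clean, -`len' - 1, 1) == " ") ///
        & `len' > strlen(municipality)
}

* Whatever precedes the matched municipality is taken as the neighborhood
gen str56 neighborhood = ///
    strtrim(substr(clean, 1, strlen(clean) - strlen(municipality))) ///
    if municipality != ""

list origin neighborhood municipality, noobs
```

Keeping the longest match matters when a one-word municipality name is also the last word of a longer one. With a long list it may be more practical to keep the municipalities in a separate dataset and loop over its observations instead of a macro.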
I've tried the regex functions, but I haven't managed to get them to work for this. Any ideas?
Thanks in advance!