Hi. My dataset consists entirely of string variables. The variable Event mentions countries along with other information for each observation. I want to create a new variable called Country that extracts from Event all countries listed in the local "country". A few things about the variable Event:
1. A single observation may carry more than one country, e.g. poland; romania; spain
2. The word "outside" is common to many country entries, for example: worldwide outside japan, worldwide outside australasia and s e asia. Is there a way for me to parse out an entire phrase that contains "outside" and is separated by a ; from the next phrase? E.g. in "outside the us and japan; allergy", I want the variable Country to carry only "outside the us and japan".
Any help on this will be very much appreciated. Thank you
Code:
use "$datadir\Country_reshaped.dta", clear
gen country=lower(CountryName)
levelsof country, local(country)
`"argentina"' `"australia"' `"austria"' `"belgium"' `"brazil"' `"canada"' `"chile"' `"china"' `"colombia"' `"denmark"' `"finland"'
`"france"' `"germany"' `"greece"' `"hong kong"' `"india"' `"ireland"' `"israel"' `"italy"' `"japan"' `"luxembourg"' `"malaysia"'
`"mexico"' `"netherlands"' `"new zealand"' `"norway"' `"peru"' `"philippines"' `"portugal"' `"russian federation"' `"south africa"'
`"south korea"' `"spain"' `"sweden"' `"switzerland"' `"thailand"' `"turkey"' `"uk"' `"usa"' `"venezuela"'

clear
use "$datadir\KeyEvents_reshaped.dta", clear
gen event=lower(EventDetails)

clear
input str60 event
"preclinical"
"the us; analgesic, other"
"pain, neuropathic"
"new"
"worldwide"
"poland; romania; spain"
"canada and russia; conjunctivitis, allergic"
"nas; germany"
"amersham health"
"the us, 20030731"
"the us, 20030630"
"the eu, follicular lymphoma"
end
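
One possible approach to the two rules above, offered as a rough sketch rather than a tested solution: split event on ";" and, for each piece, either keep the whole piece when it contains "outside" or append every name from the local "country" that appears in it as a whole word. It assumes the local is still defined in the same session or do-file run, and that event is lowercase as in the example. The part* variables and the regular expression are illustrative choices, not something from the original code.

```stata
* Sketch only: assumes local `country' (from levelsof) and variable event exist
gen Country = ""

* Break event into pieces at each ";" (creates part1, part2, ...)
split event, parse(";") gen(part)

foreach p of varlist part* {
    replace `p' = trim(`p')

    * Rule 2: keep the entire phrase when it contains "outside"
    replace Country = Country + cond(Country == "", "", "; ") + `p' ///
        if strpos(`p', "outside") > 0

    * Rule 1: otherwise append each listed country found as a whole word
    foreach c of local country {
        replace Country = Country + cond(Country == "", "", "; ") + "`c'" ///
            if strpos(`p', "outside") == 0 & ///
            regexm(`p', "(^|[^a-z])`c'([^a-z]|$)")
    }
}

drop part*
```

Only names that appear in the local are picked up, so entries such as "the us", "russia", "poland" or "romania" in the example data would be missed unless the list is extended or aliases are added. The whole-word pattern is there so that a short name like "uk" is not matched inside a longer word.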