Hi,
My question relates to the maximum macro length with levelsof. I hope it is not redundant; I went through the previous posts on levelsof but could not find a solution that addresses the issue I'm facing. I have a dataset containing 8,000 firm names. Some firm names include a location. For example:
For each firm name, I want to extract the location and store it in a variable called city. To do this, I downloaded a separate dataset from geonames.org that lists city names worldwide, and ran the following code:
The code works as intended when I consider only cities with more than 40,000 inhabitants. However, when I consider all cities, I get the following error: "macro substitution results in line that is too long", r(920). This makes sense, as -help levelsof- warns: "levelsof may hit the limits imposed by Stata. However, it is typically used when the number of distinct values of varname is modest." How can I work around this constraint in my particular case? I tried Method 1 suggested by Nick Cox in the FAQ https://www.stata.com/support/faqs/data-management/try-all-values-with-foreach/index.html but I am not sure how it would apply here, given that the list of cities I want to extract is in a separate dataset.
Any suggestion is welcome, many thanks for the help!
Code (example data):
* Example generated by -dataex-. For more info, type help dataex
clear
input str165 comp_extracted
"Abbontiakoon Mines "
"Aberdeen Commercial "
"Aberdeen Gas "
"Aberdeen Steam "
"Aberdeen Leith and Clyde Steam Shipping "
"Abford Estates"
"Abosso "
"Aboukir Company "
"Agricultural Hall "
"Alexandra (Newport) Dock "
"Alliance and Dublin Consumers Gas"
"Alliance and Dublin Consumers Gas Consolidated Ordinary "
end
Code (city extraction):
clear all
use "$user/Firm_names", clear

// Store a list of cities in a macro using levelsof
preserve
clear all
use "/$user/company_names/geonames-all-cities-with-a-population-1000.dta", clear
keep if Population > 40000
levelsof City, local(cityname_list_temp)
restore

// Extract cities from firm names
gen city = ""
foreach keyword of local cityname_list_temp {
    // Match the city name as a whole word, optionally wrapped in parentheses
    local pattern "\(?\b`keyword'\b\)?"
    quietly replace city = ustrregexs(0) if ustrregexm(comp_extracted, "`pattern'")
}
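One direction that might avoid the macro length limit is to skip levelsof and read the city names one observation at a time from a second frame, so that only one name is held in a macro at any point. The following is a minimal sketch, not a verified solution. It assumes Stata 16 or later (for frames), and the frame name cities is a hypothetical name chosen for illustration. The file paths and variable names are the ones from the code above.

```stata
* Sketch only: assumes Stata 16+ (frames). The frame name "cities" is hypothetical.
use "$user/Firm_names", clear
gen city = ""

* Load the distinct city names into a separate frame instead of a macro
frame create cities
frame cities {
    use City using "/$user/company_names/geonames-all-cities-with-a-population-1000.dta", clear
    duplicates drop City, force
}

* Number of distinct city names held in the cities frame
frame cities: local n = _N

* Loop over the city names by observation number, one name at a time
forvalues i = 1/`n' {
    frame cities: local keyword = City[`i']
    local pattern "\(?\b`keyword'\b\)?"
    quietly replace city = ustrregexs(0) if ustrregexm(comp_extracted, "`pattern'")
}
```

As in the levelsof version, a later match overwrites an earlier one, and city names containing regex metacharacters or double quotes would need escaping first. With every city in the file, the loop runs one regex pass over the firm names per city, so it may be slow.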