In a foreach loop, I am searching for values within string variables, using strmatch() with asterisk (*) wildcards. I need the asterisks because I'm searching for words that can fall in any part of the string.
These string variables are nested in local macros. However, using * in the foreach loop does not work in Stata if it is part of a nested/descendant macro. Is this because:
A) wildcards within strings can never be used in foreach in Stata when using nested macros, or
B) it isn't the wildcard itself, but the * (asterisk) character that is producing the error in foreach?
Note: I'm working with a large dataset, so calling strmatch() without the foreach loop is not an option, unless there is an alternative to foreach.
Here's an example for drug class Q (parent/ancestor macro), with individual drug lists (descendant macros):
Code:
*chem term list
local drug_list1 " "A*B" "B*A" "A" "
local drug_list2 " "C*D" "D" "

*search term list
local drugclassQ " "drug_list1" "drug_list2" "

*check macro data successfully stored
di `drugclassQ'
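Here's an example of what I want the foreach loop to find, searching within the string variable 'chemical':
Code: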
*Search all drug terms in descriptions
foreach searchterm in "drugclassQ" {
    gen byte `searchterm' = 0
    di "Making column called `searchterm'"
    foreach chemterm in ``searchterm'' {
        di "Checking individual search terms in: `chemterm'"
        foreach indiv in ``chemterm'' {
            di "Searching all columns for *`indiv'*"
            foreach codeterm in lower(variable) {
                di "`searchterm': Checking `codeterm' column for *`indiv'*"
                replace `searchterm' = 1 if strmatch(`codeterm', "*`indiv'*")
            }
        }
    }
}

gen keep_term = .
replace keep_term=1 if drugclassQ==1
keep if keep_term==1
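To make it easier to see what each level of the nested macros expands to, here is a minimal sketch that only prints the individual search terms and does not touch the data. It is an illustration under assumptions, not the code above: it stores the quoted terms with compound double quotes and walks the lists with `foreach ... of local`, both of which differ from how the macros are defined and looped over in my code.

```stata
* Illustrative only: the term lists are stored with compound double quotes
local drug_list1 `" "A*B" "B*A" "A" "'
local drug_list2 `" "C*D" "D" "'

* The parent macro holds the names of the descendant macros
local drugclassQ "drug_list1 drug_list2"

foreach chemterm of local drugclassQ {
    * chemterm holds a macro name, so the inner loop reads that macro
    foreach indiv of local `chemterm' {
        display `"`chemterm' -> `indiv'"'
    }
}
```

If the asterisks survive the expansion, each term should be displayed with its * intact, which would help separate a macro-expansion problem from a strmatch() problem.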
For example, searching on "A*B" within the parent macro drugclassQ should find drugs whose values in the string variable 'chemical' look like the following:
Code:
Amg / Fmg /B
A/B
A/ B/R
Amg/dose / Emg/dose / Bmg/dose
(Note: mg = milligrams. This illustrates why I need to treat the variable as a string, since the drugs are entered into the database in different ways.)
Example output identifying strings with A and B anywhere within the values of 'chemical', where 0 means the observation doesn't fit the search, so I don't keep that observation:
obs | chemical | drugclassQ |
1 | Amg / Fmg /B | 1 |
2 | A/B | 1 |
3 | A/ B/R | 1 |
4 | Amg/dose / Emg/dose / Bmg/dose | 1 |
5 | A | 0 |
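As a reference point, the flags in this table can be sketched for the single pattern "A*B" with one direct strmatch() call, with no loop and no macros. This is only a minimal sketch on the five example values: it assumes the string variable is named chemical, and the variable name match is made up for illustration.

```stata
clear
input str40 chemical
"Amg / Fmg /B"
"A/B"
"A/ B/R"
"Amg/dose / Emg/dose / Bmg/dose"
"A"
end

* 1 if an A is followed somewhere later in the string by a B, else 0
* (strmatch() is case sensitive)
generate byte match = strmatch(chemical, "*A*B*")

list chemical match
```

The first four observations contain an A followed later by a B, so they should be flagged 1, and the last observation should be 0. This is the behaviour I want the nested-macro loop to reproduce for every term in every list.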
My code works when I do not use asterisks, but that defeats the purpose of the loop, i.e., using foreach with wildcards held in nested macros.
Any solutions?
Thanks!