Hi statalist,
I have a question that I am struggling to solve despite looking for help online and experimenting with regexr().
To illustrate my problem, suppose I have variable1 (all in lowercase), which contains a free text response. I want to create a binary variable (variable2) that equals one whenever a combination of keywords is mentioned in variable1.
Suppose I want the condition to be strict, so that variable2 equals one only when the keyword person is mentioned together with either female or male. This is my current code:
gen variable2 = regexm(variable1, "person* & (female | male)*")
I know this is wrong, but I am struggling to figure out the right way to specify what I want.
I would additionally be grateful if you could help me specify the above expression so that it picks up female, male, and person within words like persons, females, males.
Thanks in advance.
Lili.
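One possible approach, offered as a minimal sketch rather than a definitive answer: test each keyword in its own regexm() call and combine the results with Stata's logical operator &. Inside the pattern itself, & and spaces are treated as literal characters and * repeats only the character before it, which is why the attempt above does not behave as intended. The four example rows below are made up for illustration; only variable1 and variable2 come from the post.

```stata
clear
* Hypothetical example data, not from the original post
input str40 variable1
"two persons and both are females"
"a male person"
"one person only"
"females and males"
end

* regexm() returns 1 if the pattern matches anywhere in the string, so
* "person" also matches "persons", and "female|male" also matches
* "females" and "males". The two conditions are joined with & outside the regex.
gen byte variable2 = regexm(variable1, "person") & regexm(variable1, "female|male")

list variable1 variable2
```

With these rows, variable2 should be 1 for the first two observations and 0 for the last two. Because the match is on substrings, longer words such as "personal" would also count; if that matters, the patterns would need to be tightened. Since no regex features beyond alternation are needed here, strpos(variable1, "person") > 0 & strpos(variable1, "male") > 0 would be a plain-string alternative ("female" already contains "male").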