Identifying specific responses from a list

Ashani Abayasekara

Join Date: May 2023

Posts: 106
#1

01 Jan 2024, 05:55

Hi everyone,

Happy new year!

I have a question about identifying a specific response option from a list of possible responses. The survey question asks respondents to choose from a list of fields of study. There are around 15 options and respondents are asked to select all that apply to them (e.g. science, law, management, economics).

I need to create an indicator variable which equals 1 if a respondent has studied economics (possibly among other fields). Is there a simple way to identify whether economics was chosen? Since respondents can select multiple fields, and the responses are stored as a single string, it's a bit tricky to single out those who have studied economics. Ideally, I'm looking for a command that generates a variable equal to one if economics appears anywhere in the response.

These are some example responses:

1. Agriculture, environment studies, economics
2. Economics, management, accounting
3. Business studies, accounting, economics
4. Natural and physical sciences
5. Management, information technology

I need to generate a variable that would equal one for the first three responses, where the field economics is chosen.

Thank you very much,
Ashani.

Andrew Musau

Join Date: Oct 2014
Posts: 10216

01 Jan 2024, 07:21

See

Code:

help strpos()

Code:

help regexm

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str43 program
"Agriculture, environment studies, economics"
"Economics, management, accounting"          
"Business studies, accounting, economics"    
"Natural and physical sciences"              
"Management, information technology"         
end

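* strpos() returns the position of the first match, or 0 if there is none
gen wanted1 = strpos(lower(program), "economics") > 0
* regexm() returns 1 if the regular expression matches and 0 otherwise
gen wanted2 = regexm(lower(program), "economics")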

Result:

Code:

. l

     +-----------------------------------------------------------------+
     |                                     program   wanted1   wanted2 |
     |-----------------------------------------------------------------|
  1. | Agriculture, environment studies, economics         1         1 |
  2. |           Economics, management, accounting         1         1 |
  3. |     Business studies, accounting, economics         1         1 |
  4. |               Natural and physical sciences         0         0 |
  5. |          Management, information technology         0         0 |
     +-----------------------------------------------------------------+

.


Ashani Abayasekara

Join Date: May 2023

Posts: 106
#3

01 Jan 2024, 09:01

Thanks so much, Andrew. This worked; I used only the regexm() function:

gen econ=(regexm(field, "Econ") == 1)
Nick Cox

Join Date: Mar 2014

Posts: 35724
#4

01 Jan 2024, 09:31

Andrew Musau’s point is misread in #3: his code applies lower() before matching, as you should surely want to catch “econ” as well as “Econ”.

Less importantly, the ==1 part is redundant, as the function already returns 1 or 0.
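
To make both points concrete, here is a minimal sketch that reuses the example data from #2, with the variable named field as in #3. The names econ_cs and econ_ci are made up for illustration.

```stata
* Example data from #2, with the variable renamed to field as in #3
clear
input str43 field
"Agriculture, environment studies, economics"
"Economics, management, accounting"
"Business studies, accounting, economics"
"Natural and physical sciences"
"Management, information technology"
end

* Case-sensitive match: misses lower-case "economics" in observations 1 and 3
gen econ_cs = regexm(field, "Econ")

* Case-insensitive match: lower() normalizes the text before matching
gen econ_ci = regexm(lower(field), "econ")

* The == 1 comparison changes nothing, since regexm() already returns 0 or 1
assert econ_cs == (regexm(field, "Econ") == 1)

list field econ_cs econ_ci
```

On these five observations econ_cs is 1 only for the second response, while econ_ci is 1 for the first three.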
Ashani Abayasekara

Join Date: May 2023

Posts: 106
#5

01 Jan 2024, 22:46

Thanks Nick. Actually all fields appear with the first letter in capitals, so there won't be any "econ" responses. Apologies for the error when typing out the example responses.
Nick Cox

Join Date: Mar 2014

Posts: 35724
#6

02 Jan 2024, 02:27

Andrew Musau’s code remains not only a good reaction to your example, but also better code.

Thanks for the explanation, however!
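
If you want to confirm that capitalization really is consistent in the full dataset, one possible check is to build both indicators and count disagreements. This is only a sketch: the data below are hypothetical (the third response contains a deliberately lower-case "economics"), and the names econ_strict and econ_robust are made up for illustration.

```stata
* Hypothetical data: fields capitalized as described in #5,
* except for a stray lower-case "economics" in observation 3
clear
input str45 field
"Agriculture, Environment studies, Economics"
"Economics, Management, Accounting"
"Business studies, Accounting, economics"
"Natural and physical sciences"
end

* Case-sensitive indicator, as in #3
gen byte econ_strict = regexm(field, "Econ")

* Case-insensitive indicator, following the approach in #2
gen byte econ_robust = strpos(lower(field), "econ") > 0

* Any disagreement points to inconsistent capitalization;
* here only observation 3 differs
count if econ_strict != econ_robust
list field if econ_strict != econ_robust
```

A count of zero in the real data would mean the two approaches give the same answer there. The lower() version simply does not depend on that being true.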