Generate and replace multiple variables based on original variables.

Felix Quinones

Join Date: Nov 2017
Posts: 6

08 May 2020, 07:13

Hello Statalist,

Hope everyone is doing fine in the middle of this pandemic.

I'm working with a data set from a questionnaire that has 70 questions on a 5-point Likert scale: "Strongly Disagree", "Disagree", "Neutral", "Agree", and "Strongly Agree". When the survey was developed, some of the answer options were misspelled or capitalized inconsistently, so there are multiple values for the same option (Strongly Disagree, Strongly disagree, strongly disagree, etc.). I'm fairly new to Stata and do not know how to work with loops and macros successfully.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str14(q8 q9) str17(q10 q11)
""               ""               ""                  ""                 
"Neutral"        "Disagree"       "Neutral"           "Agree"            
"Strongly agree" "Strongly Agree" "Agree"             "Agree"            
"Agree"          "Strongly Agree" "Agree"             "Agree"            
"Agree"          "Agree"          "Neutral"           "Neutral"          
"Agree"          "Disagree"       "Neutral"           "Neutral"          
"Agree"          "Disagree"       "Agree"             "Neutral"          
"Neutral"        "Neutral"        "Strongly Disagree" "Strongly disagree"
"Neutral"        "Neutral"        "Neutral"           "Neutral"          
"Neutral"        "Neutral"        "Neutral"           "Neutral"          
"Agree"          "Agree"          "Disagree"          "Neutral"          
"Agree"          "Agree"          "Agree"             "Agree"            
""               ""               ""                  ""                 
"Agree"          "Agree"          "Neutral"           "Neutral"          
"Strongly agree" "Strongly agree" "Strongly agree"    "Strongly agree"   
"Agree"          "Agree"          "Agree"             "Agree"            
"Agree"          "Agree"          "Agree"             "Agree"            
"Strongly agree" "Strongly agree" "Strongly agree"    "Strongly agree"   
"Agree"          "Agree"          "Agree"             "Agree"            
"Neutral"        "Agree"          "Agree"             "Strongly disagree"
""               ""               ""                  ""                 
"Agree"          "Agree"          "Agree"             "Agree"            
"Agree"          "Agree"          "Agree"             "Agree"            
"Neutral"        "Agree"          "Agree"             "Agree"            
"Strongly agree" "Strongly agree" "Strongly agree"    "Strongly agree"   
"Strongly agree" "Strongly agree" "Strongly agree"    "Strongly agree"   
"Agree"          "Strongly agree" "Agree"             "Agree"            
"Strongly agree" "Strongly agree" "Strongly agree"    "Strongly agree"   
"Strongly agree" "Strongly agree" "Strongly agree"    "Strongly agree"   
"Strongly agree" "Strongly agree" "Strongly agree"    "Strongly agree"   
end

My intention is to generate the same number of new variables and fill them with the correct numeric values, according to the string values in the old variables. Done one by one, it would be something like:

gen Q8=.
replace Q8 =1 if q8=="Strongly Disagree"
replace Q8 =1 if q8=="Strongly disagree"
replace Q8 =1 if q8=="strongly disagree"
replace Q8 =2 if q8=="disagree"
replace Q8 =2 if q8=="Disagree"
replace Q8 =3 if q8=="Neutral"
replace Q8 =3 if q8=="Neautral"
replace Q8 =4 if q8=="Agree"
replace Q8 =4 if q8=="agree"
replace Q8 =5 if q8=="Strongly agree"
replace Q8 =5 if q8=="Strongly Agree"
replace Q8 =5 if q8=="strongly agree"
label define agree 1 "Strongly Disagree" 2 "Disagree" 3 "Neutral" 4 "Agree" 5 "Strongly Agree", replace
label values Q8 agree

I'm trying to do something similar with macros and loops. The following were my unsuccessful steps.

Step 1. (Successful)

forvalues i = 8/78 {
gen var`i'=.

}

Step 2: (Successful)

local continuous q8-q78

Step 3: (Unsuccessful)

foreach var of varlist var8-var78 {
replace `var'=1 if `continuous'=="Strongly Disagree"
replace `var'=2 if `continuous'==" Disagree"
replace `var'=3 if `continuous'=="Neutral"
replace `var'=3 if `continuous'=="Neautral"
replace `var'=4 if `continuous'=="Agree"
replace `var'=4 if `continuous'=="agree"
replace `var'=5 if `continuous'=="Strongly agree"
replace `var'=5 if `continuous'=="Strongly Agree"

}

label define agree 1 "Strongly Disagree" 2 "Disagree" 3 "Neutral" 4 "Agree" 5 "Strongly Agree", replace
label values Q8-Q59 agree

What would you recommend so that I don't have to go one by one?
Thanks in advance.

Felix.


Rich Goldstein

Join Date: Mar 2014

Posts: 4470
#2

08 May 2020, 07:32

I'm a little confused but I think the following, which does not produce new variables, will at least get you started:

Code:

foreach var of varlist q* {
    replace `var' = proper(`var')
}

At that point you can encode these. If you define a label first and include the label() option in the -encode- command, all will be "numbered" consistently.
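
A minimal sketch of that -encode- step, assuming the string variables all match q* and have already been passed through proper(). The label name agree comes from the first post; the new variable names (Q8, Q9, ...) and the noextend option are assumptions added here for illustration.

```stata
* Define the value label once, with the codes in the intended order
label define agree 1 "Strongly Disagree" 2 "Disagree" 3 "Neutral" 4 "Agree" 5 "Strongly Agree", replace

* Encode each string variable into a new numeric variable that uses this label.
* noextend makes -encode- stop with an error if it meets a string that is not
* in the label (for example, a misspelling) rather than adding a new category.
foreach var of varlist q* {
    local newvar = upper("`var'")   // hypothetical naming: q8 -> Q8
    encode `var', generate(`newvar') label(agree) noextend
}
```

Because Stata variable names are case-sensitive, Q8 and q8 can coexist, so the original strings stay available for checking.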
Felix Quinones

Join Date: Nov 2017

Posts: 6
#3

08 May 2020, 07:44

Originally posted by Rich Goldstein

I'm a little confused but I think the following, which does not produce new variables, will at least get you started:

Code:

foreach var of varlist q* {
    replace `var' = proper(`var')
}

At that point you can encode these. If you define a label first and include the label() option in the -encode- command, all will be "numbered" consistently.

Thank you, Rich, for your quick answer. What I'm trying to do is create new categorical variables and fill them with the correct category according to the string value in the old variables. I tried the -multencode- command to encode all the variables, but that carries the errors in the values of the old variables over as well. As I understand it, the best way to fix the problem is to create a new categorical variable and fill it according to the string values of the corresponding old variable.

Hope that is a little clearer.
Rich Goldstein

Join Date: Mar 2014

Posts: 4470
#4

08 May 2020, 08:15

Did you try the code I suggested? That should make the capitalization differences all go away. There may be other issues, but they are not obvious in your example data.
Felix Quinones

Join Date: Nov 2017

Posts: 6
#5

08 May 2020, 09:07

Originally posted by Rich Goldstein

Did you try the code I suggested? That should make the capitalization differences all go away. There may be other issues, but they are not obvious in your example data.

Yes, thank you. That works to fix the capitalization. Now I'm dealing with the misspellings.
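
One way the misspellings might be handled, sketched under the assumption that "Neautral" (the only misspelling that appears in the code in the first post) is the variant to correct; any other variants in the full data would need their own replace line. The use of trim() to remove stray leading or trailing spaces is also an assumption, prompted by the " Disagree" value in the earlier loop.

```stata
* Standardize spacing and capitalization, then correct known misspellings
foreach var of varlist q* {
    replace `var' = proper(trim(`var'))
    replace `var' = "Neutral" if `var' == "Neautral"
}

* Tabulate each variable to see whether any unexpected answers remain
foreach var of varlist q* {
    tabulate `var'
}
```

If the tabulations show only the five intended answers, the variables are ready to be encoded against a single value label.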