  • Generate and replace multiple variables based on original variables.

    Hello Statalist,

    Hope everyone is doing fine in the middle of this pandemic.

    I'm working with a data set from a questionnaire that has 70 questions on a 5-point Likert scale: "Strongly Disagree, Disagree, Neutral, Agree, and Strongly Agree". When the survey was developed, some of the answer options were misspelled, so multiple values exist for the same option (Strongly Disagree, Strongly disagree, strongly disagree, etc.). I'm fairly new to Stata and do not yet know how to work successfully with loops and macros.


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str14(q8 q9) str17(q10 q11)
    ""               ""               ""                  ""                 
    "Neutral"        "Disagree"       "Neutral"           "Agree"            
    "Strongly agree" "Strongly Agree" "Agree"             "Agree"            
    "Agree"          "Strongly Agree" "Agree"             "Agree"            
    "Agree"          "Agree"          "Neutral"           "Neutral"          
    "Agree"          "Disagree"       "Neutral"           "Neutral"          
    "Agree"          "Disagree"       "Agree"             "Neutral"          
    "Neutral"        "Neutral"        "Strongly Disagree" "Strongly disagree"
    "Neutral"        "Neutral"        "Neutral"           "Neutral"          
    "Neutral"        "Neutral"        "Neutral"           "Neutral"          
    "Agree"          "Agree"          "Disagree"          "Neutral"          
    "Agree"          "Agree"          "Agree"             "Agree"            
    ""               ""               ""                  ""                 
    "Agree"          "Agree"          "Neutral"           "Neutral"          
    "Strongly agree" "Strongly agree" "Strongly agree"    "Strongly agree"   
    "Agree"          "Agree"          "Agree"             "Agree"            
    "Agree"          "Agree"          "Agree"             "Agree"            
    "Strongly agree" "Strongly agree" "Strongly agree"    "Strongly agree"   
    "Agree"          "Agree"          "Agree"             "Agree"            
    "Neutral"        "Agree"          "Agree"             "Strongly disagree"
    ""               ""               ""                  ""                 
    "Agree"          "Agree"          "Agree"             "Agree"            
    "Agree"          "Agree"          "Agree"             "Agree"            
    "Neutral"        "Agree"          "Agree"             "Agree"            
    "Strongly agree" "Strongly agree" "Strongly agree"    "Strongly agree"   
    "Strongly agree" "Strongly agree" "Strongly agree"    "Strongly agree"   
    "Agree"          "Strongly agree" "Agree"             "Agree"            
    "Strongly agree" "Strongly agree" "Strongly agree"    "Strongly agree"   
    "Strongly agree" "Strongly agree" "Strongly agree"    "Strongly agree"   
    "Strongly agree" "Strongly agree" "Strongly agree"    "Strongly agree"   
    end

    My intention is to generate the same number of new variables and fill them with the correct numeric values, according to the string values in the old variables. Done one variable at a time, it would look something like this:

    gen Q8=.
    replace Q8 =1 if q8=="Strongly Disagree"
    replace Q8 =1 if q8=="Strongly disagree"
    replace Q8 =1 if q8=="strongly disagree"
    replace Q8 =2 if q8=="disagree"
    replace Q8 =2 if q8=="Disagree"
    replace Q8 =3 if q8=="Neutral"
    replace Q8 =3 if q8=="Neautral"
    replace Q8 =4 if q8=="Agree"
    replace Q8 =4 if q8=="agree"
    replace Q8 =5 if q8=="Strongly agree"
    replace Q8 =5 if q8=="Strongly Agree"
    replace Q8 =5 if q8=="strongly agree"
    label define agree 1 "Strongly Disagree" 2 "Disagree" 3 "Neutral" 4 "Agree" 5 "Strongly Agree", replace
    label values Q8 agree


    I'm trying to do something similar with macros and loops. The following were my unsuccessful steps.

    Step 1. (Successful)

    forvalues i = 8/78 {
    gen var`i'=.

    }


    Step 2: (Successful)

    local continuous q8-q78

    Step 3: (Unsuccessful)

    foreach var of varlist var8-var78 {
    replace `var'=1 if `continuous'=="Strongly Disagree"
    replace `var'=2 if `continuous'==" Disagree"
    replace `var'=3 if `continuous'=="Neutral"
    replace `var'=3 if `continuous'=="Neautral"
    replace `var'=4 if `continuous'=="Agree"
    replace `var'=4 if `continuous'=="agree"
    replace `var'=5 if `continuous'=="Strongly agree"
    replace `var'=5 if `continuous'=="Strongly Agree"

    }

    label define agree 1 "Strongly Disagree" 2 "Disagree" 3 "Neutral" 4 "Agree" 5 "Strongly Agree", replace
    label values Q8-Q59 agree



    What would you recommend so that I don't have to go through the variables one by one?
    Thanks in advance.

    Felix.




  • #2
    I'm a little confused but I think the following, which does not produce new variables, will at least get you started:
    Code:
    foreach var of varlist q* {
        replace `var' = proper(`var')
    }
    At that point you can encode these; if you define a value label first and pass it in the label() option of the -encode- command, all variables will be "numbered" consistently.
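
The suggestion in #2 might look roughly like the sketch below. It is only an illustration under stated assumptions: the three-row toy data stand in for the real survey, the new variable names (Q8, Q9) are hypothetical, and strtrim() assumes a reasonably recent Stata. The label agree is the one defined in the original post.

```stata
* Toy data standing in for the real survey (hypothetical values)
clear
input str17(q8 q9)
"Strongly agree" "Strongly Agree"
"neutral"        "Disagree"
""               "agree"
end

* Define the target coding once, before encoding
label define agree 1 "Strongly Disagree" 2 "Disagree" 3 "Neutral" 4 "Agree" 5 "Strongly Agree", replace

foreach var of varlist q8-q9 {
    * Normalize capitalization and strip leading/trailing blanks
    replace `var' = proper(strtrim(`var'))
    * Name of the new numeric variable, e.g. q8 -> Q8 (hypothetical naming)
    local new = upper("`var'")
    * noextend makes -encode- stop with an error if a value is not in the label
    encode `var', generate(`new') label(agree) noextend
}
```

Because of noextend, a leftover misspelling such as "Neautral" stops the loop with an error instead of silently receiving a sixth code, so the remaining variants surface one variable at a time. Empty strings become numeric missing.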



    • #3
      Originally posted by Rich Goldstein
      I'm a little confused but I think the following, which does not produce new variables, will at least get you started:
      Code:
      foreach var of varlist q* {
          replace `var' = proper(`var')
      }
      At that point you can encode these; if you define a value label first and pass it in the label() option of the -encode- command, all variables will be "numbered" consistently.

      Thank you, Rich, for your quick answer. What I'm trying to do is create new categorical variables and fill them with the correct category according to the string values in the old variables. I tried the -multencode- command to encode all the variables, but that carries the errors in the old variables' values over as well. As I understand it, the best way to fix the problem is to create a new categorical variable and fill it according to the string values of the corresponding old variable.

      I hope that is a little clearer.
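
The approach described above (one new numeric variable per old string variable) could be sketched with a loop over the question number, so that each q`i' is paired with its own Q`i'. This is a minimal sketch, not a definitive solution: the toy data and the range 8/9 are assumptions for illustration (the real data would use 8/78), and comparing on a lowercased, trimmed copy of the string is a swapped-in simplification of the one-by-one replace statements in the original post.

```stata
* Toy data standing in for the real survey (hypothetical values)
clear
input str17(q8 q9)
"Strongly disagree" "agree"
"Neautral"          "Strongly Agree"
""                  "Disagree"
end

label define agree 1 "Strongly Disagree" 2 "Disagree" 3 "Neutral" 4 "Agree" 5 "Strongly Agree", replace

forvalues i = 8/9 {
    * The macro holds an expression: the lowercased, trimmed old variable
    local s lower(strtrim(q`i'))
    generate byte Q`i' = .
    replace Q`i' = 1 if `s' == "strongly disagree"
    replace Q`i' = 2 if `s' == "disagree"
    * Known misspelling handled alongside the correct spelling
    replace Q`i' = 3 if inlist(`s', "neutral", "neautral")
    replace Q`i' = 4 if `s' == "agree"
    replace Q`i' = 5 if `s' == "strongly agree"
    label values Q`i' agree
}
```

A macro such as `continuous' holding q8-q78 cannot be used inside an if condition, because it expands to a variable list rather than to a single variable. Indexing both the old and the new variable by the same `i' avoids that.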




      • #4
        Did you try the code I suggested? That should make the capitalization differences all go away; there may be other issues, but they are not obvious in your example data.



        • #5
          Originally posted by Rich Goldstein
          Did you try the code I suggested? That should make the capitalization differences all go away; there may be other issues, but they are not obvious in your example data.
          Yes, thank you. That fixes the capitalization. Now I'm dealing with the misspellings.
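
One possible way to deal with the remaining misspellings, once capitalization is fixed, is to flag every value that is neither blank nor one of the five valid answers, inspect what turns up, and then map each variant to its correct spelling. The sketch below assumes toy data, a hypothetical bad_ prefix for the flag variables, and "Neautral" (the only misspelling named in the thread) as the variant to repair.

```stata
* Toy data, capitalization already normalized (hypothetical values)
clear
input str17(q8 q9)
"Neautral" "Agree"
"Agree"    "Neutral"
""         "Strongly Agree"
end

foreach var of varlist q8-q9 {
    * 1 if the value is neither blank nor a valid answer, 0 otherwise
    generate byte bad_`var' = !inlist(`var', "", "Strongly Disagree", "Disagree", "Neutral", "Agree", "Strongly Agree")
    * Display the distinct offending spellings, if any
    levelsof `var' if bad_`var'
    * Map the known variant to its correct spelling
    replace `var' = "Neutral" if `var' == "Neautral"
}
```

Any other spelling that levelsof reports would need its own replace line before the variables are encoded.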
