
  • Using Loops to Convert Strings With Letters to Numeric Values

    Hello Stata Users :-)

    I have a dataset of 71 variables containing responses to items on a satisfaction survey.

    All of the variables are measured on a 5-point Likert scale. Most of the variables are stored as numbers, but many are stored as strings.

    Missing values for variables stored as numbers show up as "."

    Missing values for variables stored as strings show up as "X"

    Question: How can I write a looping command to change "X" values contained in string variables to "." values contained in numeric variables?

    The code below gives a "type mismatch" error:

    Code:
    foreach var of varlist q1-q71 {
        replace `var' = "." if `var'=="X"
    }
    My hunch is that I must use an "if" statement to separate out string from numeric variables, but I have little experience writing loops within "if" statements. Any help would be very much appreciated.

    A sample of the data is below.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1(q15 q16 q17 q18 q19)
    "X" "4" "5" "4" "X"
    "1" "5" "5" "2" "1"
    "5" "5" "5" "5" "5"
    "5" "5" "5" "5" "5"
    "2" "4" "4" "4" "4"
    "4" "4" "3" "2" "3"
    "5" "2" "3" "1" "3"
    "4" "5" "4" "5" "5"
    "5" "5" "5" "4" "5"
    "X" "2" "3" "2" "X"
    end
    Thanks so much,
    Adam

  • #2
    Is your goal to convert all the variables to numeric variables? You could use destring for that. You'll still have to deal with the fact that some of your variables are numeric and some are string (as indicated in the text of your question but not in the example data you provide).

    This technique should work. Note that I generate a new variable so you can verify that the code does as you wish, but you could use the replace option on destring once you're satisfied you'll get what you want.

    Code:
    ds q15-q19, has(type string)
    foreach var in `r(varlist)' {
        destring `var', ignore("X") gen(`var'_n)
    }
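    A minimal sketch of that in-place variant, assuming the string variables hold only digits and "X" (destring accepts a varlist, so no loop is needed here):
    Code:
    * Convert every string variable in q15-q19 in place; a lone "X" becomes missing
    ds q15-q19, has(type string)
    destring `r(varlist)', replace ignore("X")
    One caveat: ignore("X") strips the character X wherever it occurs, so a value such as "X4" would become 4 rather than missing.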
    For future questions, you'll find that it's easier to get useful advice when your example data closely matches your actual data. In this case a simple loop will work on your example data, since all of its variables are strings. My impression is that it would not work on your real data because of the sloppy mix of variable types in the original data.
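    For a mix of string and numeric variables, one possible sketch is to check each variable's type inside the loop. It assumes the real variables are named q1-q71 and stored in that order, as in the original post, and that "X" is the only non-numeric value:
    Code:
    foreach var of varlist q1-q71 {
        * confirm sets a nonzero return code for numeric variables; capture suppresses the error
        capture confirm string variable `var'
        if _rc == 0 {
            * blank out only exact "X" values, then convert in place
            replace `var' = "" if `var' == "X"
            destring `var', replace
        }
    }
    Numeric variables are skipped, so their "." missing values are left untouched.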


    • #3
      Thanks for sharing the "ds" command. Yes, it was a sloppy mix of types that caused the problem, as clearly outlined in my post. I found the solution below, too, for those of you who are interested in more detail.

      https://stats.idre.ucla.edu/stata/fa...ericvariables/
