  • Using destring to remove special characters

    Hi all,

    I'm trying to remove special characters from two string variables. One contains only "-"; the other contains both "-" and "+".
    I successfully removed "-" from the first variable. When I try to remove "-" and "+" together, Stata reports 'option "+ not allowed'. Here is my code: destring var, gen(new) ignore("-", "+"). How can I fix this? Thank you!

  • #2
    Replace
    Code:
    destring var, gen(new) ignore("-", "+")
    with
    Code:
    destring var, gen(new) ignore("-" "+")
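    removing the comma between the two strings to ignore. Inside ignore(), a comma introduces suboptions, which is why Stata tried to read "+" as an option. See the output of help destring for further details on the syntax of the ignore() option of the destring command.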

    • #3
      Hi William,

      I tried the code. Now the error is "var contains characters not specified in ignore(); no generate". I don't see any other special characters. How can I find out what the characters are? Thank you!

      • #4
        Code:
        tab whatever if missing(real(whatever))
        will show you values that can't be converted to numeric. See also chartab (SSC).
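
        A complementary sketch, assuming Stata 14 or later (for ustrregexra()) and a hypothetical new variable named leftover: strip the digits and the decimal point from each value and tabulate what remains, so the characters that block conversion stand out.

        ```stata
        * Sketch only: "whatever" stands in for your string variable, as above.
        * "leftover" is a hypothetical name for a scratch variable.
        * ustrregexra() requires Stata 14 or later.

        * Remove digits and the decimal point; anything left is non-numeric.
        generate leftover = ustrregexra(whatever, "[0-9.]", "")

        * Show the distinct non-numeric remainders and how often each occurs.
        tabulate leftover if leftover != ""

        * Clean up the scratch variable when done.
        drop leftover
        ```

        Characters that display as blanks (for example, unusual whitespace) are hard to tell apart in this table; chartab is likely the better tool for those.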

        • #5
          Building on Nick's advice, I'm concerned about ignoring "-". Consider the following example.
          Code:
          . destring x, ignore("-" "+") generate(y)
          x: characters - + removed; y generated as byte
          (2 missing values generated)
          
          . list, noobs
          
            +--------+
            |  x   y |
            |--------|
            |  1   1 |
            | -2   2 |
            | +3   3 |
            |  -   . |
            |  +   . |
            +--------+
          
          . generate xx = x
          
          . replace xx = "" if trim(xx)=="-"
          (1 real change made)
          
          . destring xx, ignore("+") generate(z)
          xx: character + removed; z generated as byte
          (2 missing values generated)
          
          . list, noobs
          
            +------------------+
            |  x   y   xx    z |
            |------------------|
            |  1   1    1    1 |
            | -2   2   -2   -2 |
            | +3   3   +3    3 |
            |  -   .         . |
            |  +   .    +    . |
            +------------------+
          We see that by ignoring "-" the negative 2 was destringed as a positive 2.

          Certainly Nick's advice is the place to start. Systematically review your data. Be careful about ignoring non-numeric characters that are in fact part of numbers.
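
          One way to run such a check before deciding what to ignore, as a rough sketch using the variable x from the example above: separate the values in which "-" is already part of a valid number from those in which it is not.

          ```stata
          * Sketch only: x is the string variable from the example above.

          * Values that contain "-" and already convert cleanly:
          * here "-" is acting as a minus sign and should not be ignored.
          count if strpos(x, "-") > 0 & !missing(real(x))

          * Values that contain "-" but do not convert:
          * these need a case-by-case decision rather than a blanket ignore().
          list x if strpos(x, "-") > 0 & missing(real(x)), noobs
          ```

          If the first count is nonzero, ignoring "-" would silently change the sign of those values.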
