Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Destringing multiple variables gives back "contains nonnumeric characters, no replace"

    Hello,

    I require some help with destringing my variables. I have added a screenshot of the problem. [Screenshot](http://i.imgur.com/DOcgE5B.jpg)

    The variables contain numbers but the missing values are not "." as in normal numeric variables. How can i get this to work?

    They should be able to interact with each other. So in very plain words, why are the variables in red actually in red, and not black like the rest?



    Thanks in advance!

  • #2
    Hi Chiel,
    It is difficult to say why all variables cannot be transformed into numeric format, but if you look at the first value in the variable, "sanctiontype", it is clearly not numeric. Instead of a plain 27, it contains two numbers "2" and "7" ("2 7").
    If you still want to transform all the variables into numeric, you could use destring [varlist], replace force. But you risk losing information like the "2 7"
    HTH
    Fernando

    Comment


    • #3
      http://imgur.com/KPfLGWm

      Extra clarifcation

      Comment


      • #4
        There are various things you can do. Suppose strvar is a variable that you think is surely numeric in content. Then

        Code:
        tab strvar if missing(real(strvar))
        shows the values that destring won't change without being forced. It might just be that they exist in as few as one observation.

        Another utility is charlist (SSC) which lists the characters found somewhere in your string variable.

        By the way, screenshots are often difficult to read. We much prefer listings of data, e.g. the results of the command above.

        Comment


        • #5
          This is what Stata returns to me:

          . tab threatendday if missing (real(threatendday))

          missing not found
          r(111);

          A visual inspection of the variable in the data editor shows that all observations are numeric, and a single number - not clustered numbers in one observation.

          Comment


          • #6
            there should be no space between the word "missing" and the following open parenthesis

            Comment


            • #7
              . tab threatendday if missing(real(threatendday))

              sancimposit |
              ionstartday | Freq. Percent Cum.
              ------------+-----------------------------------
              ? | 1 100.00 100.00
              ------------+-----------------------------------
              Total | 1 100.00


              Thank you Rich, this is what I get back from Stata.

              Comment


              • #8
                So, as said earlier, a single value with the character "?" is sufficient to inhibit destring. At a guess, you should just replace it with "." or "" to indicate missing.

                Comment


                • #9
                  or, use the "force" option to -destring-

                  Comment


                  • #10
                    Thanks a lot for the help!

                    Comment


                    • #11
                      This is a bit of a desperation move, only appropriate if you plan on using the data categorically (e.g. order doesn't matter). You could use the -encode- command. This would convert them to numeric, with labels, regardless of whether they were strings or not. Problem is, is that it assigns the values alphabetically, so the numeric values generally wouldn't match the original values that were numeric to begin with. So inadvisable if you have many values and/or most of those values are numeric.

                      Best is to use -tab- or -list- to identify non-numeric values, then use the -replace- command to clean things up (replace sactiontype="27" if sanctiontype="2 7"), then run -destring-. Also the most tedious.

                      The "force" option for -destring- can be useful (and very efficient!) if it's only a few cases and/or there are no logical numeric values you could possibly convert the string values into.

                      Comment


                      • #12
                        Ben includes a careful warning here "if you plan on using the data categorically" and I just want to underline that such a condition is essential.

                        I've seen people get in frightful messes when they have a string variable with (almost all) numeric content, but encode in desperation. The results can be complete garbage. I will be dogmatic and say that it's almost always a very bad idea.

                        Consider for example that strings "1" "12" "13000" "141" could happily be encoded to 1, 2, 3, 4, 5 as that is their alphanumeric order.

                        As the original author of destring I really do want to underline that there is very good reason why there are two completely different commands. Despite a superficial similarity, the problems they solve are quite different. destring is designed to make it difficult for you to ignore details that might be important in your data. We (mostly StataCorp) put a lot of work into making that difficult.

                        Comment


                        • #13
                          Ah. I had a look at your first screenshot (had originally only looked at the second). Yup, forget about -encode-. They're close enough to numeric that -destring-, potentially combined with a little clean-up and the "force" option is infinitely preferable. Sorry for making matters more confusing.

                          Comment


                          • #14
                            If any variable(string)'s value are written as "4V1903496". Is it possible to make the variable numeric?
                            If not what should I do to make it numeric?

                            Comment


                            • #15
                              What does that example mean? Why you want to make it numeric? Please give more context.

                              Comment

                              Working...
                              X