
  • destring, ignore

    I have been trying for hours, and searching for help, to get destring with the ignore() option to work. I have a variable, "ph_pta", that contains pH values (these should be formatted like "7.40").

    Some of the observations contain "<" or other stray characters. I have tried typing the command directly and also using the menu under Data for converting a string variable to a numeric variable. I have manually found 6 or 7 characters that need to be removed before I can run destring and convert this variable to numbers.
    I've tried ignore("<" "-" "`"), which is how I've seen it is supposed to be formatted?
    I've also tried with commas in between each of those.
    The error I most often get is "too few quotes". When I use the menu, it also adds its own quotes and `' each time.

    Please assist. I can give you more info too if needed.

  • #2
    You are trying to remove ` characters. The problem is that when writing Stata code, ` has a special meaning: it opens the reference to a local macro. So when Stata sees that in your -ignore()- option, it expects that it will be followed by the name of a local macro, and then by a closing apostrophe (') character. Since there is no closing apostrophe to balance the `, you are getting that error message. The simplest way to get around this is to first remove the ` characters separately with:
    Code:
    replace varname = subinstr(varname, "`", "", .)
    The -subinstr()- function, for some reason, seems to handle this the way you expect. After that, do your -destring- without any mention of the ` character.
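
    If you would rather not type a bare ` in your code at all, a minimal sketch of the same fix uses char(96), which is the ASCII code for that character. It assumes the variable is the string ph_pta from #1 and that < and - are among the characters to drop. Note that -ignore()- takes all of the characters together inside one pair of quotes:

    ```stata
    * Remove the left-quote character (ASCII 96) without typing it literally
    replace ph_pta = subinstr(ph_pta, char(96), "", .)
    * Then convert, listing the remaining unwanted characters in a single string
    * ("<-" is an assumption here; put your own full list of characters in it)
    destring ph_pta, replace ignore("<-")
    ```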

    Please note that it is the norm in this community that we use our real first and last names as our username, to promote collegiality and professionalism. Unfortunately, the Forum software does not permit you to edit your username once your account is set up. But you can click on Contact Us in the lower right hand corner of this page and send a message to the system administrator requesting the change. Please do so as soon as possible. Thank you.




    • #3
      Thank you! I will try it, and I will change the username.



      • #4
        I tried your suggestion and it worked! Makes sense.
        Interestingly, I had observations with commas in them that should be decimals, and changing them over to decimals was part of my eventual cleaning process. I assumed that I could use the dpcomma option after citing the characters for destring, replace ignore() and that it would change the commas into decimals.
        However, it gave me the error that characters were present in the observations that were not accounted for in ignore(), indicating I had not specified a certain character still. I separately removed the commas in addition to "`" using subinstr() and then ran destring, replace ignore() and it worked.
        Any insight as to why I could not run both options at once?
        Thanks again.



        • #5
          I don't know why you experienced that. I experimented with some toy data that had some comma decimal points and some other non-numeric characters, and -destring, dpcomma ignore()- with the appropriate non-numeric characters specified in -ignore()- worked the way one would expect. Are you sure there wasn't some other non-numeric character that you had forgotten to include in the -ignore()- option?
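
          One way to check for a leftover character, sketched here on the assumption that the variable is still the string ph_pta from #1, is to list the values that -real()- cannot convert. Values that use a comma as the decimal separator will show up as well, since -real()- expects a period, so whatever remains apart from those is what -ignore()- would need to cover:

          ```stata
          * Assumes ph_pta is still a string variable (name taken from #1)
          * real() returns missing when the string is not a plain number
          list ph_pta if missing(real(ph_pta)) & !missing(ph_pta)
          ```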



          • #6
            You might try charlist (from SSC). It is easy to have "ghost" characters buried in strings (ASCII 160 is a prime offender).

            Code:
            * charlist is user-written; install it once from SSC
            ssc install charlist
            charlist varname
            return list
            display r(ascii)
            __________________________________________________ __
            Assistant Professor, Department of Biostatistics and Epidemiology
            School of Public Health and Health Sciences
            University of Massachusetts- Amherst
