  • Variable destring, forcing. Removing commas, dots and spaces

    Hi,

    I have two variables, "commodity" and "msci" where I have 4905 observations per variable.

    In my previous post I got help to destring my variable "comprice" (which is now my variable called "commodity").

    I ran -destring commodity, replace ignore(" ") dpcomma- and then ran: -destring commodity, replace ignore(" ") dpcomma force-

    This worked pretty well, except for around 150 observations, where I had to remove spaces so that forcing the destring would not change those 150 observations to "(.)" for my "commodity" variable.

    The problem occurs when I am trying to do the same for my "msci" variable.
    When I list the variable with -list msci-, the numbers are all presented like, for instance:
    2,367.27
    (unlike -list commodity-, which gives numbers like, for instance:
    10709,89)


    I have tried to run -destring msci, replace ignore(" ") dpcomma-

    which gives me the result:
    msci: contains characters not specified in ignore(); no replace

    and thereafter run: -destring msci, replace ignore(" ") dpcomma force-

    which gives me the result:
    msci: contains nonnumeric characters not specified in ignore(); replaced as byte
    (4905 missing values generated)

    Then all my observations for the variable msci look like:
    (.)

    4905 observations are too many to manually enter the "right" characters for Stata to accept.

    Both variables were exported from Excel.

    So, how can I solve this problem? Could anyone please help me with the full command, if possible?

    Regards
    Guest
    Last edited by sladmin; 16 Nov 2020, 05:35. Reason: anonymize original poster

  • #2
    Provide a reproducible example as recommended in FAQ Advice #12. On the face of it, it appears that settling on a consistent format, e.g., period-decimal, should work.

    Code:
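    * Example data: index has comma thousands separators, commodity has decimal commas
    clear
    input str15(index commodity)
    "2,367.27" "2367,27"
    "1,111.13" "1111,13"
    end

    * Strip the thousands separators; the period is already the decimal mark
    destring index, ignore(",") replace
    * Treat the comma as the decimal mark
    destring commodity, dpcomma replace
    l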
    Res.:

    Code:
    . l
    
         +-------------------+
         |   index commodity |
         |-------------------|
      1. | 2367.27   2367.27 |
      2. | 1111.13   1111.13 |
         +-------------------+
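
    Applied to the variables named in the question, the same idea might look like the sketch below. It assumes msci is still a string of the form 2,367.27 (comma as thousands separator, period as decimal mark) and commodity is still a string of the form 10709,89, possibly with stray spaces. The helper variable msci_check is hypothetical and only serves to inspect problem values. The earlier attempt with dpcomma on msci most likely failed because that option treats the comma as the decimal mark, which leaves the period as a nonnumeric character. If msci has already been replaced by missing values through force, the original data would need to be reloaded first.

    ```stata
    * Assumption: msci and commodity are both still string variables
    * Dry run: strip commas and spaces, then list values that still fail to convert
    generate double msci_check = real(subinstr(subinstr(msci, ",", "", .), " ", "", .))
    list msci if missing(msci_check) & !missing(msci)
    drop msci_check

    * Convert in place, without force
    destring msci, ignore(", ") replace
    destring commodity, ignore(" ") dpcomma replace
    ```

    Listing the failures first avoids relying on force, which silently turns anything it cannot parse into missing.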



    • #3
      Thank you very much for your answer, Andrew.

      The first four lines of the first code block you pasted create new variables, which I do not want. Maybe I explained myself badly.

      It seems to have worked for me using only:

      destring commodity, replace ignore(" ") dpcomma

      and

      destring index, ignore(",") replace
      destring index, ignore(",") replace force

      Guest
      Last edited by sladmin; 16 Nov 2020, 05:36. Reason: anonymize original poster



      • #4
        I don't follow all of this. As Andrew Musau explains, we need to see a data example, which is still lacking. If one destring works, then next time round the variable in question is numeric, and a subsequent destring on the same variable will change nothing.
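
        A minimal sketch of both points, assuming a toy string variable named index as in #2: the first destring converts it, describe then shows a numeric storage type, and repeating destring, with or without force, has nothing left to convert. The commented last line shows how a data example from the real variables could be produced, assuming dataex is installed (on older versions it can be obtained with -ssc install dataex-) and that msci and commodity exist in the dataset in memory.

        ```stata
        clear
        input str15 index
        "2,367.27"
        "1,111.13"
        end

        destring index, ignore(",") replace
        describe index
        * index is already numeric here, so this call leaves it unchanged
        destring index, ignore(",") replace force

        * Assumption: run this on the real dataset, before any destring
        * dataex msci commodity in 1/10
        ```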
