Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • help with reading numbers in different format (number, range, range of negative number in brackets,...)

    Hello, STATALIST users,

    I have this Stata data file read from an excel file.
    CIK RE
    1018724 650000000
    1065088 -20000000
    712515 600,000,000 - 800,000,000
    92380 (1,000,000,000) - (1,200,000,000)
    In the RE column, some of the numbers are in a range, and some are in a range of negative numbers (brackets indicating negative numbers).

    I want to turn them into numbers, and in case of a range, get the mean of the range.

    I would appreciate any help you can offer.

    Thanks so much!

    Jessica

  • #2
    Stata will have have at least 3 problems reading an ASCII file like that with -insheet- or -import delimited-.

    1) commas delineating 1,000s. The commas will be seen as delimiting values.
    2) Parenthesis denoting negatives. Stata doesn't know about that convention.
    3) In the last row what is the role of (1,000,000,000)? It looks like you have 3 values there and one surplus minus sign.

    You may need to do some hand editing.

    Comment


    • #3
      Code:
      destring CIK, replace ignore(",")
      replace RE = subinstr(RE,",","",.)
      replace RE = ustrregexra(RE,"\((\d*?)\)","-$1")
      gen RE2 = real(RE)
      replace RE2 = (real(ustrregexs(1))+real(ustrregexs(2)))/2 if ustrregexm(RE,"(-?\d*?) - (-?\d*?)$") & RE2==.

      Comment


      • #4
        Thank you so much, Ali Atia.

        It works like magic!!

        Comment

        Working...
        X