
  • Destring error: r(134) too many values

    Hi, I am trying to destring an ID variable whose values contain a mixture of numbers and letters; see below.

    ID
    0000366CAX878
    00000HG78888
    8787781HG0101

    As it's a mixture of numbers and letters, I am using the -destring- command with the force option:

    Code:
    destring ID, gen(ID_N) force

    However, this generated a couple of thousand missing values out of my 1 million records.

    The same number of missing values was generated with this code:
    gen ID_N = real(ID)

    How do I go about this problem?
    ____

    Of course, I didn't use -encode-: with 1 million records it reports 'too many values', and it seemed like the wrong command anyway, since the ID includes letters.

  • #2
    -destring- should only be used when the variable is (basically) numeric; see
    Code:
    h encode
    if you really need a numeric version. However, for virtually all purposes within Stata, the ID can remain a string variable, so why do you want it to be numeric?
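
    A minimal sketch of where the missing values come from, assuming ID is the string variable from #1: real() returns missing for any string it cannot read as a number, which is also what -destring- with the force option produces for such values. The cutoff of 20 observations is only an illustrative choice.

    Code:
    * Count the IDs that real() cannot read as a number
    count if missing(real(ID))
    * Inspect the offending IDs among the first 20 observations
    list ID if missing(real(ID)) in 1/20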



    • #3
      Martin:
      you may want to try:
      Code:
      . egen wanted=group( ID )
      
      . list
      
           +------------------------+
           |            ID   wanted |
           |------------------------|
        1. | 0000366CAX878        2 |
        2. |  00000HG78888        1 |
        3. | 8787781HG0101        3 |
           +------------------------+
      Kind regards,
      Carlo
      (Stata 19.0)



      • #4
        If you need a numeric identifier, then #3 is the only strategy if destring and encode will not deliver the goods. It is equivalent to

        Code:
        sort ID
        gen wanted = sum(ID != ID[_n-1])
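
        The equivalence can be checked directly. This is only a sketch: wanted_egen and wanted_sum are hypothetical names chosen to avoid clashing with wanted, and it assumes no ID is empty, because egen's group() leaves missing values out by default while the running sum would count them as 0.

        Code:
        * Group identifier from egen, stored as long to be safe with many groups
        egen long wanted_egen = group(ID)
        * Running count of changes in ID after sorting
        sort ID
        gen long wanted_sum = sum(ID != ID[_n-1])
        * Stops with an error if the two identifiers ever differ
        assert wanted_egen == wanted_sum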
        Last edited by Nick Cox; 20 Jul 2022, 10:51.
