  • How to delete the first two letters from a string variable?

    Hello Mr Mrs Help

    I have a variable called bvid. It takes one of two forms: abxxxxxxxx or xxxxxxxxxx (where each x represents a digit).

    I would like to delete the first two letters when bvid has the form abxxxxxxx, and to leave intact those values of bvid that already have the form xxxxxx. The two letters are not necessarily "ab"; they can be other letters.

    Thanks

  • #2
    A schematic example helps, but a concrete example would be even better. Consider this token:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str6 foo
    "ab1234"
    "xy5678"
    "1234"
    "5678"
    end
    
    gen wanted = cond(real(foo) < ., real(foo), real(substr(foo, 3, .)))
    
    list
    
         +-----------------+
         |    foo   wanted |
         |-----------------|
      1. | ab1234     1234 |
      2. | xy5678     5678 |
      3. |   1234     1234 |
      4. |   5678     5678 |
         +-----------------+
    That was a lot of code in one line. Let's break it down. If the input consists entirely of numeric characters, then pushing it through real() will yield a number that isn't missing. Otherwise, real() returns missing, so we need to start at position 3 and push that substring through real() instead.

    Code:
    help string functions
    is where you should start reading. Then

    Code:
    help cond()
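
    A caution, offered as a sketch rather than as part of the answer above: the identifiers in the original question run to ten digits, and gen without a storage type creates a float, which cannot hold every integer that long exactly. Assuming the foo data above, one could either ask for a double or keep the result as a string, which also preserves any leading zeros. The names wanted_num and wanted_str are made up for illustration.

    Code:
    * numeric result stored as double, so long identifiers are held exactly
    gen double wanted_num = cond(real(foo) < ., real(foo), real(substr(foo, 3, .)))
    format wanted_num %12.0f

    * string result: drop the first two characters only when foo is not entirely numeric
    gen wanted_str = cond(real(foo) < ., foo, substr(foo, 3, .))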
    Last edited by Nick Cox; 23 Jan 2019, 12:58.



    • #3
      You could also do it with strkeep (ssc install strkeep), as long as what you want is to remove all letters and keep just the digits.

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str10 bvid
      "AB45924402"
      "XY46680007"
      "655917365"
      "581737958"
      "867536939"
      "633816792"
      "CD3539474"
      "FZ8207157"
      "464776975"
      "836536420"
      "GT2990886"
      "339758327"
      "804455743"
      "289410895"
      end
      
      strkeep bvid, gen(new_id) numeric  // removes all letters and keeps only the numbers
      
      list, noobs
      
        +------------------------+
        |       bvid      new_id |
        |------------------------|
        | AB45924402    45924402 |
        | XY46680007    46680007 |
        |  655917365   655917365 |
        |  581737958   581737958 |
        |  867536939   867536939 |
        |------------------------|
        |  633816792   633816792 |
        |  CD3539474     3539474 |
        |  FZ8207157     8207157 |
        |  464776975   464776975 |
        |  836536420   836536420 |
        |------------------------|
        |  GT2990886     2990886 |
        |  339758327   339758327 |
        |  804455743   804455743 |
        |  289410895   289410895 |
        +------------------------+
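
      If installing a community-contributed command is not an option, a similar result may be possible with built-in functions. This is only a sketch, assuming Stata 14 or later (where ustrregexra() and ustrregexrf() are available) and the bvid data above; new_id2 and new_id3 are hypothetical names.

      Code:
      * remove every character that is not a digit, then convert to a number
      gen double new_id2 = real(ustrregexra(bvid, "[^0-9]", ""))
      format new_id2 %12.0f

      * or strip only a leading pair of letters, keeping the result as a string
      gen new_id3 = ustrregexrf(bvid, "^[A-Za-z][A-Za-z]", "")

      The first line behaves like the strkeep call in that it drops letters wherever they occur; the second touches only the first two characters, which is closer to the original question.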



      • #4
        Thanks Nick and David. Though I didn't post the question I have learnt a lot.



        • #5
          Thanks, Nick and David, I appreciate your help.
