
  • Splitting the variable into two parts

    Hello everyone,
    I hope you are doing well.
    I have a string variable containing letters and numbers. I want to attach value labels to its observations so that, for example, "AF01" is labeled as teacher, and so on. When I run the label values command, it fails with an "invalid syntax" error.
    So I tried another way: splitting the variable into two parts, one with the letters (for example, AF) and the other with the numbers (for example, 01), so that I could destring the numeric part and attach value labels to it. However, when I run the split command, it does not split the variable into two parts. Could anyone please help me solve this problem? Thank you in advance.
    Here is an example created with dataex.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str4 Pcode
    "AF17"
    "AF17"
    "AF17"
    "AF15"
    "AF17"
    "AF17"
    "AF17"
    "AF17"
    "AF17"
    "AF17"
    "AF15"
    "AF15"
    "AF17"
    "AF01"
    "AF01"
    "AF17"
    "AF17"
    "AF18"
    "AF18"
    "AF15"
    "AF17"
    "AF17"
    "AF17"
    "AF17"
    "AF17"
    "AF17"
    "AF18"
    "AF18"
    "AF18"
    "AF18"
    "AF17"
    "AF17"
    "AF17"
    "AF18"
    "AF18"
    "AF15"
    "AF17"
    "AF17"
    "AF17"
    "AF17"
    "AF17"
    "AF17"
    end

  • #2
    For your data example,
    Code:
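    * first two characters: the letter prefix
    gen letters = substr(Pcode, 1, 2)
    
    * characters from position 3 onward, converted to numeric
    gen numbers = real(substr(Pcode, 3, .))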
    will work.
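
    If the goal is then to attach value labels, note that value labels can only be attached to numeric variables, which is one likely reason why label values did not work on the string variable Pcode. A minimal sketch, assuming the numeric variable numbers created above; the label name pcode_lbl is hypothetical, and only the pairing of code 01 with "teacher" comes from the question:
    Code:
    * define a value label; only 1 = "teacher" is known from the question
    label define pcode_lbl 1 "teacher"
    
    * further codes can be added later with the add option, for example
    * label define pcode_lbl 15 "...", add
    
    * attach the label to the numeric variable and check the result
    label values numbers pcode_lbl
    tabulate numbers
    Codes without a defined label (here 15, 17 and 18) are still displayed as plain numbers.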

    split is an official command, but as its previous author I can comment on its design. It was and is intended to split strings in which punctuation (in a wide sense) delimits the substrings.

    Splitting strings without punctuation is a different problem, usually calling for spelling out where substrings begin and end or for regular expression (regex) syntax.

    moss from SSC is one wrapper for regex functions, and you may naturally use regex functions directly.



    Code:
    clear
    input str5 Pcode
    "AF17"
    "JB007"
    end 
    
    moss Pcode,  match("([0-9]+)") regex 
    
    list 
    
         +----------------------------------+
         | Pcode   _count   _match1   _pos1 |
         |----------------------------------|
      1. |  AF17        1        17       3 |
      2. | JB007        1       007       3 |
         +----------------------------------+
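
    Using regex functions directly, as mentioned above, might look like the following sketch for the same two-observation example. The variable names letters2 and numbers2 are made up for illustration; regexm() and regexs() are official Stata functions, and the patterns assume that each code is a run of letters followed by a run of digits.
    Code:
    * leading letters: regexs(1) returns the first parenthesized match
    gen letters2 = regexs(1) if regexm(Pcode, "^([A-Za-z]+)")
    
    * first run of digits, converted from string to numeric
    gen numbers2 = real(regexs(1)) if regexm(Pcode, "([0-9]+)")
    
    list Pcode letters2 numbers2
    Unlike substr() with fixed positions, this does not depend on the letter prefix always being two characters long. Note that real() drops leading zeros, so a value such as "007" becomes 7.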



    • #3
      For your underlying problem, consider also https://www.stata.com/support/faqs/d...s-for-subsets/



      • #4
        Thank you, Nick Cox, for your response. The first command worked well and solved my problem.
