
  • How to split a variable with unequal length into multiple variables

    I have a variable named var_0 with responses ranging from missing to maximum length 3. It has no delimiters. I would like to split it into 3 variables (var_1, var_2 and var_3) as shown in the attachment. How would I efficiently go about it?
    Luke Korir

  • #2
    Some of us can't read spreadsheet attachments and many of us won't read them. The reasons why not are explained in section 12.5 of the FAQ at https://www.statalist.org/forums/help#stata.

    Please post a readable data example as explained at the link above.



    • #3
      Here is the CODE:

      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str3 var_0 float(var_2 var_3 var_4)
      "012" 0 1 2
      "16" 1 6 .
      "4" 4 . .
      "" . . .
      "14" 1 4 .
      "01" 0 1 .
      "014" 0 1 4
      "024" 0 2 4
      "06" 0 6 .
      "012" 0 1 2
      "01" 0 1 .
      "0" 0 . .
      "014" 0 1 4
      "016" 0 1 6
      "6" 6 . .
      "03" 0 3 .
      "34" 3 4 .
      "013" 0 1 3
      "014" 0 1 4
      "136" 1 3 6
      "014" 0 1 4
      end



      • #4
        I don't know whether this is the most efficient way of doing it, but I think it does the job:

        Code:
        . gen myvar1 = substr(var_0,1,1)
        (1 missing value generated)
        
        . gen myvar2 = substr(var_0,2,1)
        (4 missing values generated)
        
        . gen myvar3 = substr(var_0,3,1)
        (11 missing values generated)
        
        . destring myvar1 myvar2 myvar3, replace
        myvar1: all characters numeric; replaced as byte
        (1 missing value generated)
        myvar2: all characters numeric; replaced as byte
        (4 missing values generated)
        myvar3: all characters numeric; replaced as byte
        (11 missing values generated)
        
        . assert var_2== myvar1
        
        . assert var_3 == myvar2
        
        . assert var_4 == myvar3



        • #5
          Thank you Joro, Nick. That worked.



          • #6
            The approach in #4 can be generalised to


            Code:
            local length = real(substr("`: type var_0'", 4, .))
            
            forval j = 1/`length' {
                gen wanted`j' = real(substr(var_0, `j', 1))
            }

            If you know the length of the string variable, then you can cut straight to something like

            Code:
            forval j = 1/3 {
                gen wanted`j' = real(substr(var_0, `j', 1))
            }
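
            A possible variation, not taken from the posts above: the first loop reads the length from the storage type (str3 here), which may be wider than the longest value actually present, and it would not work for a strL variable. As a minimal sketch, assuming the variable is still called var_0, the loop bound could instead be taken from the data with strlen(). The names len and maxlen are illustrative only, and this is meant as an alternative to the loops above rather than something to run after them, since it creates the same wanted* variables.

            Code:
            * record the length of each value, then take the largest one
            gen len = strlen(var_0)
            summarize len, meanonly
            local maxlen = r(max)
            drop len
            
            * one numeric variable per character position
            forval j = 1/`maxlen' {
                gen wanted`j' = real(substr(var_0, `j', 1))
            }

            For the data example in #3 this should give the same three variables as the fixed 1/3 loop, because the longest value there has three characters.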


