
  • How to split string variables, some separated by commas and some without commas or spaces

    Dear all,

    I would like to split string variables. For instance, if the original variable has an observation such as "p2.1, p3.1 and p4.1", I would like to split it so that three variables are generated,
    with the values 2.1, 3.1 and 4.1, and these should be numeric.

    Please find below a sample of the dataset I am working on.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str20(m10_10_4_livefences m10_10_4_terracing m10_10_4_mintillage m10_10_4_farmmanure)
    ""     "P2.1,P3.1,P4.1" ""     ""
    ""     "P2.1"           ""     ""
    ""     "P2.1"           ""     ""
    ""     "P2.1,P3.1"      ""     ""
    ""     "P2.1,P3.1"      ""     ""
    ""     "P1.2,P2.1"      "P3.1" ""
    ""     "P2.1,P3.1"      "P4.1" ""
    ""     "P3.1,P4.1"      ""     ""
    ""     "P2.1,P3.1,P4.1" ""     ""
    ""     ""               ""     ""
    "P1.1" ""               ""     ""
    end

    Thanks in advance.

  • #2
    Short answer:
    Code:
    split m10_10_4_terracing, p(",")
    However, this looks as if the data has simply not been imported correctly. If that's the case, it would be useful to show a snippet of the original data and the command you used to import it.
    Further: if you run the code above, it will create a set of new variables; Stata cannot decide for you whether some of these new values should actually belong under any of the existing variables (or which).
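
    To illustrate what -split- produces, here is a minimal sketch on a small made-up dataset. The variable names terracing, clean and the stub part are hypothetical, and the step that turns " and " into a comma is an assumption about how the mixed separators mentioned in the question might be handled, not something confirmed for the real data.

    ```stata
    * Small made-up example; "terracing" is a hypothetical variable name
    clear
    input str20 terracing
    "P2.1,P3.1,P4.1"
    "P2.1"
    "P2.1, P3.1 and P4.1"
    ""
    end

    * Turn " and " into a comma, then drop remaining spaces,
    * so that a single parse string is enough
    gen str20 clean = subinstr(terracing, " and ", ",", .)
    replace clean = subinstr(clean, " ", "", .)

    * Split on commas: creates part1, part2, ... (as many as the longest list needs)
    split clean, parse(",") generate(part)

    list terracing part*
    ```

    Observations with fewer elements simply get empty strings in the later part variables.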


    • #3
      Thanks a lot, Jorrit. The data was imported from CS-Pro. After splitting, I get a variable with an element like p2.1. Is it possible to split it further, i.e. to generate two variables, one with p and another with 2.1?


      • #4
        What variations are there? Is it always only a single letter before the numeric part?

        If it is always just one letter, then, following the split as in the example above, do:
        Code:
        gen letter1 = substr(m10_10_4_terracing1, 1, 1)
        gen number1 = subinstr(m10_10_4_terracing1, letter1, "", .)
        The first line creates a variable containing the first character of the original string variable; the second line removes every occurrence of that character from the original string variable. Note that number1 is still a string variable.
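
        Since the original question asked for numeric values, here is a minimal sketch of one way to get them, assuming the prefix is always exactly one character. The names prefix1 and value1 and the three example observations are made up for illustration.

        ```stata
        * Made-up stand-in for a variable created by -split-
        clear
        input str10 m10_10_4_terracing1
        "P2.1"
        "P3.1"
        ""
        end

        * First character is the letter prefix
        gen prefix1 = substr(m10_10_4_terracing1, 1, 1)

        * Everything from the second character on, converted to a number;
        * real() returns missing (.) when the text is not a valid number.
        * Stored as double so that 2.1 is not rounded to float precision.
        gen double value1 = real(substr(m10_10_4_terracing1, 2, .))

        list
        ```

        Empty strings end up as missing values in value1, so observations without an entry need no special handling.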


        • #5
          Thanks a lot, Jorrit. It worked perfectly.


          • #6
            Thank you, Jorrit. It worked for me as well after more than 3 years.

            Manish
