  • Adding string values separated by comma as new variables

    Hi all,

    As the title suggests, I would like to turn string values that are stored together in one string, separated by commas, into new observations.
    So, for instance if I have the following:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input obs str114 atc3_ephmra
    1 "A10H, S1X"
    2 "A10X, C1D, J2A, L4X, D5B"
    3 "B10X, F1A, J2C, S4B, K5D"
    end
    I would like to end up with:
    Code:
    input obs str114 atc3_ephmra
    1 A10H
    2 S1X
    3 A10X
    4 C1D
    5 J2A
    6 L4X
    7 D5B
    8 B10X
    9 F1A
    10 J2C
    11 S4B
    12 K5D


    I guess the first step is to split at the comma if stripos(atc3_ephmra,","), but I am then unable to go on.
    Any help is appreciated.

    Thank you all.

  • #2
    Thanks for the data example.


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input obs str114 atc3_ephmra
    1 "A10H, S1X"
    2 "A10X, C1D, J2A, L4X, D5B"
    3 "B10X, F1A, J2C, S4B, K5D"
    end 
    
    local y atc3_ephmra 
    split `y', parse(,)
    drop `y'
    
    reshape long `y', i(obs) j(which)
    drop if missing(`y')
    
    list
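
    The desired result in #1 numbers the rows 1 to 12 and shows no blanks around the codes. A minimal sketch of that tidy-up, assuming the pieces produced by split may keep the blank that follows each comma (strtrim() is harmless if they do not) and that overwriting obs with a running row number is acceptable:

    ```stata
    * Sketch only: same example data as in #1
    clear
    input obs str114 atc3_ephmra
    1 "A10H, S1X"
    2 "A10X, C1D, J2A, L4X, D5B"
    3 "B10X, F1A, J2C, S4B, K5D"
    end

    * Split at commas, then stack the pieces into one variable
    split atc3_ephmra, parse(,)
    drop atc3_ephmra
    reshape long atc3_ephmra, i(obs) j(which)
    drop if missing(atc3_ephmra)

    * Assumption: pieces may carry a leading blank from the ", " separator
    replace atc3_ephmra = strtrim(atc3_ephmra)

    * Renumber the rows in their original left-to-right order
    sort obs which
    replace obs = _n
    drop which

    list
    ```

    strtrim() needs Stata 14 or later; on older versions trim() does the same job.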



    • #3
      Nick Cox, thank you very much for the reply. It works perfectly. Just one question: does it also work if there are repeated observations in the data?
      E.g.
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input obs str114 atc3_ephmra
      1 "A10H, S1X"
      2 "A10X, C1D, J2A, L4X, D5B"
      3 "A10X, C1D, J2A, L4X, D5B"
      4 "A10X, C1D, J2A, L4X, D5B"
      5 "B10X, F1A, J2C, S4B, K5D"
      6 "B10X, F1A, J2C, S4B, K5D"
      7 "B10X, F1A, J2C, S4B, K5D"
      end
      Theoretically it should, since it is a reshape long that uses the variables created by splitting atc3_ephmra as its stub, right?



      • #4
        It is easy enough to experiment and find out, but yes, repeated values are not a problem there. Or, in principle: nothing in the code drops duplicates (except duplicate missings). Sometimes you want that, but it's an extra step.
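
        For anyone who does want that extra step, here is a minimal sketch using the data from #3. It assumes that "duplicate" means the same code appearing anywhere in the dataset, regardless of obs, and it trims the pieces first so that blanks cannot hide a match:

        ```stata
        * Sketch only: example data from #3, with repeated strings
        clear
        input obs str114 atc3_ephmra
        1 "A10H, S1X"
        2 "A10X, C1D, J2A, L4X, D5B"
        3 "A10X, C1D, J2A, L4X, D5B"
        4 "A10X, C1D, J2A, L4X, D5B"
        5 "B10X, F1A, J2C, S4B, K5D"
        6 "B10X, F1A, J2C, S4B, K5D"
        7 "B10X, F1A, J2C, S4B, K5D"
        end

        * Same split-and-reshape as in #2
        split atc3_ephmra, parse(,)
        drop atc3_ephmra
        reshape long atc3_ephmra, i(obs) j(which)
        drop if missing(atc3_ephmra)
        replace atc3_ephmra = strtrim(atc3_ephmra)

        * Extra step: keep only the first occurrence of each code
        * (force is required because obs and which differ across copies)
        duplicates drop atc3_ephmra, force

        list
        ```

        Running duplicates report atc3_ephmra before the drop shows how many copies of each code exist, which is a cheap check that dropping is really what you want.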



        • #5
          Thank you very much for the help!
