  • Separating string variables with asterisk into two words

    Hi,
    I have a string variable x1 and am trying to create x2 and x3 as shown in the dataex example below.
    I used

    Code:
    split x1, generate(x2) parse(*)
    It created two variables, but the second one was empty.


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str7 x1 float x2 str3 x3
    "1.58***" 1.58 "***"
    ".48"      .48 ""   
    ".82***"   .82 "***"
    ".51*"     .51 "**" 
    ".87***"   .87 "***"
    "-.47*"   -.47 "*"  
    ".45***"   .45 "***"
    ".14**"    .14 "**" 
    ".17"      .17 ""   
    "1.11"    1.11 ""   
    end
    Kindly help.

  • #2
    Maybe better via an intermediary variable. See below, beginning at the "Begin here" comment.

    . version 16.1

    . clear *

    . quietly input str7 x1 float x2 str3 x3

    . drop x2 x3

    . *
    . * Begin here
    . *
    . generate str x1a = subinstr(x1, "*", "/*", 1)

    . split x1a, generate(x2) parse(/) destring
    variables born as string: 
    x21  x22
    x21: all characters numeric; replaced as double
    x22: contains nonnumeric characters; no replace

    . list x1 x2?, noobs separator(0)

      +----------------------+
      |      x1    x21   x22 |
      |----------------------|
      | 1.58***   1.58   *** |
      |     .48    .48       |
      |  .82***    .82   *** |
      |    .51*    .51     * |
      |  .87***    .87   *** |
      |   -.47*   -.47     * |
      |  .45***    .45   *** |
      |   .14**    .14    ** |
      |     .17    .17       |
      |    1.11   1.11       |
      +----------------------+

    . exit

    end of do-file



    • #3
      Using regular expressions:

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str7 x1 float x2 str3 x3
      "1.58***" 1.58 "***"
      ".48"      .48 ""  
      ".82***"   .82 "***"
      ".51*"     .51 "**"
      ".87***"   .87 "***"
      "-.47*"   -.47 "*"  
      ".45***"   .45 "***"
      ".14**"    .14 "**"
      ".17"      .17 ""  
      "1.11"    1.11 ""  
      end
      gen wanted1= real(ustrregexra(x1, "\*", "", .))
      gen wanted2= ustrregexra(x1, "[0-9\.-]", "", .)
      Result:

      Code:
      . l, sep(11)
      
           +------------------------------------------+
           |      x1     x2    x3   wanted1   wanted2 |
           |------------------------------------------|
        1. | 1.58***   1.58   ***      1.58       *** |
        2. |     .48    .48             .48           |
        3. |  .82***    .82   ***       .82       *** |
        4. |    .51*    .51    **       .51         * |
        5. |  .87***    .87   ***       .87       *** |
        6. |   -.47*   -.47     *      -.47         * |
        7. |  .45***    .45   ***       .45       *** |
        8. |   .14**    .14    **       .14        ** |
        9. |     .17    .17             .17           |
       10. |    1.11   1.11            1.11           |
           +------------------------------------------+
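
      The two patterns above work by deleting characters: the first removes the asterisks, the second removes digits, periods, and minus signs. A possible variant, offered only as a sketch and not part of the original answer, anchors one pattern and picks up both pieces with capture groups. The names wanted1b and wanted2b are made up for illustration, and the pattern assumes every value of x1 is an optional minus sign, a number, and then zero or more asterisks.

      ```stata
      * Group 1 is the number; group 2 is the trailing asterisks (possibly none).
      * wanted1b and wanted2b are hypothetical names used only in this sketch.
      gen double wanted1b = real(ustrregexs(1)) if ustrregexm(x1, "^(-?[0-9]*\.?[0-9]+)(\**)$")
      gen wanted2b = ustrregexs(2) if ustrregexm(x1, "^(-?[0-9]*\.?[0-9]+)(\**)$")
      ```

      Values of x1 that do not fit the assumed layout are left missing rather than partially cleaned, which makes unexpected input easier to spot.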



      • #4
        Also, using strpos() and substr():

        Code:
        gen wanted1 = cond(strpos(x1, "*"), substr(x1, 1, strpos(x1, "*") - 1), x1)
        gen wanted2 = substr(x1, strpos(x1, "*"), .)
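
        Note that wanted1 as created above is still a string variable. A minimal sketch of getting a numeric copy, assuming the variable names from the code above (wanted1_num is a name made up here for illustration):

        ```stata
        * wanted1_num is a hypothetical name; real() returns missing
        * for any text that cannot be read as a number.
        generate double wanted1_num = real(wanted1)
        ```

        If the string version is no longer needed, destring wanted1, replace would be an alternative.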



        • #5
          Thank you, Nick Cox.
          Thank you, Andrew Musau.
          Thank you, Joseph Coveney.
