
  • How can I extract one particular string from several strings?

    Hi all,

    I have some addresses separated by ";" in the following format:

    “Univ London Imperial Coll Sci Technol & Med, Fujitsu Parallel Comp Res Ctr, London SW7 2BZ, England; Coventry Univ, Sch Math & Informat Sci, Coventry CV1 5FB, W Midlands, England; City Univ Hong Kong, Dept Mfg Engn & Engn, Kowloon, Peoples R China”

    How can I extract only the address in Peoples R China, that is, "City Univ Hong Kong, Dept Mfg Engn & Engn, Kowloon, Peoples R China"?

    Of course, the address containing "Peoples R China" could appear first, in the middle, or last in the list.

    Many thanks!



  • #2
    Code:
    cls
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str230 address
    "Univ London Imperial Coll, Fujitsu Parallel Comp Res Ctr, London SW7 2BZ, England; Coventry Univ, Sch Math & Informat Sci, Coventry CV1 5FB, W Midlands, England; City Univ Hong Kong, Dept Mfg Engn & Engn, Kowloon, Peoples R China"
    end
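    * tag each record so the pieces can be traced back after reshaping
    generate id = _n
    * break the address list at each ";" into addr1, addr2, ...
    split address, parse(";") generate(addr)
    drop address
    list
    * one observation per address piece
    reshape long addr, i(id) j(num)
    * keep only the pieces that mention the country of interest
    keep if strpos(addr,"Peoples R China")>0
    list, clean noobs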
    Code:
    . generate id = _n
    
    . split address, parse(";") generate(addr)
    variables created as string:
    addr1  addr2  addr3
    
    . drop address
    
    . list
    
         +----------------------------------------------------------------------------------------+
      1. | id |                                                                             addr1 |
         |  1 | Univ London Imperial Coll, Fujitsu Parallel Comp Res Ctr, London SW7 2BZ, England |
         |----------------------------------------------------------------------------------------|
         |                                                                              addr2     |
         |      Coventry Univ, Sch Math & Informat Sci, Coventry CV1 5FB, W Midlands, England     |
         |----------------------------------------------------------------------------------------|
         |                                                                         addr3          |
         |           City Univ Hong Kong, Dept Mfg Engn & Engn, Kowloon, Peoples R China          |
         +----------------------------------------------------------------------------------------+
    
    . reshape long addr, i(id) j(num)
    (j = 1 2 3)
    
    Data                               Wide   ->   Long
    -----------------------------------------------------------------------------
    Number of observations                1   ->   3          
    Number of variables                   4   ->   3          
    j variable (3 values)                     ->   num
    xij variables:
                          addr1 addr2 addr3   ->   addr
    -----------------------------------------------------------------------------
    
    . keep if strpos(addr,"Peoples R China")>0
    (2 observations deleted)
    
    . list, clean noobs
    
        id   num                                                                   addr  
         1     3    City Univ Hong Kong, Dept Mfg Engn & Engn, Kowloon, Peoples R China  
    
    .
    In my copy of your example data, I shortened one of the addresses slightly so that the dataex command could deal with the string. It has no effect on the code.
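
    If you would rather keep one observation per record instead of reshaping, the same idea can be sketched with a loop over the pieces that split creates. This is only an illustration, not part of the solution above: the names part and china_addr are made up for the example, strtrim() assumes Stata 14 or later (trim() should serve in older versions), and if a record holds more than one "Peoples R China" address, only the first is kept.

    ```stata
    clear
    input str230 address
    "Univ London Imperial Coll, Fujitsu Parallel Comp Res Ctr, London SW7 2BZ, England; Coventry Univ, Sch Math & Informat Sci, Coventry CV1 5FB, W Midlands, England; City Univ Hong Kong, Dept Mfg Engn & Engn, Kowloon, Peoples R China"
    end

    * split at ";" and remember the names of the new variables
    * (a wildcard such as addr* would also match the original variable address)
    split address, parse(";") generate(part)
    local pieces `r(varlist)'

    * starts as an empty string; replace widens the storage type as needed
    generate china_addr = ""

    foreach v of local pieces {
        * take the first piece that mentions the country;
        * strtrim() removes the blank left behind after each ";"
        replace china_addr = strtrim(`v') if china_addr == "" & strpos(`v', "Peoples R China") > 0
    }

    drop `pieces'
    list china_addr, clean noobs
    ```

    Records with no matching address simply keep an empty china_addr, so the original number of observations is preserved.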



    • #3
      Many thanks for your help! It works well!

