  • Parsing a string variable to extract a particular word

    Hello everyone,

    I need to detect the word INC, CORPORATION, or INCORPORATED in the string variable shown below and code the result as a new variable for each observation. The new variable inc should take the value 1 if the variable cname contains any of those words, and 0 otherwise.

    I have been very well advised on a somewhat similar issue on Statalist before, so I am asking for help again, since this type of problem is beyond my reach. I would really appreciate any suggestions!


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input strL cname
    "ANTHONY G. FUSSNER"                 
    "BRIAN L. PETREQUIN, ESQ."           
    "BRIAN SIRITZKY"                     
    "MELISSA W. ACOSTA"     
    "FULWIDER PATTON LLP"                
    "MARGARET POLSON" 
    "AVENTIS PHARMACEUTICALS, INC." 
    "CLARIANT CORPORATION" 
    "HENKEL CORPORATION" 
    "VOLPE AND KOENIG, P.C."                 
    "SUSAN J. LILYQUIST PARALEGAL" 
    "TEXAS INSTRUMENTS INCORPORATED"
    "3M INNOVATIVE PROPERTIES CO."           
    "HEWLETT-PACKARD CO."                    
    "THOMSON MULTIMEDIA LICENSING, INC."  
    "PATENT DOCUMENTATION CENTER/XEROX CORP."
    "ANOVA LAW GROUP, PLLC"                   
    "ASIA VITAL COMPONENTS CO., LTD" 
    "DOW CORNING CORPORATION" 
    "KIDS II, INC."
    "JAMES M. STOVER TERADATA CORPORATION"
    "NXP USA, INC. LAW DEPARTMENT" 
    end

  • #2
    Code:
    gen byte inc = ustrregexm(cname,"\b(INC\.|CORPORATION|INCORPORATED)")
    so you get:
    Code:
      +-----------------------------------------------+
      |                                   cname   inc |
      |-----------------------------------------------|
      |                      ANTHONY G. FUSSNER     0 |
      |                BRIAN L. PETREQUIN, ESQ.     0 |
      |                          BRIAN SIRITZKY     0 |
      |                       MELISSA W. ACOSTA     0 |
      |                     FULWIDER PATTON LLP     0 |
      |                         MARGARET POLSON     0 |
      |           AVENTIS PHARMACEUTICALS, INC.     1 |
      |                    CLARIANT CORPORATION     1 |
      |                      HENKEL CORPORATION     1 |
      |                  VOLPE AND KOENIG, P.C.     0 |
      |            SUSAN J. LILYQUIST PARALEGAL     0 |
      |          TEXAS INSTRUMENTS INCORPORATED     1 |
      |            3M INNOVATIVE PROPERTIES CO.     0 |
      |                     HEWLETT-PACKARD CO.     0 |
      |      THOMSON MULTIMEDIA LICENSING, INC.     1 |
      | PATENT DOCUMENTATION CENTER/XEROX CORP.     0 |
      |                   ANOVA LAW GROUP, PLLC     0 |
      |          ASIA VITAL COMPONENTS CO., LTD     0 |
      |                 DOW CORNING CORPORATION     1 |
      |                           KIDS II, INC.     1 |
      |    JAMES M. STOVER TERADATA CORPORATION     1 |
      |            NXP USA, INC. LAW DEPARTMENT     1 |
      +-----------------------------------------------+
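One caveat on the pattern in #2: it requires a period after INC and has no word boundary at the end, so a name ending in a bare "INC" would not be flagged, while a longer word such as "CORPORATIONS" would be. The sketch below is one possible variant, not part of the original answer. It assumes Stata 14 or later (for ustrregexm) and uses three made-up names ("ACME INC", "INCOME PARTNERS LLP", "ACME CORPORATIONS") purely for illustration.

```stata
* Hypothetical test names, not taken from the original data
clear
input str40 cname
"AVENTIS PHARMACEUTICALS, INC."
"ACME INC"
"INCOME PARTNERS LLP"
"ACME CORPORATIONS"
"TEXAS INSTRUMENTS INCORPORATED"
end

* Pattern from #2: period required after INC, no trailing boundary
gen byte inc = ustrregexm(cname, "\b(INC\.|CORPORATION|INCORPORATED)")

* Variant: word boundary on both sides, so the period becomes optional
* and only whole words are matched
gen byte inc2 = ustrregexm(cname, "\b(INC|CORPORATION|INCORPORATED)\b")

list, noobs
```

With these names, "ACME INC" is flagged only by inc2 and "ACME CORPORATIONS" only by inc, while "INCOME PARTNERS LLP" is flagged by neither. Which behaviour is wanted depends on how the names in the real data are written.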


    • #3
      Thanks a ton, Mr. Kumar. It worked smoothly. Obliged to have this gracious help with my coding!
