  • Generate dummy based on part of the value of a string variable

    Dear reader,

    I am working with a dataset of executives and their titles. It contains two string variables, title and titleann. Some example values of each variable:

    title
    group vp-structures & systems, maintenance, repair and overhaul
    chairman, president & CEO
    vp, general counsel & secretary
    vp & chief finance officer
    group vp-aviation supply chain
    group vp-structures & systems, maintenance, repair and overhaul
    president & CEO
    president & chief operating officer
    executive vp & chief finance officer
    senior vp; president-Integrated Solutions Group
    chairman, president & CEO
    vp-worldwide operations
    titleann
    group vp-structures & systems, maintenance, repair and overhaul
    Chairman, Chief Executive Officer and Chairman of Executive Committee
    vp, general counsel & secretary
    President, Chief Operating Officer of Expeditionary Services and Director
    Vice President of Commercial Development
    group vp-structures & systems, maintenance, repair and overhaul
    president & CEO
    president & chief operating officer
    Chairman, Chief Executive Officer and President
    senior vp; president-Integrated Solutions Group
    chairman, president & CEO
    vp-worldwide operations
    I want to generate a dummy variable equal to 1 (or "CFO") when the text includes "chief financial officer", "principal financial officer", or "chief finance officer". For example, under the variable title, "executive vp & chief finance officer" is one observation. The term I am searching for may be only part of the string, rather than the entire value of that cell.

    Does anyone know how to generate such a dummy from a string variable containing more than only the item I am searching for?


    Kind regards,

    Ilse

  • #2
    See help strpos. E.g.
    Code:
    gen isCFO = 1 if strpos(title, "chief financial officer") > 0 | strpos(...) ...
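
    As a quick check of what strpos() returns, here is a small sketch using two of the example titles from the question. strpos(s1, s2) gives the position at which s2 first appears in s1, or 0 if it does not appear, which is why the test above is "> 0".

    ```stata
    * "chief finance officer" starts at character 16 of the first string,
    * so this displays 16
    display strpos("executive vp & chief finance officer", "chief finance officer")

    * the phrase does not occur in the second string, so this displays 0
    display strpos("president & CEO", "chief finance officer")
    ```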


    • #3
      Jesse Wursten's advice will do what Ilse de Groof asked for, quite literally. But it may not be what she really wants. The code in #2 generates a variable coded 1 for true and missing for false. That kind of variable is difficult to work with in Stata because, in any boolean expression, Stata treats every nonzero value as true, and that includes missing values. Stata is really set up to use boolean variables coded 0 for false and 1 for true. To do that, the code would be:

      Code:
       gen isCFO = strpos(title, "chief financial officer") > 0 | strpos(...) ...
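
      For a complete version of that line, here is a minimal sketch on a toy dataset, spelling out the three phrases named in #1. Two things are my own assumptions rather than part of the advice above: the toy titles (the third one is made up to show mixed case), and the helper variable ltitle, which holds a lowercased copy of title because strpos() is case sensitive. The /// line continuations require running this from a do-file.

      ```stata
      clear
      input str60 title
      "executive vp & chief finance officer"
      "president & CEO"
      "Vice President and Chief Financial Officer"
      "vp-worldwide operations"
      end

      * ltitle is a hypothetical helper: a lowercased copy, so that
      * "Chief Financial Officer" matches as well
      generate ltitle = lower(title)

      * 1 if any of the three phrases occurs somewhere in the title, else 0
      generate byte isCFO = strpos(ltitle, "chief financial officer") > 0 | ///
          strpos(ltitle, "principal financial officer") > 0 | ///
          strpos(ltitle, "chief finance officer") > 0

      list title isCFO
      ```

      In this toy data isCFO is 1 for the first and third observations and 0 for the other two.
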
      Sometimes we actually want a trichotomy: 0/1/missing value where missing value denotes "can't be evaluated as true or false," for example, in an observation where the variable title is itself missing. So that would look like:

      Code:
       gen isCFO = strpos(title, "chief financial officer") > 0 | strpos(...) ...    if !missing(title)
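
      To see why the coding matters, here is a small sketch on toy data in which the third title is deliberately blanked out to stand in for a missing title (an assumption for illustration only). It builds the 0/1/missing version and then compares two ways of counting CFOs.

      ```stata
      clear
      input str60 title
      "executive vp & chief finance officer"
      "president & CEO"
      "vp-worldwide operations"
      end

      * simulate an observation whose title is missing
      replace title = "" in 3

      * 0/1 where title is present, missing (.) where title is missing
      generate byte isCFO = strpos(title, "chief financial officer") > 0 | ///
          strpos(title, "principal financial officer") > 0 | ///
          strpos(title, "chief finance officer") > 0 if !missing(title)

      * missing is nonzero and so counts as true: this counts 2 observations
      count if isCFO

      * an explicit comparison excludes missing: this counts 1 observation
      count if isCFO == 1
      ```

      With a variable that can be missing, conditions such as "if isCFO == 1" are the safer habit.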
