  • Generate New Variable that contains Character

    Hi,

    I want to generate a variable that flags the facility names containing any of "AAH", "AHC", "AMG", or "AURORA". My code thus far is: gen facility=1 if strpos(_FACILITYNAME, "AAH" "AHC" "AMG" "Aurora" "AURORA").

    . tab _FACILITYNAME

    FACILITYNAME | Freq. Percent Cum.
    ----------------------------------------+-----------------------------------
    16th St. Community Center-4512 | 1 0.27 0.27
    16th St. Community Clinic (2)-1005 | 5 1.35 1.62
    16th St. Community Health Cntr-Pkwy-1.. | 4 1.08 2.70
    16th St.Community Health Ctr-Chavez-4.. | 3 0.81 3.51
    AAH BROOKFIELD BLUEMOUND | 2 0.54 4.05
    AAH GOOD HOPE ROAD CLINIC | 1 0.27 4.32
    AAH MAYFAIR ROAD CLINIC | 1 0.27 4.59
    AAH MILWAUKEE WEST CLINIC | 2 0.54 5.14
    AAH NEW BERLIN CLINIC | 1 0.27 5.41
    AAH WOMENS CARE FRANKLIN | 1 0.27 5.68
    ACL Aurora MC Grafton I/F | 1 0.27 5.95
    ACL Central/West Allis Mh | 1 0.27 6.22
    ACL Laboratories | 34 9.19 15.41
    ACL/AMG WEST ALLIS CENTRL LAB-AMG | 1 0.27 15.68
    AHC/AAH NEW BERLIN PSC-AAH | 2 0.54 16.22
    AHC/AAH WA PSC - AAH | 2 0.54 16.76
    AHC/AMG DE PERE | 1 0.27 17.03
    AHC/AMG EDGERTON | 9 2.43 19.46
    AHC/AMG EDGERTON HEALTH CTR | 9 2.43 21.89
    AHC/AMG OSHKOSH-WESTHAVEN | 1 0.27 22.16
    AHC/AMG RACINE EAST | 1 0.27 22.43
    AHC/AMG US BANK | 1 0.27 22.70
    AHC/AMG WEST ALLIS FIREHOUSE SQ | 1 0.27 22.97
    AHC/AMG WILKINSON MED CLN SUMMIT | 1 0.27 23.24
    AHC/AUW WALKERS POINT | 1 0.27 23.51
    AHC/AUW WOMENS HEALTH CENTER | 2 0.54 24.05
    AURORA ADVANCED HEALTHCARE FWC | 1 0.27 24.32
    AURORA ADVANCED HEALTHCARE NB | 1 0.27 24.59
    AURORA ADVANCED HEALTHCARE RD | 1 0.27 24.86
    AURORA ADVANCED HEALTHCARE WCC | 1 0.27 25.14
    AURORA HEALTHCARE | 7 1.89 27.03
    AURORA MEDICAL CENTER GRAFTON | 2 0.54 27.57
    AURORA MEDICAL GROUP | 1 0.27 27.84


    Above, this is the data set I'm working with.

    . tab facility

    facility | Freq. Percent Cum.
    ------------+-----------------------------------
    1 | 12 100.00 100.00
    ------------+-----------------------------------
    Total | 12 100.00


    I only get 12 values from that code, when there are clearly more names containing the characters mentioned above. What options can I include to capture all the values with those characters, or am I just using the wrong code?

    Using Stata 14.1

    Edit: Sorry, the tables above are formatted in a way that makes them hard to read.

  • #2
    This is a little hard to follow but I think you want something like this:


    Code:
    gen facility = 0 
    
    quietly foreach code in  AAH  AHC AMG Aurora AURORA { 
         replace facility = 1 if strpos(FACILITYNAME, "`code'") 
    }
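A minimal sketch to check the loop above on toy data, assuming a string variable named FACILITYNAME. The six rows are copied from the table in #1 purely for illustration; they are not the full data set.

```stata
* Toy data: a handful of facility names copied from the table in #1
clear
input str40 FACILITYNAME
"16th St. Community Center-4512"
"AAH NEW BERLIN CLINIC"
"ACL Aurora MC Grafton I/F"
"ACL Laboratories"
"AHC/AMG EDGERTON"
"AURORA HEALTHCARE"
end

* Start every observation at 0, then flip to 1 when any one substring is found
gen facility = 0
foreach code in AAH AHC AMG Aurora AURORA {
    * strpos() returns the position of the match, or 0 if there is none
    replace facility = 1 if strpos(FACILITYNAME, "`code'")
}

list FACILITYNAME facility, noobs
tab facility
```

On these six rows, the two names that contain none of the substrings ("16th St. Community Center-4512" and "ACL Laboratories") stay at 0 and the other four become 1.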



    • #3
      There are a couple of problems. First, you can't cram all those different strings into the second argument of -strpos()-. They have to be tested one at a time. Second, it is usually better to create a 0/1 variable rather than a missing/1 variable. So try this:

      Code:
      gen facility = 0
      foreach x in AAH AHC AMG Aurora AURORA {
          replace facility = 1 if strpos(FACILITYNAME, "`x'")
      }
      As for the poor formatting of the table, you should read FAQ #12 which includes, among other good advice on how to post well, instructions on the best way to post Stata output, which is between code delimiters (just as I have done with my Stata code above.)

      Added: Nick beat me to the punch again! Same solution.
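Two variations on the loop in this thread, offered as a sketch rather than as what either poster recommended. The first folds case with upper() so that "Aurora" and "AURORA" need only one pattern; the second tests all substrings in a single regexm() call using alternation. The variable names facility_uc and facility_re are made up for this example, and the four rows are copied from the table in #1.

```stata
* Toy data: four facility names copied from the table in #1
clear
input str40 FACILITYNAME
"ACL Aurora MC Grafton I/F"
"ACL Laboratories"
"AHC/AUW WALKERS POINT"
"AURORA MEDICAL GROUP"
end

* Variant 1: fold case once so "Aurora" and "AURORA" share one pattern
gen byte facility_uc = 0
foreach x in AAH AHC AMG AURORA {
    replace facility_uc = 1 if strpos(upper(FACILITYNAME), "`x'")
}

* Variant 2: one regular expression with alternation; regexm() returns 1 or 0
gen byte facility_re = regexm(FACILITYNAME, "(AAH|AHC|AMG|Aurora|AURORA)")

* The two variants should agree on these rows
assert facility_uc == facility_re
list, noobs
```

Note that folding case is broader than the original loop: it would also flag a name containing, say, lowercase "amg" inside a longer word, so it is worth tabulating the matched names before relying on it.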



      • #4
        Clyde has the better explanation this time, with not just code but a story of why you should write it that way.



        • #5
          Thank you for the suggestions.
