  • regexm

    Hi Statalist,

    I want to select patients with "IAM" in the studydescription_all variable, which is a text (string) variable.
    I'm wondering why "CT Trauma C/A/P & T/L Spine C+" was selected.

    Regards,


    Code:
    gen Head_IAM=.
    replace Head_IAM=1 if (regexm(studydescription_all, ("[i/I][a/A][m/M]")))
    Code:
    studydescription_all                 Head_IAM
    CT Trauma C/A/P & T/L Spine C+       1
    CT TempBone/IAM(+/-Brain) - C        1
    CT IAM                               1

  • #2
    You don't need -regexm()- to do this. Simpler is:

    Code:
    gen Head_IAM = (strpos(lower(studydescription_all), "iam") > 0)
    Also, it is poor practice, in Stata, to create indicator variables as 1/missing. That will land you in trouble sooner or later. Stata's logical functions are designed to work with 1/0 dichotomies. The code I have shown will create a 1/0 variable--these are easier and safer to use in Stata.

    Added: Almost forgot to answer your question about why your code was producing unintended results. The notation [i/I] is not correct. Inside square brackets, the / is not a separator; it is interpreted as an actual character to match, so Stata is matching the / characters in the values of studydescription_all you show.
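
    A minimal sketch of the corrected pattern, assuming the goal is a case-insensitive match on the letters "iam" and that Head_IAM does not already exist in the dataset: list the alternatives inside the brackets with no separator.

    Code:
    * inside [ ] every character is a literal alternative, so write [iI], not [i/I]
    gen byte Head_IAM = regexm(studydescription_all, "[iI][aA][mM]")

    Because -regexm()- returns 1 or 0, this also yields a 1/0 indicator rather than 1/missing.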
    Last edited by Clyde Schechter; 26 Mar 2019, 19:04.


    • #3
      "[i/I]" means "match i, /, or I", meaning "/" is matched and your regex matches "/A/".
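
      This can be checked directly with -display-, using the first description from post #1. This is only a sketch; the second call uses the brackets without the slashes for comparison.

      Code:
      * returns 1: "/A/" satisfies [i/I][a/A][m/M]
      display regexm("CT Trauma C/A/P & T/L Spine C+", "[i/I][a/A][m/M]")
      * returns 0: no "iam" in any letter case
      display regexm("CT Trauma C/A/P & T/L Spine C+", "[iI][aA][mM]")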

      I also agree with the post above that there is no reason to use regexes here, and that coding "false" as missing in a true/false variable is a bad idea, because Stata treats "." as nonzero and so evaluates it as "true".
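
      As an illustration of that last point, here is a small sketch with a hypothetical three-observation variable x (not from the original data; note that -clear- drops whatever is in memory):

      Code:
      clear
      input x
      1
      .
      0
      end
      * missing is nonzero, so it passes a bare -if-: counts 2 observations
      count if x
      * an explicit comparison counts only the observation equal to 1
      count if x == 1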
