  • Possible bug with regex functions

    I believe there may be a bug in Stata's regex processor. In particular, the search pattern "\$" does not match a literal dollar sign; instead, it is treated as an end-of-string anchor. For example, the command below returns the following:

    Code:
    disp ustrregexra("\$ asdf", "\$", "dog")
    
    $ asdfdog
    It should instead return:

    Code:
    dog asdf

  • #2
    Code:
    disp ustrregexra("$ asdf", "[$]", "dog")
    
    dog asdf
    Code:
    . disp ustrregexra("$ asdf", "\\$", "dog")
    dog asdf
    Last edited by Bjarte Aagnes; 23 Dec 2022, 10:38.


    • #3
      Originally posted by Bjarte Aagnes
      Code:
      disp ustrregexra("$ asdf", "[$]", "dog")
      
      dog asdf
      Code:
      . disp ustrregexra("$ asdf", "\\$", "dog")
      dog asdf
      Thanks for the help. Is Stata's regex support modeled on a specific implementation (e.g. PCRE, PCRE2, Java)? I'm not familiar with any other implementation that requires two backslashes to match a literal dollar sign.


      • #4
        Part of an explanation perhaps starts from another place: the role of $ for Stata originally was to indicate a global macro name, and later adding some support for regular expressions had to respect that.
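
        A minimal sketch of that interaction, assuming Stata's documented rule that a backslash placed directly before a dollar sign suppresses global macro expansion and is consumed by the parser before any function sees the string. The patterns are the ones already posted in #1 and #2; nothing here is specific to a particular Stata release as far as I know.

        ```stata
        * Step 1: inspect what the parser hands on after macro processing.
        * The string shown here is what ustrregexra() would receive as its pattern.
        display "\$"

        * Step 2: the same pattern used as a regex, as in #1.
        * If only a bare dollar sign survives step 1, the engine reads it as an anchor.
        display ustrregexra("\$ asdf", "\$", "dog")

        * Step 3: the two workarounds from #2.
        * A character class needs no backslash at all.
        display ustrregexra("$ asdf", "[$]", "dog")
        * A doubled backslash leaves one backslash for the regex engine.
        display ustrregexra("$ asdf", "\\$", "dog")
        ```

        Per #1, step 2 returns "$ asdfdog", and per #2 both lines in step 3 return "dog asdf". The character class is arguably the more readable choice, since it does not depend on how many backslashes survive parsing.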


        • #5
          Originally posted by Nick Cox
          Part of an explanation perhaps starts from another place: the role of $ for Stata originally was to indicate a global macro name, and later adding some support for regular expressions had to respect that.
          That makes sense. I wasn't aware that Stata's regex functions also supported macros.

