  • Condition based on part of a string

    Hi,
    I want to execute some commands only on those observations that contain a specific phrase as part of their value in a string variable. What should be the condition?
    P.S. the phrase isn't in English (Hebrew letters).
    Thanks ahead,
    Ben

  • #2
    Hi Ben,

    The condition is up to you to come up with. One way you could implement this would be along the lines of:

    Code:
    gen test = 1 if strpos("string_to_look_for",var_you_are_looking_for_string)>0
    What this does is it applies whatever you are doing (in this case, generating test=1) only to those observations in which the value for variable var_you_are_looking_for_string contains the string string_to_look_for. I'm not sure how Stata deals with Hebrew characters though.

    Cheers
    Last edited by Igor Paploski; 15 Jul 2019, 12:41.


    • #3
      I imagine that Hebrew characters are stored as Unicode, so the condition proposed in #2 would use the -ustrpos()- function instead of -strpos()-.
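
      A minimal sketch of the -ustrpos()- approach, assuming Stata 14 or later (where Unicode string functions are available). The variable name city, the sample values, and the new variable has_phrase are hypothetical, made up for illustration only:

      ```stata
      * Hypothetical data: a string variable holding Hebrew and English values
      clear
      input str40 city
      "תל אביב"
      "Haifa"
      "רמת אביב"
      end

      * ustrpos(s1, s2) returns the character position of s2 within s1, or 0 if absent.
      * The variable to search comes first, the phrase to look for second.
      gen byte has_phrase = ustrpos(city, "אביב") > 0

      * Run a command only on the observations that contain the phrase
      list city if has_phrase
      ```

      Storing the result as a 0/1 indicator is a design choice; the same expression can also go directly into an if qualifier.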


      • #4
        Thank you, Igor and Clyde.

        Actually, neither of them works: each only gets me the observations with an empty value in the specified variable (regardless of which variable I specify and which language it holds, as I got the same results with English strings).

        Do you have any idea what could be the cause?


        • #5
          Hey Ben, help strpos shows that the order of the arguments in the suggestion I posted was inverted: the string to search in comes first, and the string to look for comes second.

          Instead of:
          Code:
          gen test = 1 if strpos("string_to_look_for",var_you_are_looking_for_string)>0
          It should be:
          Code:
          gen test = 1 if strpos(var_you_are_looking_for_string, "string_to_look_for")>0
          This way, if you run the following:
          Code:
          sysuse auto.dta
          gen buick = 1 if strpos(make,"Buick")>0
          You will create a variable that is 1 for all observations whose value of make contains the word "Buick", and missing otherwise (Stata is case-sensitive, so "buick" would not be detected).
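
          Building on the example above, here is a short sketch of two variations that may be useful: a 0/1 indicator instead of 1/missing, and a case-insensitive match. The variable names buick01 and buick_any are hypothetical, and lowercasing both sides is just one possible way to ignore case:

          ```stata
          sysuse auto, clear

          * 0/1 indicator: the logical expression itself evaluates to 1 or 0
          gen byte buick01 = strpos(make, "Buick") > 0

          * Case-insensitive variant: lowercase the variable and search for a lowercase phrase
          gen byte buick_any = strpos(lower(make), "buick") > 0

          * The same condition can go straight into an if qualifier,
          * so a command runs only on the matching observations
          list make price if strpos(make, "Buick") > 0
          ```

          For non-English text, the Unicode counterparts (for example -ustrpos()- and -ustrlower()-) would presumably be the ones to use.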
