
  • Matching Specific Strings Using Strpos

    Hi,

    I'm attempting to create a check system for some data I'm collecting to make sure it flags certain errors. The questionnaire consists of a bunch of yes/no (1/0) questions (hc002s1-hc002s11), and then a sort of stem variable (hc002), which is a string listing the numbers of the questions a respondent answered "yes" to, separated by hyphens.

    I'm hoping to loop through every variable and confirm that that specific question number appears in the stem variable if a respondent answered "yes". The problem I'm running into is that since double digit numbers also contain single digit numbers within them, the code will flag errors when there aren't any. For instance, if someone answered yes to the 10th question (hc002s10), but not the first (hc002s1), since 10 consists of the digits 1 & 0, the code flags it as an error since it sees the "1" within 10.
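
    A minimal illustration of the false match, using a literal string in place of hc002 (the value is just an example taken from the data shown later in this post):

    ```stata
    * strpos() returns the position of the first match, or 0 if there is none.
    * Here the "1" inside "10" is matched, so the result is nonzero (position 9).
    display strpos("2-4-3-8-10", "1")
    * No "7" anywhere in the string, so the result is 0.
    display strpos("2-4-3-8-10", "7")
    ```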

    Is there any way to be highly specific with the strpos function, so it doesn't match smaller substrings within it?

    This is the code I created:

    foreach var of varlist hc002s1-hc002s11 {
    local n = substr("`var'", (strpos("`var'", "s") + 1), .)
    assert `var' == 1 if strpos(hc002, "`n'")
    }

    David


    Code:
    clear
    input int(hc002s1 hc002s2 hc002s3 hc002s4 hc002s5 hc002s6 hc002s7 hc002s8 hc002s9 hc002s10 hc002s11) str21 hc002
    0 1 1 0 0 1 0 0 0 0 0 "2-3-6"      
    0 1 1 0 0 0 0 1 0 0 0 "2-3-8"      
    0 1 1 0 0 0 0 1 0 0 0 "2-3-8"      
    0 1 0 1 0 0 0 0 0 0 0 "2-4"        
    0 1 1 1 0 0 0 1 0 1 0 "2-4-3-8-10" 
    0 1 0 1 1 0 0 0 0 0 0 "2-4-5"      
    0 1 0 1 1 0 0 0 0 0 0 "2-4-5"      
    0 1 0 0 1 0 0 0 0 0 0 "2-5"        
    0 1 0 0 1 0 0 0 0 0 0 "2-5"        
    1 1 0 0 1 0 0 0 0 0 0 "2-5-1"      
    1 1 0 0 1 0 0 0 0 0 0 "2-5-1"      
    0 1 0 1 1 1 0 0 0 0 0 "2-5-4-6"    
    0 1 0 0 1 1 0 0 0 0 0 "2-5-6"      
    0 1 0 0 0 1 0 0 0 0 0 "2-6"        
    0 1 1 0 0 1 0 0 0 0 0 "2-6-3"      
    0 1 0 1 0 1 0 0 0 1 0 "2-6-4-10"   
    0 1 0 0 0 0 0 1 0 0 0 "2-8"        
    0 1 1 0 0 0 0 1 0 0 0 "2-8-3"      
    0 1 0 0 0 1 0 1 0 0 0 "2-8-6"      
    1 1 1 1 1 1 0 0 0 0 0 "3-1-2-4-5-6"
    1 1 1 0 1 0 0 0 0 0 0 "3-1-2-5"    
    end

  • #2
    Here's one way to get your end result that simply avoids using -strpos()-. My intuition tells me there is something simpler, but I can't quite find it just now. I don't know of any way to get -strpos()- to do what you ask.
    Code:
    rename hc002s# hc002s(##), renumber
    
    split hc002, gen(selection) parse("-") destring
    forvalues i = 1/11 {
        local ii: display %02.0f `i'
        egen temp = anymatch(selection*), values(`i')
        assert hc002s`ii' == 1 if temp
        drop temp
    }
    Added: By the way, if the renaming of those hc002s* variables breaks other code you have written, you can always reverse the renaming after this section of the code is done. -rename hc002s(##) hc002s(#)- will do that for you.
    Last edited by Clyde Schechter; 20 Jul 2023, 09:39.



    • #3
      Try
      Code:
      forvalues n = 1/11 {
          assert hc002s`n' == 1 if strpos("-" + hc002 + "-",  "-`n'-")
      }
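
      As a rough sketch of why this works, assuming the selections in hc002 are always separated by single hyphens with no spaces: wrapping both the string and the search term in "-" means only whole numbers between delimiters can match. The literal string below is just an example value from #1.

      ```stata
      * The padded string is "-2-4-3-8-10-".
      * "-1-" does not occur in it, so the result is 0.
      display strpos("-" + "2-4-3-8-10" + "-", "-1-")
      * "-10-" does occur, so the result is nonzero (position 9).
      display strpos("-" + "2-4-3-8-10" + "-", "-10-")
      ```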



      • #4
        Aha! This is perfect - works like a charm! Thanks so much Clyde!!
