
  • Repeating digits in phone number

    I have phone numbers in the format 888-331-2222. I want to detect phone numbers that contain the same digit repeated consecutively, for example 000 or 999 or 222, and to count how many times such runs occur in each number.


  • #2
    I think we need a sharper formulation. For example, repeating digits could mean two identical digits in order; are you just interested in three or more? Would 112-223-3344 count as an example where identical digits are found in different blocks (i.e. should separators in written form be ignored)?



    • #3
      Yes that's correct



      • #4
        Hi Nick - I think I should elaborate. Yes, I am after code that finds triple digits in phone numbers and counts the number of occurrences of triple digits in each number.



        • #5
          So, this ignores "-" and then looks for one or more instances of "000", "111", ... "999".


          Code:
          clear
          input str12 numbers
          888-331-2222
          112-223-3344
          111-111-1111
          end
          
          clonevar work = numbers
          replace work = subinstr(work, "-", "", .)
          gen found = ""
          
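          * loop over the digits 0 to 9; quietly suppresses the output of replace and count
          quietly forval i = 0/9 {
              * string duplication: for i = 7 this gives "777"
              local seek = 3 * "`i'"
              * an unconditional count sets r(N) to the number of observations, so the loop is entered
              count
              while r(N) > 0 {
                  * record the triple for every observation in which it still occurs
                  replace found = cond(missing(found), "`seek'", found + " `seek'") if strpos(work, "`seek'")
                  * delete its first occurrence from the working copy
                  replace work = subinstr(work, "`seek'", "", 1)
                  * count the observations that still contain the triple; the loop stops at zero
                  count if strpos(work, "`seek'")
              }
          }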
          
          gen nseek = wordcount(found)
          list
              
               +-------------------------------------------+
               |      numbers   work         found   nseek |
               |-------------------------------------------|
            1. | 888-331-2222   3312       222 888       2 |
            2. | 112-223-3344   1144       222 333       2 |
            3. | 111-111-1111      1   111 111 111       3 |
               +-------------------------------------------+
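
          As a side note on how the search strings are built: the line that defines the local macro seek relies on Stata's string duplication operator. A minimal sketch that only displays the ten patterns, without touching any data:

          ```stata
          * 3 * "7" repeats the string "7" three times
          display 3 * "7"

          * the same construction for every digit
          forval i = 0/9 {
              local seek = 3 * "`i'"
              display "`seek'"
          }
          ```

          The first line should display 777, and the loop then displays 000, 111 and so on up to 999.
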
          Here's a predictable variant on the problem, looking for longest possible runs with 3 or more of the same digit:

          Code:
          clear
          input str12 numbers
          888-331-2222
          112-223-3344
          111-111-1111
          end
          
          clonevar work = numbers
          replace work = subinstr(work, "-", "", .)
          gen found = ""
          
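          quietly forval i = 0/9 {
              * try the longest run first (10 digits), then work down to 3,
              * so that 2222 is recorded as one run of four and not as 222 with a 2 left over
              forval j = 10(-1)3 {
                  local seek = `j' * "`i'"
                  * an unconditional count sets r(N) to the number of observations, so the loop is entered
                  count
                  while r(N) > 0 {
                      * record the run, delete its first occurrence, then look again
                      replace found = cond(missing(found), "`seek'", found + " `seek'") if strpos(work, "`seek'")
                      replace work = subinstr(work, "`seek'", "", 1)
                      count if strpos(work, "`seek'")
                  }
              }
          }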
          
          gen nseek = wordcount(found)
          list
              
               +------------------------------------------+
               |      numbers   work        found   nseek |
               |------------------------------------------|
            1. | 888-331-2222    331     2222 888       2 |
            2. | 112-223-3344   1144      222 333       2 |
            3. | 111-111-1111          1111111111       1 |
               +------------------------------------------+
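
          One caveat that applies to both versions, offered as a suggestion rather than as part of the original answer: deleting a match joins the digits on either side of it, which can create a run that was never in the original number. For example, 221-112-2000 contains only 111 and 000, but once 111 is deleted the 22 before it and the 22 after it meet, and a run of 2s is reported too. A minimal sketch of one way around this is to overwrite each match with a placeholder rather than delete it. The placeholder "x" is an assumption here; any character that cannot occur in the data would do.

          ```stata
          clear
          input str12 numbers
          221-112-2000
          end

          clonevar work = numbers
          replace work = subinstr(work, "-", "", .)
          gen found = ""

          quietly forval i = 0/9 {
              local seek = 3 * "`i'"
              count if strpos(work, "`seek'")
              while r(N) > 0 {
                  replace found = cond(missing(found), "`seek'", found + " `seek'") if strpos(work, "`seek'")
                  * overwrite the match with "x" (assumed placeholder) so its neighbours stay apart
                  replace work = subinstr(work, "`seek'", "x", 1)
                  count if strpos(work, "`seek'")
              }
          }

          gen nseek = wordcount(found)
          list
          ```

          With this change found should be 000 111 and nseek should be 2 for this number, and the working copy should end as 22x22x.
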
          Last edited by Nick Cox; 11 Oct 2018, 01:59.



          • #6
            Thanks, this is beyond helpful!

            Can you explain this part of the code, please?

            while r(N) > 0 {
                replace found = cond(missing(found), "`seek'", found + " `seek'") if strpos(work, "`seek'")
                replace work = subinstr(work, "`seek'", "", 1)
                count if strpos(work, "`seek'")
            }



            • #7
              The count command counts the observations that satisfy its if condition and leaves the result in r(N). Here the condition is that the string sought still occurs in work. If there are no such observations, the result is 0 and the while loop terminates. Otherwise we go round again.

              Suppose we are looking for 111. If we find it, we add that to what has been found already, but there may still be other instances of 111 later in the phone number.

              It's important that we delete a string once found from a working copy of the original variable. Otherwise we would just be finding the same string every time we look.

              See [U] on while loops or any introduction to programming.
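
              To watch the loop at work, here is a small sketch for a single pattern and a single observation. The display line is added purely for illustration and is not part of the code in #5.

              ```stata
              clear
              input str10 work
              1111111111
              end

              local seek "111"
              quietly count if strpos(work, "`seek'")
              while r(N) > 0 {
                  * show the working copy before the next deletion
                  display work[1]
                  quietly replace work = subinstr(work, "`seek'", "", 1)
                  quietly count if strpos(work, "`seek'")
              }
              list
              ```

              The loop body should run three times, displaying 1111111111, 1111111 and 1111 in turn, and stop when only 1 is left in work. That agrees with the third row of the first listing in #5.
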



              • #8
                What is this part for?

                replace work = subinstr(work, "-", "", .)



                • #9
                  It finds every "-" in the variable work and replaces it with nothing (i.e. ""). This is because in response #3 you indicated that the separators ("-") should be ignored and that it didn't matter in which block the repeating digits were found.
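
                  A quick way to see the effect of the last argument is to try the function on a literal string. This is only a sketch using display, not part of the code in #5:

                  ```stata
                  * a missing value (.) as the last argument means replace all occurrences
                  display subinstr("888-331-2222", "-", "", .)

                  * a 1 as the last argument means replace the first occurrence only
                  display subinstr("888-331-2222", "-", "", 1)
                  ```

                  The first line should display 8883312222 and the second 888331-2222.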
