  • Keeping observations that start with specific digits

    Hello,

    How can I keep observations that start with a certain set of numbers? Specifically, I want to keep observations that start with "3344" within a list of mostly 6-digit numbers. I found a previous post (http://www.stata.com/statalist/archi.../msg01191.html), but for some reason these options aren't working. I've tried the following, but Stata says I'm using invalid syntax.

    keep if substr(string(naics02), 1, 2, 3, 4) == "3344"

    I then tried using floor but it dropped all of my data (when in fact there are observations that start with 3344):

    keep if floor(naics02/10000) == 3344

    Please help! Thank you!

    Alyse

  • #2
    The first is a syntax problem. substr() takes three arguments: the string expression, the starting position of the substring, and the length of the substring. Hence what you need is something more like

    Code:
    keep if substr(string(naics02), 1, 4) == "3344"


    See

    Code:
    help substr() 
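
    As a quick check of those three arguments, here is a minimal sketch using made-up values (334413 and 541110 are illustrative codes, not taken from the original data):

    ```stata
    * substr(s, start, length): take 4 characters starting at position 1
    display substr(string(334413), 1, 4)
    * a code that starts with other digits fails the comparison (displays 0)
    display substr(string(541110), 1, 4) == "3344"
    ```

    The first line should display 3344, which is the string that the keep condition compares against.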


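    The second is an arithmetic problem. Dividing a 6-digit number by 10000 leaves only its first two digits (floor(334499/10000) is 33), so no observation can equal 3344 and everything is dropped. If you want numbers like 3344?? to be truncated to 3344 you need a different divisor: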

    Code:
    keep if floor(naics02/100) == 3344


    as you want to lop off the last two digits. Check

    Code:
    di floor(334499/100)
    3344

    but if some numbers do not have 6 digits, this will not deliver what you want for them. Hence the first solution is likely to be better.
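
    To illustrate that caveat, here is a small sketch with a hypothetical 5-digit value, 33441, which starts with 3344 but is not in the original data:

    ```stata
    * floor() approach: dividing by 100 leaves 334, so the test against 3344 fails
    display floor(33441/100)
    * substr() approach: the first four characters are still "3344"
    display substr(string(33441), 1, 4)
    ```

    The floor() test would drop such an observation, while the substr() test would keep it, because the string comparison does not depend on how many digits follow the prefix.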



    • #3
      Excellent - the following code worked. Thanks!
      keep if substr(string(naics02), 1, 4) == "3344"
