  • wildcard characters for egen

    Hi,

    I would like to count how many values such as 250.01, 250.11, 250.21, etc. there are in my variables.

    I'm trying this code, but it doesn't work this way. My idea was to use wildcard characters:

    egen count_dm= rcount( C1 -C20 ) , c(@ == "250.*1")

    where * should be a wildcard, but this code doesn't work this way. Is there any option?

    Thanks!


  • #2
    rcount() is from egenmore (SSC), as you are asked to explain (FAQ Advice #12).

    Wildcards are supported for variable lists and as part of pattern matching or regular expression syntax, but your code has only one possible meaning, counting a literal match with the exact string specified.

    Assuming that the variables concerned are all string (as your code makes no sense otherwise), this should work, noting further that your example implies use of ? not *.

    Code:
    clear 
    input str6 (CM1 CM2 CM3)  
    frog toad newt 
    250.01 250.11 250.21 
    250.01 260.01 frog 
    end 
    
    gen count = 0 
    
    forval j = 1/3 { 
        replace count = count + strmatch(CM`j', "250.?1") 
    }
    
    list 
    
         +----------------------------------+
         |    CM1      CM2      CM3   count |
         |----------------------------------|
      1. |   frog     toad     newt       0 |
      2. | 250.01   250.11   250.21       3 |
      3. | 250.01   260.01     frog       1 |
         +----------------------------------+
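
    For readers adapting this to the original question, the same loop can be sketched over the range C1-C20 named in #1. This is only a sketch under assumptions: C1-C20 are all string variables and sit consecutively in the dataset (a hyphenated varlist follows dataset order), and count_dm is the name used in #1. In strmatch(), ? matches exactly one character and * matches zero or more, so "250.?1" counts 250.01, 250.11, 250.21 and so on, but it would also count a value such as 250.x1.

    Code:
    * sketch only: assumes C1-C20 are string variables, as in #1
    gen count_dm = 0

    foreach v of varlist C1-C20 {
        * strmatch() returns 1 for a match and 0 otherwise
        replace count_dm = count_dm + strmatch(`v', "250.?1")
    }

    * stricter alternative inside the loop (assumption: the wildcard
    * position must be a digit), using a regular expression instead:
    * replace count_dm = count_dm + regexm(`v', "^250\.[0-9]1$")

    Using foreach over a varlist rather than forval avoids relying on the variables sharing a numbered stub.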




    • #3

      yes, I'm sorry. I should have read the FAQ better.
      I am afraid that my knowledge of syntax is still limited, but the syntax that you have given me works perfectly. Thank you very much.



      • #4
        Originally posted by Paula Lopez
        Hi,

        I would like to count how many values such as 250.01, 250.11, 250.21, etc. there are in my variables.

        I'm trying this code, but it doesn't work this way. My idea was to use wildcard characters:

        egen count_dm= rcount( C1 -C20 ) , c(@ == "250.*1")

        where * should be a wildcard, but this code doesn't work this way. Is there any option?

        Thanks!

        In addition to the code Nick gave you: those look suspiciously like ICD-9 codes. If so, Stata has some ICD-9 and ICD-10 related commands that may be very helpful. There's also an FAQ on how to convert string ICD-9 variables into numeric versions and on how to use a loop and the icd9 command to quickly generate exclusion or inclusion flags based on a range of codes. There are probably more FAQs I don't know about as well.
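
        For what it is worth, the loop-plus-icd9 idea might look roughly like the following. This is a hedged sketch, not the FAQ's own code. It assumes C1-C20 (the names from #1) hold valid string ICD-9 diagnosis codes; any_dm and hit are hypothetical names; and range(250*) flags every code beginning with 250, which is broader than the 250.?1 pattern asked about. Running icd9 check on each variable first is advisable, since the icd9 commands expect properly formatted codes.

        Code:
        * sketch only: assumes C1-C20 contain valid string ICD-9 codes
        gen byte any_dm = 0

        foreach v of varlist C1-C20 {
            * hit is 1 when the code falls in the 250* range, 0 otherwise
            icd9 generate hit = `v', range(250*)
            replace any_dm = 1 if hit == 1
            drop hit
        }

        This yields an inclusion flag per observation rather than a count; replacing the flag update with a running sum of hit would give a count instead.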
        Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

        When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.

