Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Counting Number of Occurrences by ID in Dataset

    Hello everyone,

    I am working with a dataset where each observation is a meeting that occurred at some point in time by a specific group. Hence, I have many groups that meet irregularly over time. I wish to create a counter variable that measures the number of times a particular group met in the previous, say, five years (thus excluding the "current" meeting). I must admit that I have no idea how to measure that.

    Here is an example of what I am working with (the real dataset has 50 groups that met for a total of 1500 times). I basically want to generate the variable nbOcc by syntax rather than by hand.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte groupID int year byte(month day) float(edate emonth) byte nbOcc
    1 1990  1 20 10977 360 0
    1 1990  5 12 11089 364 1
    1 1993  4 15 12158 399 2
    1 2000  7  2 14793 486 0
    2 1990  2 16 11004 361 0
    2 1990  3 23 11039 362 1
    2 1990  8 25 11194 367 2
    2 1991 10 12 11607 381 3
    2 1999  2 10 14285 469 0
    3 1992  1 11 11698 384 0
    3 1993  3  5 12117 398 1
    3 1994  5  6 12544 412 2
    3 1995  2 13 12827 421 3
    3 1996  1 16 13164 432 4
    end

    So, in this example, at line 1, group one is meeting for the first time, so there is no meeting in the previous 5 years. At line 3, group one has met twice in the previous five years, etc.

    I don't normally ask the Statalist for syntax without trying anything myself first, but I am truly confused as to how this would go. I think I need to sort by ID and then date (which is why I included edate and emonth, although emonth might be sufficient), and create a loop of some sort, but other than that I really don't know where I am going...

    Any help would be greatly appreciated!
    JP Gauvin

  • #2
    the following will work for the example you show but if you only want the previous 5 years worth and there are older occurrences this will not give you quite what you want:
    Code:
    sort id year month day
    by id: gen nbOcc=sum(1)
    replace nbOcc=nbOcc-1
    note that this could be shortened to fewer lines if wanted

    Comment


    • #3
      You can do this using rangestat (from SSC). To install it, type in Stata's command window:

      Code:
      ssc install rangestat
      I assume that 5 years @ 365.25 days is 1826 days. Here's how it would work with your sample data:

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input byte groupID int year byte(month day) float(edate emonth) byte nbOcc
      1 1990  1 20 10977 360 0
      1 1990  5 12 11089 364 1
      1 1993  4 15 12158 399 2
      1 2000  7  2 14793 486 0
      2 1990  2 16 11004 361 0
      2 1990  3 23 11039 362 1
      2 1990  8 25 11194 367 2
      2 1991 10 12 11607 381 3
      2 1999  2 10 14285 469 0
      3 1992  1 11 11698 384 0
      3 1993  3  5 12117 398 1
      3 1994  5  6 12544 412 2
      3 1995  2 13 12827 421 3
      3 1996  1 16 13164 432 4
      end
      format %td edate
      
      rangestat (count) meetings=edate, interval(edate -1827 -1) by(groupID)
      replace meetings = 0 if mi(meetings)
      list, sepby(groupID)
      and the results
      Code:
      . list, sepby(groupID)
      
           +----------------------------------------------------------------------+
           | groupID   year   month   day       edate   emonth   nbOcc   meetings |
           |----------------------------------------------------------------------|
        1. |       1   1990       1    20   20jan1990      360       0          0 |
        2. |       1   1990       5    12   12may1990      364       1          1 |
        3. |       1   1993       4    15   15apr1993      399       2          2 |
        4. |       1   2000       7     2   02jul2000      486       0          0 |
           |----------------------------------------------------------------------|
        5. |       2   1990       2    16   16feb1990      361       0          0 |
        6. |       2   1990       3    23   23mar1990      362       1          1 |
        7. |       2   1990       8    25   25aug1990      367       2          2 |
        8. |       2   1991      10    12   12oct1991      381       3          3 |
        9. |       2   1999       2    10   10feb1999      469       0          0 |
           |----------------------------------------------------------------------|
       10. |       3   1992       1    11   11jan1992      384       0          0 |
       11. |       3   1993       3     5   05mar1993      398       1          1 |
       12. |       3   1994       5     6   06may1994      412       2          2 |
       13. |       3   1995       2    13   13feb1995      421       3          3 |
       14. |       3   1996       1    16   16jan1996      432       4          4 |
           +----------------------------------------------------------------------+

      Comment


      • #4
        Thanks Rich!

        But as you mention, it only starts a counter for every meeting in that group. I want to be able to only count the previous 5 years. In my example, line 4 states that in 2000, group one had not met in the previous five years and so its counter was reset at zero.

        Any thoughts?

        Comment


        • #5
          See rangestat (SSC). There are now many posts now on this forum giving examples.

          I don't have time to check right now, but let's make 5 years concrete as 60 months and work with monthly dates. Then syntax would be something like

          Code:
          rangestat (count) edate, interval(emonth -59 0) by(groupID) excludeself
          EDIT: Compare a Picardesque answer 2 minutes earlier. Off for dinner.

          Comment


          • #6
            Thanks Robert and Nick! This is exactly what I needed, many thanks!
            Last edited by JP Gauvin; 03 Jun 2016, 13:19.

            Comment

            Working...
            X