Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Any pair of time differences falling between 30-365 days?

    Dear All, In a previous thread (https://www.statalist.org/forums/for...ximum-time-gap), I mis-stated my question (as conjectured by Clyde Schechter). Thus, I re-state the question below (and hope this does not violate the regulation of this forum). The data set is:
    Code:
      
     * Example generated by -dataex-. To install: ssc install dataex clear input float id str12 ymd float n 1 "2008.10.01" 1 1 "2008.12.15" 2 1 "2009.01.10" 3 1 "2009.05.01" 4 2 "2008.08.15" 1 2 "2008.10.01" 2 2 "2008.12.15" 3 2 "2008.12.31" 4 2 "2009.01.31" 5 2 "2009.05.31" 6 2 "2009.08.31" 7 end
    For each id, I'd like to calculate the time between any pairs of -ymd- (I have done: sort id ymd). For instance, for id=1, I need to calculate times between any (actually 6) combinations of ymd, i.e., 2008.12.15-2008.10.01; 2009.01.10-2008.10.01; 2009.05.01-2008.10.01; 2009.01.10-2008.12.15; 2009.05.01-2008.12.15; and 200.05.01-2009.01.10). If any of the time differences is falling between 30-365 (in terms of days), then this id is what I want. Any suggestions?
    Ho-Chuan (River) Huang
    Stata 19.0, MP(4)

  • #2
    A simple change to Clyde's code, since the minimum gap will be between two successive observations.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float id str12 ymd float n
    1 "2008.10.01" 1
    1 "2008.12.15" 2
    1 "2009.01.10" 3
    1 "2009.05.01" 4
    2 "2008.08.15" 1
    2 "2008.10.01" 2
    2 "2008.12.15" 3
    2 "2008.12.31" 4
    2 "2009.01.31" 5
    2 "2009.05.31" 6
    2 "2009.08.31" 7
    3 "2008.01.01" 1
    3 "2008.01.02" 2
    3 "2010.01.01" 3
    end
    
    gen date = daily(ymd, "YMD")
    format date %td
    assert missing(date) == missing(ymd)
    
    by id (date), sort: gen gap = date[_n] - date[_n-1]
    by id: egen byte wanted = max(inrange(gap, 30, 365))
    Code:
    . list, sepby(id)
    
         +------------------------------------------------+
         | id          ymd   n        date   gap   wanted |
         |------------------------------------------------|
      1. |  1   2008.10.01   1   01oct2008     .        1 |
      2. |  1   2008.12.15   2   15dec2008    75        1 |
      3. |  1   2009.01.10   3   10jan2009    26        1 |
      4. |  1   2009.05.01   4   01may2009   111        1 |
         |------------------------------------------------|
      5. |  2   2008.08.15   1   15aug2008     .        1 |
      6. |  2   2008.10.01   2   01oct2008    47        1 |
      7. |  2   2008.12.15   3   15dec2008    75        1 |
      8. |  2   2008.12.31   4   31dec2008    16        1 |
      9. |  2   2009.01.31   5   31jan2009    31        1 |
     10. |  2   2009.05.31   6   31may2009   120        1 |
     11. |  2   2009.08.31   7   31aug2009    92        1 |
         |------------------------------------------------|
     12. |  3   2008.01.01   1   01jan2008     .        0 |
     13. |  3   2008.01.02   2   02jan2008     1        0 |
     14. |  3   2010.01.01   3   01jan2010   730        0 |
         +------------------------------------------------+

    Comment


    • #3
      I always start reading new threads from the oldest to the newest. Since you did not indicate in the old thread that you were creating a new thread, I posted a solution on the old thread.

      William, I think River wants all pairwise combinations within each group (id) so your solution will fail to pick up gaps that are not consecutive. Here's an update on the example I posted in the previous thread:
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input float id str12 ymd float n
      1 "2008.10.01" 1
      1 "2008.10.15" 2
      1 "2008.10.31" 3
      1 "2008.11.15" 4
      2 "2008.08.15" 1
      2 "2008.10.01" 2
      2 "2008.12.15" 3
      2 "2008.12.31" 4
      2 "2009.01.31" 5
      2 "2009.05.31" 6
      2 "2009.08.31" 7
      3 "2009.05.30" 1
      3 "2009.08.31" 2
      3 "XXXX.XX.XX" 3
      3 "2011.09.30" 4
      end
      gen date = daily(ymd, "YMD")
      format date %td
      
      rangestat (count)  over=id, by(id) interval(date 30 365)
      rangestat (count) under=id, by(id) interval(date -365 -30)
      
      gen inwindow = over < .  | under < .
      
      * William's solution
      by id (date), sort: gen gap = date[_n] - date[_n-1]
      by id: egen byte wanted = max(inrange(gap, 30, 365))
      
      sort id date
      list, sepby(id)
      and the results
      Code:
      . list, sepby(id)
      
           +--------------------------------------------------------------------------+
           | id          ymd   n        date   over   under   inwindow   gap   wanted |
           |--------------------------------------------------------------------------|
        1. |  1   2008.10.01   1   01oct2008      2       .          1     .        0 |
        2. |  1   2008.10.15   2   15oct2008      1       .          1    14        0 |
        3. |  1   2008.10.31   3   31oct2008      .       1          1    16        0 |
        4. |  1   2008.11.15   4   15nov2008      .       2          1    15        0 |
           |--------------------------------------------------------------------------|
        5. |  2   2008.08.15   1   15aug2008      5       .          1     .        1 |
        6. |  2   2008.10.01   2   01oct2008      5       1          1    47        1 |
        7. |  2   2008.12.15   3   15dec2008      3       2          1    75        1 |
        8. |  2   2008.12.31   4   31dec2008      3       2          1    16        1 |
        9. |  2   2009.01.31   5   31jan2009      2       4          1    31        1 |
       10. |  2   2009.05.31   6   31may2009      1       5          1   120        1 |
       11. |  2   2009.08.31   7   31aug2009      .       5          1    92        1 |
           |--------------------------------------------------------------------------|
       12. |  3   2009.05.30   1   30may2009      1       .          1     .        1 |
       13. |  3   2009.08.31   2   31aug2009      .       1          1    93        1 |
       14. |  3   2011.09.30   4   30sep2011      .       .          0   760        1 |
       15. |  3   XXXX.XX.XX   3           .      .       .          0     .        1 |
           +--------------------------------------------------------------------------+
      
      .
      I also revise my previous post's obiter that missing dates would require special handling. This is one case where missing values in the key interval variable will not matter.

      Comment


      • #4
        Dear William, Thanks for your reply, and sorry for my prior mis-statement. As indicated by Robert in #3, I need to consider all pairwise combinations.

        Ho-Chuan (River) Huang
        Stata 19.0, MP(4)

        Comment


        • #5
          Dear Robert, As usual, you demonstrate again the power of your -rangestat- command. Thank you very much.

          Ho-Chuan (River) Huang
          Stata 19.0, MP(4)

          Comment

          Working...
          X