Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • maximum time gap

    Dear All, I found this question somewhere, and appreciate if anyone can give some suggestions. The data is
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float id str12 ymd float n
    1 "2008.10.01" 1
    1 "2008.12.15" 2
    1 "2009.01.10" 3
    1 "2009.05.01" 4
    2 "2008.08.15" 1
    2 "2008.10.01" 2
    2 "2008.12.15" 3
    2 "2008.12.31" 4
    2 "2009.01.31" 5
    2 "2009.05.31" 6
    2 "2009.08.31" 7
    end
    For each id, I'd like to calculate the time between any -ymd- (I have done: sort id ymd). For instance, for id=1, I need to calculate times between any combinations of ymd, i.e., 2008.12.15-2008.10.01; 2009.01.10-2008.10.01; 2009.05.01-2008.10.01; 2009.01.10-2008.12.15; 2009.05.01-2008.12.15; and 200.05.01-2009.01.10). If the maximum time gap is falling between 30-365 (in terms of days), then this id is what I want. Any suggestions?
    Ho-Chuan (River) Huang
    Stata 19.0, MP(4)

  • #2
    First you need to convert ymd to a Stata internal format date (call it date). If you sort the data on date within id, then the maximum time gap will always be the gap between the first and last dates. That's a one step calculation. Then you just compare it to 30 and 365.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float id str12 ymd float n
    1 "2008.10.01" 1
    1 "2008.12.15" 2
    1 "2009.01.10" 3
    1 "2009.05.01" 4
    2 "2008.08.15" 1
    2 "2008.10.01" 2
    2 "2008.12.15" 3
    2 "2008.12.31" 4
    2 "2009.01.31" 5
    2 "2009.05.31" 6
    2 "2009.08.31" 7
    end
    
    gen date = daily(ymd, "YMD")
    format date %td
    assert missing(date) == missing(ymd)
    
    by id (date), sort: gen gap = date[_N] - date[1]
    gen byte wanted = inrange(gap, 30, 365)
    Is this too simple? Have I missed something? I don't see any need to actually make all pairwise comparisons to do this.

    Comment


    • #3
      Dear Clyde, Thank you for the reply. I re-read the original question, and find that I mis-stated the question. The original question asked to construct a dummy for, in my illustrative example above, ANY (not the maximum) one of the six pairs falling between 30 to 365 days.

      Ho-Chuan (River) Huang
      Stata 19.0, MP(4)

      Comment


      • #4
        You can use rangestat (from SSC) to count, for each observation, the number of observations within the same group (id) that fall within the desired window. You don't say if the window is forward or backward looking so I'll assume both are possible. I'll also assume that there are no missing dates since handling missing values would add some extra code.

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input float id str12 ymd float n
        1 "2008.10.01" 1
        1 "2008.12.15" 2
        1 "2009.01.10" 3
        1 "2009.05.01" 4
        2 "2008.08.15" 1
        2 "2008.10.01" 2
        2 "2008.12.15" 3
        2 "2008.12.31" 4
        2 "2009.01.31" 5
        2 "2009.05.31" 6
        2 "2009.08.31" 7
        3 "2009.05.30" 1
        3 "2009.08.31" 2
        3 "2011.09.30" 3
        end
        gen date = daily(ymd, "YMD")
        format date %td
        
        assert !mi(date,id)
        
        rangestat (count)  over=id, by(id) interval(date 30 365)
        rangestat (count) under=id, by(id) interval(date -365 -30)
        
        gen inwindow = over < .  | under < .
        
        sort id date
        list, sepby(id)
        and the results:
        Code:
        . list, sepby(id)
        
             +-----------------------------------------------------------+
             | id          ymd   n        date   over   under   inwindow |
             |-----------------------------------------------------------|
          1. |  1   2008.10.01   1   01oct2008      3       .          1 |
          2. |  1   2008.12.15   2   15dec2008      1       1          1 |
          3. |  1   2009.01.10   3   10jan2009      1       1          1 |
          4. |  1   2009.05.01   4   01may2009      .       3          1 |
             |-----------------------------------------------------------|
          5. |  2   2008.08.15   1   15aug2008      5       .          1 |
          6. |  2   2008.10.01   2   01oct2008      5       1          1 |
          7. |  2   2008.12.15   3   15dec2008      3       2          1 |
          8. |  2   2008.12.31   4   31dec2008      3       2          1 |
          9. |  2   2009.01.31   5   31jan2009      2       4          1 |
         10. |  2   2009.05.31   6   31may2009      1       5          1 |
         11. |  2   2009.08.31   7   31aug2009      .       5          1 |
             |-----------------------------------------------------------|
         12. |  3   2009.05.30   1   30may2009      1       .          1 |
         13. |  3   2009.08.31   2   31aug2009      .       1          1 |
         14. |  3   2011.09.30   3   30sep2011      .       .          0 |
             +-----------------------------------------------------------+
        
        .

        Comment


        • #5
          River has started a new thread with a variation on the initial question.

          Comment

          Working...
          X