  • How does the -rolling- command's -keep- option actually work?

    I'm having trouble understanding how the -keep- option of the -rolling- command actually works. For example, this code

    Code:
    sysuse gnp96, clear
    rolling s = r(sum), window(4) keep(date) clear: summ gnp96
    produces output like this:

    Code:
        start      end     date         s  
        1967q1   1967q4   1974q3   14651.2  
        1967q2   1968q1   1974q4   14777.1  
        1967q3   1968q2   1975q1   14950.9  
        1967q4   1968q3   1975q2   15120.5  
        1968q1   1968q4   1975q3   15279.2  
        1968q2   1969q1   1975q4   15428.5  
        1968q3   1969q2   1976q1   15525.2  
        1968q4   1969q3   1976q2   15621.4  
        1969q1   1969q4   1976q3   15682.5  
        1969q2   1970q1   1976q4   15697.7
    etc.
    According to the documentation, I expected the -keep- option to preserve the -date- variable from the end of the window, so I expected it to match the -end- variable. Instead, it's many periods ahead, and after the observation where start == 1994q4, the -date- variable is missing. What's going on here?

    Furthermore, in this example:

    Code:
    freduse GS10, clear
    drop if year(daten) < 2005
    replace daten = mofd(daten)
    format %tm daten
    tsset daten
    
    rolling s = r(sum), window(5) keep(daten) clear: summ GS10
    list in 1/10, clean noobs
    the output is like this

    Code:
          start       end   date       s  
         2005m1    2005m5      .   21.37  
         2005m2    2005m6      .   21.15  
         2005m3    2005m7      .   21.16  
         2005m4    2005m8      .   20.92  
         2005m5    2005m9      .   20.78  
         2005m6   2005m10      .    21.1  
         2005m7   2005m11      .   21.64  
         2005m8   2005m12      .   21.93  
         2005m9    2006m1      .   22.09  
        2005m10    2006m2      .   22.46  
    etc.
    Once again, I would expect the kept variable to match the ending date, but instead it's missing and the name doesn't seem right (-date- vs -daten-). How do I actually use this option?

    I want to carry my date variable through for merging purposes, because the carried-through date variable should match the end of the window.
    Last edited by Michael Anbar; 18 Dec 2014, 15:15.

  • #2
    I've duplicated the results in the gnp96 example using Stata/SE 13.1 for Mac (64-bit Intel) Revision 07 Nov 2014.

    Using -format date %9.0g- before the -rolling- is instructive; it displays the values underlying the formatted quarters.
    Code:
        start   end   date         s  
           28    31     58   14651.2  
           29    32     59   14777.1 
           30    33     60   14950.9  
           31    34     61   15120.5  
           32    35     62   15279.2  
           33    36     63   15428.5  
           34    37     64   15525.2  
           35    38     65   15621.4  
           36    39     66   15682.5  
           37    40     67   15697.7
    It appears that the date values are all too large by 27 relative to end, where 27 is the first start value minus 1 (28 - 1). This doesn't seem like what the documentation suggested, but I'm a newbie at this.

    With that said, would using
    Code:
    replace date=end
    after the -rolling- solve your immediate problem, in the absence of a more definitive answer to the question posed in the title of your post?

    • #3
      After further study, I believe that where rolling.ado intended, for this example, to do something akin to
      Code:
      gen kept_date = date[_n+4-1]
      it instead did the equivalent of
      Code:
      gen kept_date = date[date[_n]+4-1]
      which explains both the offset and the missing values for date in the final rows of the table it created.
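
      The comparison can be checked directly on the original data, without running -rolling- at all. This is only a minimal sketch: the variable names intended and observed are made up for illustration, and the second expression is a reading of the behavior (the date value, rather than the observation number, used as the subscript) chosen because it is consistent with -date- turning missing right after start == 1994q4. It is not taken from rolling.ado itself.

      ```stata
      sysuse gnp96, clear
      format date %9.0g

      * what keep(date) presumably should return: date at the last
      * observation of a 4-period window starting at observation _n
      gen intended = date[_n+4-1]

      * assumed reading of the bug: the value of date at the window start
      * is used as the observation number before the window offset is added
      gen observed = date[date[_n]+4-1]

      list date intended observed in 1/5, clean noobs

      * subscripts beyond the last observation evaluate to missing
      count if missing(observed)
      ```

      In the first row, intended is 31 and observed is 58, matching the end and date columns in the unformatted table in #2.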

      But I also think using keep(date) may be trying too hard; the start and end values seem designed for the purpose at hand.
      Code:
      sysuse gnp96, clear
      rolling s = r(sum), window(4) clear: summ gnp96
      rename end date
      drop start
      seems to do what is required, as I understand it. Perhaps the keep option for -rolling- was not intended to be used with the timevar established by tsset, and is leading to the unintended result we observed.
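
      If the goal is to merge the rolling results back onto the original data, as mentioned in the first post, something along these lines might work. It is a sketch under assumptions: the tempfile name original and the choice of a 1:1 merge on date are illustrative only and should be adapted to the actual data.

      ```stata
      sysuse gnp96, clear

      * keep a copy of the original data to merge against
      tempfile original
      save `original'

      * rolling sums over 4-quarter windows; end marks the last quarter of each window
      rolling s = r(sum), window(4) clear: summ gnp96

      * use the window end as the merge key
      rename end date
      drop start

      * quarters before the first complete window have no rolling sum,
      * so they come from the using data only (_merge == 2)
      merge 1:1 date using `original'
      ```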

      • #4
        Michael has discovered a bug in rolling's logic for retrieving the values of the keep() variable.

        William's analysis of the situation is spot-on.

        We should have a fix for this sometime after the Holiday break.

        • #5
          Excellent, Jeff. Thanks for letting me know. (And thanks to William for the great analysis).

          • #6
            I note that the bug in rolling also existed in Stata/SE 13.1 for Mac (64-bit Intel) Revision 19 Dec 2014 (released during the discussion above), and there have been no subsequent updates to Stata 13 prior to the release of Stata 14.

            Perhaps this problem was resolved in Stata 14.

            • #7
              The fix to rolling is already in Stata 14.

              rolling will be fixed in the next Stata 13 update.

              • #8
                Indeed, this problem was fixed in Stata/SE 13.1 for Mac (64-bit Intel) Revision 17 Apr 2015.
