  • Rolling regression, but with a different number of observations in each window

    Hi Statalist colleagues,

    I am trying to run a rolling regression over the windows 2006-2010, 2007-2011, 2008-2012, ..., 2016-2020, which makes 11 windows in total.

    With annual data, I could simply use the Stata command rolling with the window() option.

    The problem is that my dataset is not annual: it has an irregular number of observations in each year, as shown below.


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float date double(chg_pay sp500ret)
    16806     0  .002699353764334944
    16807 -3.06    .9642661927006246
    16813     0   -.6267879872783877
    16820     0    .5570752695287728
    16827     0    .7259659913377181
    16834     0   -.8795154476552325
    16835 -1.86   -.5284534048121459
    16841     0  -.14297327046401964
    16848     0    .7407178393196157
    16855     0    -.351535179790019
    16862     0  -.16139690217960023
    16869     0   -.4862526350060681
    16870   .89    .7346771008208686
    16876     0   .17956659475244496
    16883     0  -.25766760374676956
    16890     0  -.19727134079343678
    16897     0  -.16791975171757123
    16898   .83  -1.0336369214625885
    16904     0   .07735547418905586
    16911     0   .11973841298658083
    16918     0    .3579838662962631
    16925     0    .3403068087995287
    16926 -3.33   1.0314307163522995
    16932     0   -1.262520562446534
    16939     0   -.6645362921561926
    16946     0   1.1495829417599523
    16953     0   1.2329523359992711
    16954 -4.27   .19502276200795698
    16960     0   .14709459466515362
    16967     0    2.125841051352162
    16974     0   -.5043920367085564
    16981     0   2.1610077482290935
    16988     0    .2754370921263627
    16989 -1.33   -.6754211553848921
    16995     0  -1.2944842425878456
    17002     0   -.8475443658016668
    17009     0   -.3945381797991865
    17016     0   .16549140816095598
    17017  -.96  -.06988812790753585
    17023     0   .48091770998057726
    17030     0   .16310202500695592
    17037     0   .23962460327804358
    17044     0  -.03308182471537524
    17045   .15     .553265095135469
    17051     0   -.4761053916451097
    17058     0   -.1304631441617743
    17065     0   -.5172298374870543
    17072     0   .19346662334887021
    17079     0   .25006625790326975
    17080 -3.62   -.2685599897141011
    17086     0    .9549206042762659
    17093     0   .07702558203199761
    17100     0   .49725826937532247
    17107     0 -.007332058951659004
    17108 -1.35   -.2157402280056564
    17114     0   -.5120184856426069
    17121     0   .23123454381466235
    17127     0   .24223333011963444
    17135     0   .08351569993028107
    17142     0   -.3943097671421092
    17143  1.16   .18307521128408943
    17149     0    .8764006329381457
    17156     0   -.3442699506921887
    17163     0  -.14355526877846136
    17170     0   .12276079457049782
    17171  2.03   -.6087524804457067
    17177     0    .6351502020245148
    17184     0  -.29478393312007967
    17191     0    -1.12592255158096
    17198     0    .5595287287383011
    17199 -1.39   .16965360372751537
    17205     0  -.11576327838745959
    17219     0 -.062332570010925625
    17226     0   -.2579759755564126
    17233     0    .7193847085361194
    17234   .07   .06818373557433421
    17240     0   .37272142996862545
    17247     0  -.03524710561841893
    17254     0   .37823843877755614
    17261     0    .3344615129269535
    17268     0    .6217702105595135
    17275     0  -.11833738602358146
    17282     0  -.06852223261658574
    17289     0   .46165343950139714
    17290  -.49   .21729574543345453
    17296     0  -1.3793620770045867
    17303     0  -.08351759251497004
    17310     0   -.9582420409548487
    17317     0  .027849212081831887
    17318   .89     .373462463868135
    17324     0    -1.75270669093065
    17331     0   .48806865473609573
    17338     0    .6402575656946352
    17345     0   -.0400368612295976
    17352     0                    .
    17353   .41   .35911318175458895
    17359     0    1.906700336800049
    17366     0    .4490394080175486
    17373     0  -2.3334415236812123
    17380     0    .4597997281687194
    end
    format %td date
    In this case,
    1. I want to regress sp500ret on chg_pay over rolling windows.
    2. I then want to plot the time series of coefficients for the 11 windows as a line plot.

    Could you please let me know how this can be done without writing code for each window separately?

    What I have been doing is as follows:

    Code:
    **** 2006-2010
    
    keep if year(date)>=2006 & year(date)<=2010
    reg sp500 chg_pay
    .
    .
    .
    
    keep if year(date)>=2007 & year(date)<=2011
    reg sp500 chg_pay
    Thanks much!

  • #2
    Surely. rangestat from SSC offers one solution.

    Code:
    gen year = year(date)
    rangestat (reg) sp500 chg_pay, int(year -4 0)
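
    For the plot asked for in #1, a minimal sketch that continues from the two lines above. It assumes rangestat has left the slope in a variable named b_chg_pay (its usual b_ prefix naming, as far as I know) and that the full 2006-2020 data are in memory; tag and first are names introduced here for illustration only.

    ```stata
    * Every observation in a given year carries the same window results,
    * so tag one observation per year and plot only those.
    su year, meanonly
    local first = r(min) + 4            // first year with a complete 5-year window
    egen byte tag = tag(year)           // 1 for one observation per year, else 0

    * Assumption: b_chg_pay is the slope variable created by rangestat (reg)
    line b_chg_pay year if tag & year >= `first', sort ///
        ytitle("Slope on chg_pay") xtitle("Last year of 5-year window")
    ```

    Years before `first' are left out because their windows span fewer than five years.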



    • #3
      You can use asreg for this. More details on asreg can be found here https://fintechprofessor.com/2017/12...ions-in-stata/
      Code:
       gen year = year(date)
      
      . asreg sp500ret chg_pay, window(year 5)
      
      . list in 1/20
      
           +--------------------------------------------------------------------------------------------------+
           |      date   chg_pay     sp500ret   year   _Nobs        _R2       _adjR2   _b_chg_pay     _b_cons |
           |--------------------------------------------------------------------------------------------------|
        1. | 05jan2006         0    .00269935   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
        2. | 06jan2006     -3.06    .96426619   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
        3. | 12jan2006         0   -.62678799   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
        4. | 19jan2006         0    .55707527   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
        5. | 26jan2006         0    .72596599   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
           |--------------------------------------------------------------------------------------------------|
        6. | 02feb2006         0   -.87951545   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
        7. | 03feb2006     -1.86    -.5284534   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
        8. | 09feb2006         0   -.14297327   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
        9. | 16feb2006         0    .74071784   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
       10. | 23feb2006         0   -.35153518   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
           |--------------------------------------------------------------------------------------------------|
       11. | 02mar2006         0    -.1613969   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
       12. | 09mar2006         0   -.48625264   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
       13. | 10mar2006       .89     .7346771   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
       14. | 16mar2006         0    .17956659   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
       15. | 23mar2006         0    -.2576676   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
           |--------------------------------------------------------------------------------------------------|
       16. | 30mar2006         0   -.19727134   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
       17. | 06apr2006         0   -.16791975   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
       18. | 07apr2006       .83   -1.0336369   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
       19. | 13apr2006         0    .07735547   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
       20. | 20apr2006         0    .11973841   2006      64   .0043398   -.01171923   -.04601674   .06172091 |
           +--------------------------------------------------------------------------------------------------+
      
      .
      Regards
      --------------------------------------------------
      Attaullah Shah, PhD.
      Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
      FinTechProfessor.com
      https://asdocx.com
      Check out my asdoc program, which sends outputs to MS Word.
      For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.



      • #4
        Hi Nick Cox and Attaullah Shah ,

        Thank you for your suggestions; they work well.

        But is there any way to create a regression results table for each rolling regression? (I mean the regression table that we normally see after the 'reg' command or with 'outreg'.)

        Thanks much!



        • #5
          Also, I have an additional follow-up.

          After
          Code:
            rangestat (reg) sp500 chg_pay, int(year -4 0)
          If I try producing residuals for each observation, and run

          Code:
          predict resid, residuals
          I see some values produced. However, I doubt that they are correct.

          For example, for year 2010, residuals should be shown for all observations in 2006-2010, but the Stata dataset does not show that. It only shows residual values for the 2010 observations from the 2006-2010 regression, and I cannot figure out what those numbers are.

          Could you please let me know how I can produce correct residual values for each regression of rangestat (reg), if there is a way?

          Thanks much!



          • #6
            Working backwards:

            #5 No, that's the wrong way to do it, but there is a right way.

            For a start, the regression code in rangestat is nothing to do with regress and leaves nothing that predict can work with. If you got a result, that's a side-effect of some model fit you did earlier. You are not expected to know that.

            More fundamentally, what are you expecting here? You just ran several regressions, one for each distinct value of year. Each regression implies its own set of residuals, and those sets overlap in the observations they cover. So, even in principle, there is no way to fit the residuals from those several regressions into one variable.

            More positively, you can calculate a single residual for each year by just using the results of rangestat. The residual is just each observed outcome MINUS an expression in the estimated coefficients and the predictors. rangestat does not calculate the residuals in a new variable, on the grounds that it is an easy calculation.
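
            The calculation just described can be sketched as follows. It assumes the rangestat call from #5 has just been run and that the intercept and slope sit in b_cons and b_chg_pay (rangestat's usual naming, as far as I know); fitted and resid are names chosen here for illustration.

            ```stata
            * Fitted value from the window ending in each observation's own year
            gen double fitted = b_cons + b_chg_pay * chg_pay

            * Residual = observed outcome minus fitted value
            gen double resid = sp500ret - fitted
            ```

            Each observation gets exactly one residual this way: the one from the regression whose window ends in that observation's year.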

            #4 No, rangestat won't do that for you. You need to write your own loop, something like

            Code:
            su year, meanonly 
            
            local min = r(min) + 4 
            local max = r(max) 
            
            forval y = `min'/`max'  { 
                  regress sp500 chg_pay if inrange(year, `y' - 4, `y') 
            }
            But, but, but

            1. How you are going to process those results is up to you.
            2. Each regression's inferential results pay no attention to serial dependence and other problems likely in such data.
            3. The separate regressions are surely not independent as they are by definition for overlapping subsets of the data.

            I see regressions like this as at best descriptive, a kind of smoothing operation. Most of what rangestat doesn't provide is not especially useful.
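
            If you do want the tables side by side, one possibility is to store each fit and compile the results afterwards. This is only a sketch: the names w2010, w2011, ... are invented here, the full 2006-2020 data are assumed to be in memory, and with eleven windows the combined table gets wide, so you may prefer to list a few names at a time.

            ```stata
            su year, meanonly
            local min = r(min) + 4
            local max = r(max)

            estimates clear
            forval y = `min'/`max' {
                  * same window as in the loop above, results kept under the name w`y'
                  quietly regress sp500ret chg_pay if inrange(year, `y' - 4, `y')
                  estimates store w`y'
            }

            * coefficients, standard errors, N and R-squared for every stored window
            estimates table w*, b(%9.4f) se(%9.4f) stats(N r2)
            ```

            The caveats above about dependence within and between windows apply just as much to a compiled table.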


            @Attaullah Shan can speak for asreg. Its goals and philosophy are not identical to those of rangestat, but my guess is that in the respects just mentioned the reaction is similar.



            • #7
              Hi Nick,

              Thank you so much for the detailed explanation. I will dig into your suggestions.

              Thanks again,
              Jinny



              • #8
                The flag in #6 should be Attaullah Shah. Sorry for spelling his name incorrectly.
