  • Why is my -rolling- code not working? I changed one thing in the example programme and it started crashing...

    Good afternoon,

    I am trying to make -rolling- work on the basis of a programme provided in the -rolling- manual. I changed one line, and it no longer works, and I do not see why. This is my code:

    Code:
    use ibm, clear
    
    
    tsset t
    cap prog drop myforecast 
    program myforecast, rclass
    syntax [if]
    regress ibm L.ibm L.spx `if'
    
    // Find last time period of estimation sample and
    // make forecast for period just after that
    summ t if e(sample)
    local last = r(max)
    predict fcast in `=`last'+1'
    return scalar forecast = fcast[`last'+1]
    // Next period's actual return
    // Will return missing value for final period
    return scalar actual = ibm[`last'+1]
    end
    
     rolling actual=r(actual) forecast=r(forecast), recursive window(20): myforecast
    and the line that is apparently causing the problem is the -predict- line (shown in red).

    This is the code provided in the manual, and it works:

    Code:
    use ibm, clear
    
    
    tsset t
    cap prog drop myforecast 
    program myforecast, rclass
    syntax [if]
    regress ibm L.ibm L.spx `if'
    
    // Find last time period of estimation sample and
    // make forecast for period just after that
    summ t if e(sample)
    local last = r(max)
    local fcast = _b[_cons] + _b[L.ibm]*ibm[`last'] + ///
    _b[L.spx]*spx[`last']
    return scalar forecast = `fcast'
    // Next period's actual return
    // Will return missing value for final period
    return scalar actual = ibm[`last'+1]
    end
    
     rolling actual=r(actual) forecast=r(forecast), recursive window(20): myforecast
    and it works.

  • #2
    "Does not work" is an enigmatic problem report. See the advice against such reports in the Statalist FAQ, section 12.1: https://www.statalist.org/forums/help#stata

    Here is a guess.


    predict requires a new variable. So predictably, or at least presumably, the second time your program runs, the command will fail because it tries to change an existing variable.

    Otherwise put, it is the job of the program you write to produce scalar output and the job of rolling to assemble such output into new variables. You're subverting that carefully set up division of labour: myforecast shouldn't try to create new variables, as it is to be run repeatedly.
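
    A minimal sketch of that failure, using the auto dataset shipped with Stata rather than the ibm data from #1 (an assumption made only to keep the example self-contained). The temporary variable at the end is one possible way for a repeatedly called program to hold a prediction without leaving a variable behind:

    Code:
    sysuse auto, clear
    regress price mpg
    
    // First call creates the new variable yhat
    predict double yhat
    
    // Second call fails because yhat already exists
    capture noisily predict double yhat
    display _rc     // 110, the "already defined" return code
    
    // A temporary variable gets a fresh name each time and
    // is dropped automatically when the program or do-file ends
    tempvar yhat2
    predict double `yhat2'
    summarize `yhat2', meanonly
    display r(mean)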



    • #3
      A lingering variable is not the problem. First, I had another programme that I wrote myself, without reference to the manual, in which I was dropping the variable, and it did not work either. Second, when I add a line that drops the prediction before the end of the programme, it still does not work.

      Now, how the programme does not work is hard to explain, because the programme actually works. Typing from the command line

      Code:
      myforecast if t<200
      results in the programme correctly executing and returning what it is supposed to return:

      Code:
      . return list
      
      scalars:
                   r(actual) =  -3.730859518051147
                 r(forecast) =  .0665557608008385
      which happen to be the correct prediction for t=200 and the correct value of ibm at t=200.

      How -rolling- is messing things up is harder to explain. It crashes with the error message

      Code:
      .  rolling actual=r(actual) forecast=r(forecast), recursive window(20): myforecast
      (running myforecast on estimation sample)
      observation numbers out of range
      an error occurred when rolling executed myforecast
      r(198);
      
      end of do-file
      
      r(198);
      I have been staring at the -set trace on- output all morning, and it seems to me that when I run -rolling-, somehow the `if' condition, or alternatively the "if e(sample)" condition, is ignored.

      Because this is what I see after -set trace on-

      Code:
          - syntax [if]
            - regress ibm L.ibm L.spx `if'
            = regress ibm L.ibm L.spx 
            - summ t if e(sample)
            - local last = r(max)
            - predict fcast in `=`last'+1'
            = predict fcast in 495
      observation numbers out of range
      The total number of observations is 494, so I cannot tell whether this error message is generated on the last run of -rolling-, or whether it is a generic problem in which the `if' condition somehow gets lost.



      • #4
        I initially thought that -rolling- might be crashing on the last run, because then the observation is really out of range. But running

        Code:
        rolling actual=r(actual) forecast=r(forecast), recursive window(20) end(300): myforecast
        I still see

        Code:
         - predict fcast in `=`last'+1'
            = predict fcast in 495
              -------------------------------------------------------------------------- begin predict ---
              - version 8.2, missing
              - if "`e(cmd)'" == "rocreg" & "`e(predict)'" == "" {
              = if "regress" == "rocreg" & "regres_p" == "" {
                di as err "predict not allowed after nonparametric ROC"
                exit 198
                }
              - if "`e(mi)'"!="" & "`e(b)'"!="matrix" {
              = if ""!="" & "matrix"!="matrix" {
                error 321
                }
              - if _caller()<=5 | "`e(predict)'"=="" {
              = if _caller()<=5 | "regres_p"=="" {
                _predict `0'
                }
              - else {
              - local v : display string(_caller())
              - version `v', missing
              = version 17, missing
              - `e(predict)' `0'
              = regres_p fcast in 495
        observation numbers out of range
                }
        So then I wondered where the 495 comes from, since I explicitly told -rolling- to stop at 300...



        • #5
          It's not a surprise to me that myforecast works by itself. The issue I raised is how it works when called repeatedly by rolling, but what you now report allows a better story. You are trying to predict beyond the dataset currently in memory. I lack the inclination to try to follow exactly what rolling is doing, but evidently it doesn't allow what you tried.



          • #6
            For one thing, it is fairly dumb behaviour of -rolling- to abort the whole mission just because it cannot perform the mission on the last sample observation.

            For another, this does not seem to be the problem, because when I set -end(somewhere midsample)- the error still occurs.

            Looking through 600 lines of code is no good use of anybody's time except for the author.

            My question was more:

            Does anybody see anything fishy when comparing the code with the line I changed (the one in red) against the original code from StataCorp, both given in #1?



            • #7
              FWIW, the code in #4 behaves as expected once the observation range is extended with tsappend.


              Code:
              clear all 
              use "https://www.stata-press.com/data/r17/ibm"
              
              tsset t 
              cap prog drop myforecast 
              program myforecast, rclass
              syntax [if]
              regress ibm L.ibm L.spx `if'
              // Find last time period of estimation sample and
              // make forecast for period just after that
              summ t if e(sample)
              local last = r(max)
              tsappend, add(1)
              predict fcast in `=`last'+1'
              return scalar forecast = fcast[`last'+1]
              // Next period's actual return
              // Will return missing value for final period
              return scalar actual = ibm[`last'+1]
              end
              
              rolling actual=r(actual) forecast=r(forecast), recursive window(20) end(300): myforecast
              
              list in `=_N', noobs
              
                +----------------------------------+
                | start   end    actual   forecast |
                |----------------------------------|
                |     1   300   2.29142   .1570406 |
                +----------------------------------+



              • #8
                Also, I know how to do this with an explicit loop, and I have already done the job with an explicit loop.

                What I am wondering in this exercise is whether -rolling- is good for anything, and in particular whether it does the job faster than an explicit loop.

                I did these things for Kolev and Karapandza (2017) with explicit loops, and I had to run code for weeks, because I was also bootstrapping the whole thing. As I think I might want to bootstrap such objects again, I am wondering which option does the job fastest.

                Kolev, Gueorgui I., and Rasa Karapandza. "Out-of-sample equity premium predictability and sample split–invariant inference." Journal of Banking & Finance 84 (2017): 188-201.



                • #9
                  This is brilliant, Justin Niakamal! In a lifetime it would not have occurred to me that this is the problem. For one, as I already argued, the problem is dumb: the author of -rolling- should have thought about how to execute the command when it can be executed and just return missing when it cannot, without nuking the whole endeavour just because it cannot be executed on the last observation. For another, the problem should have been resolved by adding the -end(time period)- option...

                  In any case we work with what we have...

                  Justin, do you approve of the following modification of your code? The only changed line, in red, is the conditional -tsappend- line. (I am after speed here, and trying to make this as fast as possible.)

                  Code:
                  clear all 
                  use "https://www.stata-press.com/data/r17/ibm"
                  
                  tsset t 
                  cap prog drop myforecast 
                  program myforecast, rclass
                  syntax [if]
                  regress ibm L.ibm L.spx `if'
                  // Find last time period of estimation sample and
                  // make forecast for period just after that
                  summ t if e(sample)
                  local last = r(max)
                  if `last' == _N tsappend, add(1)
                  predict fcast in `=`last'+1'
                  return scalar forecast = fcast[`last'+1]
                  // Next period's actual return
                  // Will return missing value for final period
                  return scalar actual = ibm[`last'+1]
                  end
                  
                  rolling actual=r(actual) forecast=r(forecast), recursive window(20) : myforecast
                  
                  list in `=_N', noobs



                  • #10
                    I tend to encounter this type of error in EViews (e.g., "Forecast out of range"). I don't see any issues with that modification.



                    • #11
                      From post #3, with the key part of the output highlighted in red
                      Code:
                      .  rolling actual=r(actual) forecast=r(forecast), recursive window(20): myforecast
                      (running myforecast on estimation sample)
                      observation numbers out of range
                      an error occurred when rolling executed myforecast
                      r(198);
                      
                      end of do-file
                      
                      r(198);
                      This occurs before the dots for the rolling iterations begin appearing. The rolling command runs an initial pass on the entire estimation sample, for reasons I did not find discussed in the documentation. And that is when the prediction for observation _N+1 occurs, not as part of the rolling process.

                      This also explains why the error was not eliminated by adding the end(300) option. You do not in any event need to use end() to tell rolling when to stop; the Remarks and examples section of the documentation tells us it will stop when the end of the window lies beyond the final observation.

                      With a slight change to the rolling command, your code from post #1 with the predict command works as you might have expected, unless you actually wanted the prediction for t=495 for which no corresponding actual value is available.
                      Code:
                      . rolling actual=r(actual) forecast=r(forecast) if t<_N, recursive window(20): myforecast
                      (running myforecast on estimation sample)
                      
                      Rolling replications (474)
                      ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
                      ..................................................    50
                      ..................................................   100
                      ..................................................   150
                      ..................................................   200
                      ..................................................   250
                      ..................................................   300
                      ..................................................   350
                      ..................................................   400
                      ..................................................   450
                      ........................
                      
                      . list in l
                      
                           +-----------------------------------+
                           | start   end     actual   forecast |
                           |-----------------------------------|
                      474. |     1   493   .1232869   .0466082 |
                           +-----------------------------------+
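                      An alternative that leaves the rolling command untouched would be to guard the prediction inside the program, so that the initial whole-sample pass returns a missing forecast instead of stopping with an error. What follows is only an untested sketch. It assumes, as the code in post #1 does, that t coincides with the observation number, and it assumes that rolling tolerates a missing forecast on its initial pass in the same way it tolerates the missing actual value there. The temporary variable is an addition of this sketch, so that no variable lingers between calls.
                      Code:
                      capture program drop myforecast
                      program myforecast, rclass
                          syntax [if]
                          regress ibm L.ibm L.spx `if'
                          // Last time period of the estimation sample
                          summarize t if e(sample), meanonly
                          local last = r(max)
                          if `last' < _N {
                              // Observation last+1 exists, so predict into it
                              tempvar fcast
                              quietly predict double `fcast' in `=`last'+1'
                              return scalar forecast = `fcast'[`last'+1]
                          }
                          else {
                              // No observation to predict into: return missing
                              return scalar forecast = .
                          }
                          // Missing when last+1 lies beyond the data
                          return scalar actual = ibm[`last'+1]
                      end
                      With this guard, the window ending at the final observation would yield a missing forecast and a missing actual value rather than an error.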
                      Last edited by William Lisowski; 28 Jul 2022, 11:04.



                      • #12
                        Thank you William Lisowski , your solution is the most elegant and slightly faster.

                        If only I could learn to read the manual and the messages the command issues like you do :P.



                        • #13
                          To wrap up this thread, -rolling- does not provide any speed improvements. An explicit loop such as

                          Code:
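                          qui {
                              gen actual = .
                              gen forecast = .
                          
                              forvalues i = 21/`= _N' {
                                  // Estimate on all periods before i, then forecast period i
                                  regress ibm L.ibm L.spx if t<`i'
                                  predict fcast in `i'
                                  replace forecast = fcast in `i'
                                  replace actual = ibm in `i'
                                  // predict needs a new variable on the next pass
                                  drop fcast
                              }
                          }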
                          results in faster execution than -rolling-. So -rolling- is all pain and no gain.

                          Here are the timings:

                          Code:
                          . timer list 
                             1:      3.67 /        1 =       3.6740
                             2:      4.42 /        1 =       4.4250
                             3:      3.26 /        1 =       3.2630
                             4:      2.76 /        1 =       2.7560
                          where
                          timer 1 is my attempt to speed -rolling- up in #9,
                          timer 2 is Justin's original solution to the problem in #7,
                          timer 3 is William's solution in #11,
                          and timer 4 is the explicit loop above.
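
                          For anyone wanting to reproduce a comparison of this kind, here is a sketch of how one such timing can be collected with the -timer- commands. The wrapper below is an assumption about the setup, not a copy of the do-file used for the numbers above; it times only the explicit loop, on the dataset URL used in #7, and the elapsed time will of course differ across machines.

                          Code:
                          clear all
                          use "https://www.stata-press.com/data/r17/ibm"
                          tsset t
                          
                          timer clear 4
                          timer on 4
                          quietly {
                              generate actual = .
                              generate forecast = .
                              forvalues i = 21/`=_N' {
                                  regress ibm L.ibm L.spx if t < `i'
                                  predict fcast in `i'
                                  replace forecast = fcast in `i'
                                  replace actual = ibm in `i'
                                  drop fcast
                              }
                          }
                          timer off 4
                          
                          // Reports accumulated seconds, number of runs, and the average
                          timer list 4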
