  • Out-of-sample R2 forecasting

    Hi everyone.

    I'm currently writing my thesis and was advised to compare in-sample and out-of-sample prediction, where the out-of-sample R2 is defined as:
    R2 OOS.PNG
    My data set has 1764 monthly observations.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    * dataex return t, count(30)
    clear
    input double return float t
     .01242389  1
     .02019115  2
     .02964644  3
     .03405454  4
     .02848361  5
    -.01542547  6
    -.00863808  7
     .01915629  8
     .01136751  9
    -.05886838 10
     .03742423 11
      .0257288 12
     .02951672 13
     .00101633 14
     .04072984 15
     .03142682 16
    -.00213872 17
    -.00795212 18
    -.00034236 19
    -.00761387 20
    -.01335902 21
     .01832136 22
    -.00212152 23
     .03346823 24
     .01082923 25
     .01575765 26
    -.01307993 27
    -.00878622 28
     .01496919 29
    -.01535528 30
    end
    The code I tried to run (splitting the data set 50/50 into a training period and a forecasting period):
    Code:
    gen t=_n
    tsset t
    gen segment = 1
    replace segment = 0 if t<883
    reg return L.return if segment ==0
    predict forecast if segment == 1
    gen returns = return if segment == 1 
    reg return forecast
    However, I am very new to Stata and would appreciate any feedback or suggestions for improvement.

  • #2
    You could rewrite this as:

    gen t=_n
    tsset t
    gen segment = (t >= 883 & t < .)
    reg return L.return if segment ==0
    predict forecast if segment == 1
    reg return forecast if segment == 1

    However, I'm not sure your regression really matches the formula in your picture (note that we discourage pictures; I can't keep the picture open while I type here). It looks to me like you can apply the formula directly to forecast to calculate the R-squared. Remember, if you run

    su return

    then in r(mean) you'll have the mean.
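
    To make the r(mean) point concrete, here is a minimal sketch that uses only the first four observations from the -dataex- listing above. The scalar name mean_ret is just an illustrative choice; a local macro would work equally well inside a do-file.

```stata
* Minimal sketch: read the stored results left behind by -summarize-
clear
input double return float t
 .01242389 1
 .02019115 2
 .02964644 3
 .03405454 4
end

summarize return
return list                  // lists r(N), r(mean), r(sum), and the other stored results
scalar mean_ret = r(mean)    // copy the mean before another r-class command overwrites r()
display mean_ret
```

    Saving r(mean) right after -summarize- matters because the next r-class command replaces the stored results.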

    Looking at your equation, you might try something like:


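    * mean return over the training period (the benchmark forecast)
    su return if segment==0
    local rbar = r(mean)
    reg return L.return if segment==0
    predict forecast if segment==1
    * out-of-sample forecast errors: return minus forecast
    predict res if segment==1, res
    g r_rhat_sq = res^2 if segment==1
    * squared deviation of the realized return from the training mean
    g r_rbar_sq = (return - `rbar')^2 if segment==1
    su r_rhat_sq
    local sumtop = r(sum)
    su r_rbar_sq
    local sumbot = r(sum)
    local rsq = 1 - (`sumtop'/`sumbot')
    di "`rsq'"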
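
    For reference, since the picture is not reproduced here, this is the common definition of out-of-sample R2 that the variable names r_rhat_sq and r_rbar_sq point to. It is an assumption that this matches the formula in the attached image. Here r_t is the realized return, \hat{r}_t is the forecast from the training-period regression, \bar{r} is the training-period mean return, and the sums run over the forecasting period (segment == 1).

```latex
R^2_{OOS} = 1 - \frac{\sum_{t \in \text{OOS}} \left( r_t - \hat{r}_t \right)^2}
                     {\sum_{t \in \text{OOS}} \left( r_t - \bar{r} \right)^2}
```

    Under this definition a positive value means the regression forecast beats the historical-mean forecast over the forecasting period, and a negative value means it does worse.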
