
  • Clark and West (2007) test for out-of-sample prediction in Stata

    Dear All
    Is there a program in Stata that can perform the Clark and West (2007) test for out-of-sample prediction?

    CW_t = (actual_t - Prediction_Bench_t)^2 - [(actual_t - Prediction_t)^2 - (Prediction_Bench_t - Prediction_t)^2]

    Calculating the CW statistic itself is not complicated, but I am not sure how to obtain the CW p-values to determine significance.

    Ref: Clark, T. E., and West, K. D. 2007. Approximately normal tests for equal predictive accuracy in nested models. Journal of Econometrics 138(1): 291-311.
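
    For what it is worth, Clark and West show that the mean of the adjusted loss differential above, scaled by a HAC standard error, is approximately standard normal under the null. So one sketch, assuming the per-period CW_t series has already been generated into a variable called cw and the data are tsset (the variable name and the lag length are illustrative, not from any canned command):

    Code:
    * One-sided test of H0: equal MSPE vs. HA: the larger model forecasts better.
    * -cw- holds the per-period adjusted loss differential defined above.
    newey cw, lag(4)
    local t = _b[_cons]/_se[_cons]
    display "CW statistic      = " %6.3f `t'
    display "one-sided p-value = " %6.4f 1 - normal(`t')

    Reject equal predictive accuracy when the statistic exceeds the one-sided normal critical value (1.282 at the 10% level, 1.645 at 5%).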

  • #2
    Hi all
    I thought I would bring this up again after a day. Just to add: the data are time series.
    Thanks



    • #3
      This may not be helpful, but there was a Stata Blog post on "Tests of forecast accuracy and forecast encompassing":

      https://blog.stata.com/2016/06/01/te...-encompassing/



      • #4
        Thanks. I have read the post in the link provided by Scott in #3.

        I tried to adapt the first program in that post to my data as follows:

        Code:
        program fcst_risk, rclass
                syntax [if]
         
                qui tsset
                local timevar `r(fqdate)'
                local first `r(tmin)'
         
                regress GDP L.q_HML L.q_SMB L.q_MOM L.q_RMW L.q_CMA  `if'
                summarize `timevar' if e(sample)
         
                local last = `r(max)'-`first'+1
                local fcast = _b[_cons] + _b[L.q_HML]* q_HML[`last'] ///
                        + _b[L.q_SMB]*q_SMB[`last'] ///
                        + _b[L.q_MOM]*q_MOM[`last'] ///
                        + _b[L.q_RMW]*q_RMW[`last'] ///
                        + _b[L.q_CMA]*q_CMA[`last'] ///
                return scalar fcast = `fcast'
                return scalar actual = dtfp[`last'+1]
                return scalar sqerror = (dtfp[`last'+1]-  ///
                        `fcast')^2
        end
        
        rolling ar2_sqerr=r(sqerror) ar2_fcast=r(fcast) actual=r(actual), window(60) recursive saving(ar2, replace): fcst_risk
        However, I got an error message:
        Code:
        (running fcst_risk on estimation sample)
        invalid 'return'
        an error occurred when rolling executed fcst_risk
        r(198);
        Does this mean that I have an error in the program code itself?
        The only changes I made were renaming the program, defining timevar to match my date variable (local timevar `r(fqdate)'), and using the variables from my dataset in the regression and forecast.

        Can anyone spot what might have been wrong in my execution of the code?

        Thanks




        • #5
          You have a line-join indicator (///) left at the end of the -local fcast = ...- expression, which splices the following -return- line into it:
          Code:
          + _b[L.q_CMA]*q_CMA[`last'] ///
           return scalar fcast = `fcast'



          • #6
            Thanks a lot.
            Do you also know what values pi and k2 should take when selecting the critical values? The post uses 0.2 for pi and 2 for k2. On what basis should we choose these?
            Thanks



            • #7
              k2 is the number of additional regressors in the larger model.

              pi = P/R, where P is the number of out-of-sample observations and R is the size of the estimation window.

              There is a working paper version of "McCracken, M. W. 2007. Asymptotics for out-of-sample tests of Granger causality. Journal of Econometrics 140: 719–752." with the critical values, located at:

              https://citeseerx.ist.psu.edu/viewdo...=rep1&type=pdf



              • #8
                Thanks.
                I have done the test now. I have P = 73 and a (rolling) estimation window of 100, which gives pi = 0.73.
                Unfortunately, the tables show critical values only for pi of 0.6 and 0.8, with nothing in between. I am not sure against which critical values (those for pi = 0.6 or pi = 0.8) I should compare my results.

                I appreciate your help



                • #9
                  Mike Kraft I would linearly interpolate between the values. Suppose at pi = .6 the critical value is 1.515 (the left value), at pi = .8 it is 1.462 (the right), and your pi = .73. Then:


                  (1 - (pi - pi_left)/(pi_right - pi_left))*criticalvalue_left + ((pi - pi_left)/(pi_right - pi_left))*criticalvalue_right

                  = (1 - (.73-.6)/(.8-.6))*1.515 + ((.73-.6)/(.8-.6))*1.462
                  = .35*1.515 + .65*1.462
                  = 1.40855



                  • #10
                    Thanks a lot.

                    1- Does it make sense to simply compare against the 0.6 critical values, on the grounds that my pi did not reach 0.8, and to make this clear in the paper? Or is linear interpolation what researchers would normally do in this situation?

                    2- In the encompassing test, I found that the larger model has an MSE very slightly larger than the smaller model's. However, the ENC-NEW test shows that the additional variable in the larger model contains additional information (significant only at the 10% level). Is this possible?

                    Regards



                    • #11
                      1. I believe interpolation would be the more common approach. From footnote 9 of the McCracken (2007) paper referenced above:
                      "The tables for the OOS-F and OOS-t are inadequate for every forecast origin. We use linear interpolation when the ratio of P/R does not coincide with those found in the tables."
                      2. I don't know how the relative power of the tests compares. It sounds like the models are very similar.



                      • #12
                        Hi
                        1- The Stata Blog post in #3 above shows how to calculate the OOS-F, OOS-t, and ENC-NEW statistics and then compares them with the relevant critical values, but it does not show how a p-value can be calculated.
                        Can Scott and other participants help with a few lines of code showing how to calculate p-values for these statistics?

                        2- Linear interpolation sounds helpful. I looked at the McCracken (2007) paper for the footnote Scott mentioned in #11 and found that it appears in the unpublished version of the paper but not in the published version. Is it still reasonable to cite the unpublished version for this purpose?
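
                        For what it is worth, my understanding is that OOS-F, OOS-t, and ENC-NEW have nonstandard limiting distributions that depend on pi and k2, which is why the blog post compares them with tabulated critical values rather than reporting p-values. Only the Clark-West adjusted statistic is approximately standard normal, so its one-sided p-value is direct (the statistic value below is purely illustrative):

                        Code:
                        * A p-value is straightforward only for the approximately
                        * normal CW statistic; OOS-F and ENC-NEW are still judged
                        * against the tabulated critical values.
                        local cw_t = 1.75                    // illustrative CW t-statistic
                        display "one-sided p-value = " %6.4f 1 - normal(`cw_t')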



                        • #13
                          Originally posted by Scott Merryman View Post
                          Mike Kraft I would linearly interpolate between the values. Suppose at pi = .6 the critical value is 1.515 (the left value), at pi = .8 it is 1.462 (the right), and your pi = .73. Then:


                          (1 - (pi - pi_left)/(pi_right - pi_left))*criticalvalue_left + ((pi - pi_left)/(pi_right - pi_left))*criticalvalue_right

                          = (1 - (.73-.6)/(.8-.6))*1.515 + ((.73-.6)/(.8-.6))*1.462
                          = .35*1.515 + .65*1.462
                          = 1.40855
                          A quick correction: I think your calculation should give 1.48055, not 1.40855.
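
                          The corrected arithmetic can be checked directly in Stata:

                          Code:
                          * linear interpolation of the critical value at pi = .73
                          local w = (.73 - .6)/(.8 - .6)              // weight on the right value, = .65
                          display %7.5f (1 - `w')*1.515 + `w'*1.462   // shows 1.48055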



                          • #14
                            Yes, thanks.



                            • #15
                              Originally posted by Mike Kraft View Post
                              Hi
                              1- The Stata Blog post in #3 above shows how to calculate the OOS-F, OOS-t, and ENC-NEW statistics and then compares them with the relevant critical values, but it does not show how a p-value can be calculated.
                              Can Scott and other participants help with a few lines of code showing how to calculate p-values for these statistics?

                              2- Linear interpolation sounds helpful. I looked at the McCracken (2007) paper for the footnote Scott mentioned in #11 and found that it appears in the unpublished version of the paper but not in the published version. Is it still reasonable to cite the unpublished version for this purpose?
                              I have posted two questions here, followed by another post, and I hope to get some answers to these as well.
                              Thanks all

