
  • Clark and West (2007) test for out-of-sample prediction in Stata

    Dear All
    Is there a program in Stata that can perform the Clark and West (2007) test for out-of-sample prediction?

    CW_t = (actual_t - Prediction_Bench_t)^2 - [(actual_t - Prediction_t)^2 - (Prediction_Bench_t - Prediction_t)^2]

    Calculating the CW statistic itself is not complicated, but I am not sure how to obtain the CW p-values to determine significance.

    Ref: Clark, T. E., and West, K. D. 2007. Approximately normal tests for equal predictive accuracy in nested models. Journal of Econometrics 138(1): 291-311.
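
    For what it is worth, Clark and West show that the mean of the adjusted loss differential above, scaled by a HAC standard error, is approximately standard normal under the null. So one sketch, assuming the per-period CW_t series has already been generated into a variable called cw and the data are tsset (the variable name and the lag length are illustrative, not from any canned command):

    Code:
    * One-sided test of H0: equal MSPE vs. HA: the larger model forecasts better.
    * -cw- holds the per-period adjusted loss differential defined above.
    newey cw, lag(4)
    local t = _b[_cons]/_se[_cons]
    display "CW statistic      = " %6.3f `t'
    display "one-sided p-value = " %6.4f 1 - normal(`t')

    Reject equal predictive accuracy when the statistic exceeds the one-sided normal critical value (1.282 at the 10% level, 1.645 at 5%).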

  • #2
    Hi all
    I thought I would bring this up again after a day. Just to add: the data are time series.
    Thanks



    • #3
      This may not be helpful, but there was a Stata Blog post on "Tests of forecast accuracy and forecast encompassing":

      https://blog.stata.com/2016/06/01/te...-encompassing/



      • #4
        Thanks. I have read the post in the link provided by Scott in #3.

        I tried to adapt the first program in that post to my data as follows:

        Code:
        program fcst_risk, rclass
                syntax [if]
         
                qui tsset
                local timevar `r(fqdate)'
                local first `r(tmin)'
         
                regress GDP L.q_HML L.q_SMB L.q_MOM L.q_RMW L.q_CMA  `if'
                summarize `timevar' if e(sample)
         
                local last = `r(max)'-`first'+1
                local fcast = _b[_cons] + _b[L.q_HML]* q_HML[`last'] ///
                        + _b[L.q_SMB]*q_SMB[`last'] ///
                        + _b[L.q_MOM]*q_MOM[`last'] ///
                        + _b[L.q_RMW]*q_RMW[`last'] ///
                        + _b[L.q_CMA]*q_CMA[`last'] ///
                return scalar fcast = `fcast'
                return scalar actual = dtfp[`last'+1]
                return scalar sqerror = (dtfp[`last'+1]-  ///
                        `fcast')^2
        end
        
        rolling ar2_sqerr=r(sqerror) ar2_fcast=r(fcast) actual=r(actual), window(60) recursive saving(ar2, replace): fcst_risk
        However, I got an error message:
        Code:
        (running fcst_risk on estimation sample)
        invalid 'return'
        an error occurred when rolling executed fcst_risk
        r(198);
        Does this mean that I have an error in the program code itself?
        The only changes I made were renaming the program, defining timevar to match my date variable (local timevar `r(fqdate)'), and using the variables from my dataset in the regression and forecast.

        Can anyone spot what might have been wrong in my execution of the code?

        Thanks




        • #5
          You have a line-join indicator (///) left at the end of the -local fcast = ...- expression, which splices the following -return- line into it:
          Code:
          + _b[L.q_CMA]*q_CMA[`last'] ///
           return scalar fcast = `fcast'



          • #6
            Thanks a lot.
            Do you also know what values pi and k2 should take when selecting the critical values? The post uses 0.2 for pi and 2 for k2. On what basis should we choose these?
            Thanks



            • #7
              k2 is the number of additional regressors in the larger model.

              pi = P/R, where P is the number of out-of-sample observations and R is the size of the estimation window.

              There is a working paper version of "McCracken, M. W. 2007. Asymptotics for out-of-sample tests of Granger causality. Journal of Econometrics 140: 719–752." with the critical values, located at:

              https://citeseerx.ist.psu.edu/viewdo...=rep1&type=pdf



              • #8
                Thanks.
                I have done the test now. I have P = 73 and a (rolling) estimation window of 100, which gives pi = 0.73.
                Unfortunately, the tables show critical values only for pi of 0.6 and 0.8, with nothing in between. I am not sure against which critical values (those for pi = 0.6 or pi = 0.8) I should compare my results.

                I appreciate your help



                • #9
                  Mike Kraft I would linearly interpolate between the values. Suppose at pi = .6 the critical value is 1.515 (the left value), at pi = .8 it is 1.462 (the right), and your pi = .73. Then:


                  (1 - (pi - pi_left)/(pi_right - pi_left))*criticalvalue_left + ((pi - pi_left)/(pi_right - pi_left))*criticalvalue_right

                  = (1 - (.73-.6)/(.8-.6))*1.515 + ((.73-.6)/(.8-.6))*1.462
                  = .35*1.515 + .65*1.462
                  = 1.40855



                  • #10
                    Thanks a lot.

                    1- Does it make sense to simply compare against the 0.6 critical values, on the grounds that my pi did not reach 0.8, and to make this clear in the paper? Or is linear interpolation what researchers would normally do in this situation?

                    2- In the encompassing test, I found that the larger model has an MSE very slightly larger than the smaller model's. However, the ENC-NEW test shows that the additional variable in the larger model contains additional information (significant only at the 10% level). Is this possible?

                    Regards



                    • #11
                      1. I believe interpolation would be the more common approach. From footnote 9 of the McCracken (2007) paper referenced above:
                      "The tables for the OOS-F and OOS-t are inadequate for every forecast origin. We use linear interpolation when the ratio of P/R does not coincide with those found in the tables."
                      2. I don't know how the relative power of the tests compares. It sounds like the models are very similar.



                      • #12
                        Hi
                        1- The Stata Blog post in #3 above shows how to calculate the OOS-F, OOS-t, and ENC-NEW statistics and then compares them with the relevant critical values, but it does not show how a p-value can be calculated.
                        Can Scott and other participants help with a few lines of code showing how to calculate p-values for these statistics?

                        2- Linear interpolation sounds helpful. I looked at the McCracken (2007) paper for the footnote Scott mentioned in #11 and found that it appears in the unpublished version of the paper but not in the published version. Is it still reasonable to cite the unpublished version for this purpose?
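
                        For what it is worth, my understanding is that OOS-F, OOS-t, and ENC-NEW have nonstandard limiting distributions that depend on pi and k2, which is why the blog post compares them with tabulated critical values rather than reporting p-values. Only the Clark-West adjusted statistic is approximately standard normal, so its one-sided p-value is direct (the statistic value below is purely illustrative):

                        Code:
                        * A p-value is straightforward only for the approximately
                        * normal CW statistic; OOS-F and ENC-NEW are still judged
                        * against the tabulated critical values.
                        local cw_t = 1.75                    // illustrative CW t-statistic
                        display "one-sided p-value = " %6.4f 1 - normal(`cw_t')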



                        • #13
                          Originally posted by Scott Merryman View Post
                          Mike Kraft I would linearly interpolate between the values. Suppose at pi = .6 the critical value is 1.515 (the left value), at pi = .8 it is 1.462 (the right), and your pi = .73. Then:


                          (1 - (pi - pi_left)/(pi_right - pi_left))*criticalvalue_left + ((pi - pi_left)/(pi_right - pi_left))*criticalvalue_right

                          = (1 - (.73-.6)/(.8-.6))*1.515 + ((.73-.6)/(.8-.6))*1.462
                          = .35*1.515 + .65*1.462
                          = 1.40855
                          A quick correction: I think your calculation should give 1.48055, not 1.40855.
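
                          The corrected arithmetic can be checked directly in Stata:

                          Code:
                          * linear interpolation of the critical value at pi = .73
                          local w = (.73 - .6)/(.8 - .6)              // weight on the right value, = .65
                          display %7.5f (1 - `w')*1.515 + `w'*1.462   // shows 1.48055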



                          • #14
                            Yes, thanks.



                            • #15
                              Originally posted by Mike Kraft View Post
                              Hi
                              1- The Stata Blog post in #3 above shows how to calculate the OOS-F, OOS-t, and ENC-NEW statistics and then compares them with the relevant critical values, but it does not show how a p-value can be calculated.
                              Can Scott and other participants help with a few lines of code showing how to calculate p-values for these statistics?

                              2- Linear interpolation sounds helpful. I looked at the McCracken (2007) paper for the footnote Scott mentioned in #11 and found that it appears in the unpublished version of the paper but not in the published version. Is it still reasonable to cite the unpublished version for this purpose?
                              I have posted two questions here, followed by another post, and I hope to get some answers to these as well.
                              Thanks all

