  • Repeated k-fold cross validation

    Hello everyone,

    I want to run a repeated (1000 times) 20-fold cross-validation on several models to find out which one produces the lowest MSE. To do so, I want to create a variable that stores the MSE from each repetition. Assuming this is the right way to do repeated cross-validation (if you have better ideas, let me know!), I cannot manage this simple task, since I am relatively new to loops in Stata.


    This is the code for the first model. I know it is clearly wrong, so I apologize in advance.


    gen meanvec1=0

    forvalues rep=1(1)1000 {
        crossfold reg Consumption Unemployment, stub(k) k(20)
        mat U = J(rowsof(r(k)),1,1)
        mat sum = U'*r(k)
        svmat sum

        * I do not know how to impose a "if" condition which is met when the two variables reps and trend have the same numeric value!
        replace meanvec1= sum1/rowsof(r(k)) if `reps'==trend

        drop sum1
    }

    I am sure there is a much more elegant way of coding this, quite apart from the fact that my code is not correct. Could anyone help me?
    Last edited by Marco Mello; 29 Jul 2018, 11:50.

  • #2
    Hi Marco
    I can point you to two other commands that may help with this: -loocv- and -cv_regress-. Both are user-written programs that perform leave-one-out cross-validation and produce the same reports.
    The only difference is that -loocv- is more computationally intensive but can be applied to various types of models, whereas -cv_regress- is faster but can be used only after -regress-.
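
    For the repeated 20-fold design described in the question, here is a minimal sketch that uses only built-in commands rather than crossfold. It assumes the dataset containing Consumption and Unemployment is already in memory. The seed, the marker variable touse, and the posted variable names rep and mse are illustrative choices, not part of the original code. Each repetition reshuffles the fold assignment, pools the out-of-fold squared errors, and posts one MSE per repetition to a temporary results file.

    ```stata
    * Sketch only: repeated k-fold cross-validation with built-in commands.
    * Assumes Consumption and Unemployment are in memory; other names are illustrative.
    set seed 12345
    local reps 1000
    local k 20

    * Mark the observations usable in the regression
    generate byte touse = !missing(Consumption, Unemployment)

    * One row per repetition is posted to a temporary results file
    tempname results
    tempfile cvres
    postfile `results' int rep double mse using "`cvres'"

    forvalues rep = 1/`reps' {
        tempvar u fold sqerr
        quietly generate double `u' = runiform()
        * Random order within the usable sample, then folds 1..k of near-equal size
        quietly bysort touse (`u'): generate int `fold' = mod(_n, `k') + 1 if touse
        quietly generate double `sqerr' = .

        forvalues f = 1/`k' {
            * Fit on all folds except f, predict on fold f
            quietly regress Consumption Unemployment if `fold' != `f' & touse
            tempvar yhat
            quietly predict double `yhat' if `fold' == `f' & touse
            quietly replace `sqerr' = (Consumption - `yhat')^2 if `fold' == `f' & touse
            drop `yhat'
        }

        * Out-of-sample MSE for this repetition
        quietly summarize `sqerr', meanonly
        post `results' (`rep') (r(mean))
        drop `u' `fold' `sqerr'
    }

    postclose `results'

    * Load the 1000 MSE values (this replaces the data in memory)
    use "`cvres'", clear
    summarize mse
    ```

    Swapping the regress line for another specification gives the MSE distribution for a different model. Resetting the same seed before each model's loop makes the fold assignments identical across models, provided the usable sample is the same. The temporary file disappears when the do-file ends, so save the results first if they are needed later.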
    Best
