Hi,
I am trying to run a k-fold cross validation of a linear mixed effects model, using k-1 folds to predict the outcome in the fold left out of estimation.
When I use the predict command with the fitted option (i.e. when including contributions from random effects), Stata returns missing values for all observations in the fold left out.
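A minimal sketch of the setup, in case it helps to reproduce the issue. The pig dataset, the variable names, the seed, and the choice of five folds are stand-ins for illustration, not my actual data or model:

```stata
* Hypothetical reproduction: dataset, variables, and k = 5 are stand-ins
webuse pig, clear
set seed 12345

* assign each observation to one of 5 folds at random
generate double u = runiform()
sort u
generate byte fold = mod(_n, 5) + 1
sort id week

* fit the model on folds 2-5, holding out fold 1
mixed weight week if fold != 1 || id:

* prediction from the fixed portion only
predict double yhat_xb, xb

* prediction including contributions from the random effects
predict double yhat_fit, fitted

* count missing predictions in the held-out fold for each type
count if fold == 1 & missing(yhat_xb)
count if fold == 1 & missing(yhat_fit)
```

Comparing the two counts shows whether the missing values arise only when the random-effects contributions are included.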
A technical note in Stata's manual, in the entry "mixed postestimation" (Postestimation tools for mixed), states:
Out-of-sample predictions are permitted after mixed, but if these predictions involve BLUPs of random effects, the integrity of the estimation data must be preserved. If the estimation data have changed since the mixed model was fit, predict will be unable to obtain predicted random effects that are appropriate for the fitted model and will give an error. Thus to obtain out-of-sample predictions that contain random-effects terms, be sure that the data for these predictions are in observations that augment the estimation data.
This suggests that out-of-sample predictions involving BLUPs are possible. However, I am not sure what it means to preserve the integrity of the estimation data, or for the data for these predictions to be "in observations that augment the estimation data". Can someone please elaborate?
Many thanks.