I am trying to carry out internal validation of a multivariable regression model, specifically to assess calibration. The procedure is meant to follow these steps:
1) Run a 10-fold cross-validation: the sample is divided into 10 parts; 9/10 are used as the derivation sample to build the model and the remaining 1/10 is used as the validation sample to test it.
2) Repeat the process 10 times, until each of the 10 parts has served as the validation sample.
3) Generate the mean of the predictions over the 10 iterations for the derivation sample and for the validation sample.
4) Using a graph, compare the proportions of predicted cases in the derivation sample with the proportions of observed cases in the validation sample, stratified by deciles of predicted probability (the first decile representing the lowest probabilities).
I thought two user-written commands, crossfold and pmcalplot, could be useful, but I don't know how to put them into practice to achieve the above.
So far I have run this command:
crossfold clogit death i.ethnicity gender i.age_groups, group(matchedid) or k(10)
This gave me 10 variables (_est_est1 to _est_est10), each containing only the values 0 and 1, so I am not sure whether I am on the right track with this command.
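In case it helps to make the four steps listed above concrete, here is a rough manual sketch of what I think they amount to, written with official commands rather than crossfold or pmcalplot. I have not verified it on my data. The names u, fold, p_k, p_val, p_dev and decile are placeholders I made up, the seed is arbitrary, and using predict's pu0 option (probability assuming the fixed effect is zero) after clogit is only my assumption about which prediction to use; with matched data I am not sure absolute probabilities are meaningful.

```stata
* Sketch only: manual 10-fold cross-validation, not the crossfold command.
* Assumes the variables from my command: death, ethnicity, gender,
* age_groups, matchedid. All other names are hypothetical placeholders.
set seed 12345

* Assign whole matched sets to folds so that no set is split across samples.
gen double u = runiform()
bysort matchedid (u): replace u = u[1]
xtile fold = u, nquantiles(10)

gen double p_val = .    // prediction from the fit that held this observation out
gen double p_dev = 0    // mean prediction over the 9 fits that included it

forvalues k = 1/10 {
    * Fit on the derivation sample (all folds except fold k).
    quietly clogit death i.ethnicity gender i.age_groups if fold != `k', group(matchedid)
    * Predict for every observation, then store by role.
    quietly predict double p_k, pu0
    quietly replace p_val = p_k if fold == `k'
    quietly replace p_dev = p_dev + p_k/9 if fold != `k'
    drop p_k
}

* Deciles of the validation predictions (decile 1 = lowest probabilities).
xtile decile = p_val, nquantiles(10)

* Observed proportion and mean predictions per decile, then plot them.
preserve
collapse (mean) observed = death pred_val = p_val pred_dev = p_dev, by(decile)
twoway (connected observed decile) (connected pred_val decile) ///
       (connected pred_dev decile), ///
       xtitle("Decile of predicted probability") ///
       ytitle("Proportion of cases") ///
       legend(order(1 "Observed" 2 "Predicted (validation)" 3 "Predicted (derivation)"))
restore
```

If this is roughly the right idea, I would still like to know whether crossfold and pmcalplot can produce the same comparison more directly.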
I would very much appreciate your help.