Dear members,
I'm developing a prediction model using internal-external cross-validation (IECV) based on five different geographical regions.
I have multiply imputed data and run IECV by looping over the regions: in each iteration I fit the model on the other four regions and estimate the metrics on the held-out region. The performance metrics I'm interested in are discrimination (C-statistic) and calibration (calibration slope and calibration-in-the-large). Below is the code to run IECV and get the pooled results:
Code:
    * Run IECV *
    forval x = 1(1)5 {
        mi estimate, dots saving(miestiecv, replace): logistic dep_var $covariates if (cluster!=`x')
        replace iecv_xb = xb if iecv_xb==.
        drop xb
        display `x'
    }

    ** Pooled metrics **
    * Calibration slope *
    mi estimate, dots: logistic dep_var iecv_xb

    * Calibration-in-the-large *
    mi estimate, dots: logistic dep_var iecv_xb, offset(iecv_xb)

    * Discrimination *
    mi xeq 0: roctab dep_var iecv_xb
    return list

    cap program drop eroctab
    program eroctab, eclass
        version 12.0
        /* Step 1: perform ROC analysis */
        args refvar classvar
        roctab `refvar' `classvar'
        /* Step 2: save estimate and its variance in temporary matrices */
        tempname b V
        mat `b' = r(area)
        mat `V' = r(se)^2
        local N = r(N)
        /* Step 3: make column names and row names consistent */
        mat colnames `b' = AUC
        mat colnames `V' = AUC
        mat rownames `V' = AUC
        /* Step 4: post results to e() */
        ereturn post `b' `V', obs(`N')
        ereturn local cmd "eroctab"
        ereturn local title "ROC area"
    end

    mi estimate, cmdok dots: eroctab dep_var iecv_xb
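For region-level estimates, I get calibration results but have problems with the C-statistic (the region-level results will later be pooled using random-effects meta-analysis):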
Code:
    * IECV for calibration slope *
    capture postutil clear
    tempname slope_region
    postfile `slope_region' slope slope_se val_size using slope_region.dta, replace
    forval x = 1(1)5 {
        mi estimate, dots: logistic dep_var iecv_xb if cluster==`x'
        local slope = r(table)[1,1]
        local slope_se = r(table)[2,1]
        local val_size = e(N)
        post `slope_region' (`slope') (`slope_se') (`val_size')
    }
    postclose `slope_region'

    * IECV for calibration-in-the-large *
    capture postutil clear
    tempname citl_region
    postfile `citl_region' citl citl_se val_size using citl_region.dta, replace
    forval x = 1(1)5 {
        mi estimate, dots: logistic dep_var iecv_xb if cluster==`x', offset(iecv_xb)
        local citl = r(table)[1,1]
        local citl_se = r(table)[2,1]
        local val_size = e(N)
        post `citl_region' (`citl') (`citl_se') (`val_size')
    }
    postclose `citl_region'

    * IECV for C-statistic *
    capture postutil clear
    tempname C_region
    postfile `C_region' beta st_err val_size using C_region.dta, replace
    forval x = 1(1)5 {
        mi estimate, cmdok dots: eroctab dep_var iecv_xb if cluster==`x'
        local beta = r(table)[1,1]
        local st_err = r(table)[2,1]
        local val_size = e(N)
        post `C_region' (`beta') (`st_err') (`val_size')
    }
    postclose `C_region'
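For the C-statistic, however, the loop does not restrict the estimate to each region. Instead, it uses all the data and returns the same C-statistic five times.
So my question is: what do I need to change to get region-specific estimates of the C-statistic?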
Any help on this would be much appreciated.
Thank you
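
A possible explanation, offered as a sketch and not a confirmed diagnosis: eroctab reads its inputs with "args refvar classvar", so anything after the two variable names, including the "if cluster==" qualifier, is never passed on to roctab, and every call uses the full sample. Assuming that is the cause, parsing the command line with syntax and marksample should let the region restriction through. The program name and the posting steps are unchanged from the code above; the syntax, marksample and tokenize lines and the if `touse' qualifier are the only additions, and they are assumptions about what is needed, not a tested fix on the original data.

```stata
capture program drop eroctab
program define eroctab, eclass
    version 12.0
    /* Step 1: parse two variables plus optional if/in (assumed fix:
       positional args silently ignore an if qualifier) */
    syntax varlist(min=2 max=2 numeric) [if] [in]
    marksample touse              // 1 for observations selected by if/in
    tokenize `varlist'
    local refvar   `1'            // outcome (reference) variable
    local classvar `2'            // linear predictor (classification) variable

    /* Step 2: ROC analysis restricted to the selected observations */
    roctab `refvar' `classvar' if `touse'

    /* Step 3: save estimate and its variance in temporary matrices */
    tempname b V
    matrix `b' = r(area)
    matrix `V' = r(se)^2
    local N = r(N)

    /* Step 4: make column names and row names consistent */
    matrix colnames `b' = AUC
    matrix colnames `V' = AUC
    matrix rownames `V' = AUC

    /* Step 5: post results to e() */
    ereturn post `b' `V', obs(`N')
    ereturn local cmd "eroctab"
    ereturn local title "ROC area"
end
```

With this version, the existing loop line (mi estimate, cmdok dots: eroctab dep_var iecv_xb if cluster==`x') could stay as it is; a quick check is whether e(N) now equals the size of each held-out region and no longer the full sample size.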