I am a PhD student analysing UK hospital data in Stata, using a large dataset (approx. 9 million observations). I am running a predictive model with nine variables (V1-V9). Four of them (V1-V4) have missing values, so cases with missing data are being dropped from the model. I therefore want to impute the missing values and run the model on the imputed data. I also want to account for hospital provider in the model, and I have been advised to run the imputation clustered by hospital provider, but I am struggling to find the syntax to do this. Below is the syntax I am using so far, without clustering or controlling for hospital provider:
mi set mlong
mi register imputed V1 V2 V3 V4
mi impute mvn V1 V2 V3 V4 = V5 V6 V7 V8 V9, add(5)
mi estimate, or: logit V1 V2 V3 V4 V5 V6 V7 V8 V9
Please can you advise how I can incorporate the hospital provider variable so that the imputation accounts for clustering by provider?
Thanks in advance
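
One possible approach, offered as a sketch under assumptions rather than a confirmed answer: include the provider as a set of indicator variables in the imputation model, and request cluster-robust standard errors in the analysis model. The variable name provider is hypothetical (a numeric hospital identifier that the post does not name), the seed value is arbitrary, and the model is kept exactly as posted, with V1 as the outcome. Note that mi impute mvn assumes the imputed variables are roughly multivariate normal, so this fits best if V1-V4 are continuous, and with many providers and 9 million observations the indicator set may make the imputation slow.

```stata
* Sketch only: "provider" is a hypothetical numeric hospital identifier.
mi set mlong
mi register imputed V1 V2 V3 V4
mi register regular V5 V6 V7 V8 V9 provider

* Provider indicators enter the imputation model as fixed effects.
mi impute mvn V1 V2 V3 V4 = V5 V6 V7 V8 V9 i.provider, add(5) rseed(12345)

* Analysis model with standard errors clustered on provider.
mi estimate, or: logit V1 V2 V3 V4 V5 V6 V7 V8 V9, vce(cluster provider)
```

If provider is stored as a string, it would need converting to a numeric variable (for example with encode) before it can be used with the i. prefix.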
