Is there a way in Stata to use multiply-imputed variables to subset data?
I am analyzing a clinical dataset with a high degree of missingness in the two variables, P and W, that are required to subset the data.
I would like to impute P and W, and then filter out observations for which P > 200 and W > 20.
However, -mi estimate- does not permit an "if" qualifier that depends on imputed values, understandably: "estimation sample varies between m=1 and m=2 ... subsample [] changes from one imputation to another."
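For concreteness, here is a minimal sketch of what I am attempting. It assumes the data are already in memory; y, x1, and x2 are hypothetical placeholder names for my outcome and covariates, and the number of imputations and the seed are arbitrary. I have written the condition as literally excluding observations with both P > 200 and W > 20.

```stata
* Hypothetical sketch: y, x1, x2 are placeholder names, not my real variables
mi set wide
mi register imputed P W

* Impute P and W jointly by chained equations (20 imputations, arbitrary seed)
mi impute chained (regress) P W = y x1 x2, add(20) rseed(12345)

* This is the step that stops with the error quoted above, because the
* observations satisfying the condition differ from one imputation to the next
mi estimate: regress y x1 x2 if !(P > 200 & W > 20)
```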
Collapsing the imputed dataset to one row per observation, using the mean of the imputed values, seems statistically problematic to me: it appears to undermine the entire premise of multiple imputation. That is only my intuition, though, as I am not a card-carrying statistician.
Imputation using standard regression methods does not perform very well (e.g., adjusted R^2 < 0.10 in the best-fitting models), which may indicate that this exercise is a lost cause, but I was hoping to get others' input before giving up.
Thank you.