Hello everyone.
I have a clustered data set in long format with anthropometric data (height and weight) measured at different time points. Height and/or weight can be missing when a person did not attend a follow-up visit, and I need to impute these values (50% missing values in each of the two variables).
I have never done imputation before and I tried doing the following:
Code:
*STEP 1: Setting the dataset
mi set mlong

*STEP 2: Registering variables (where znr, pnr = identification variables; description2 = age; gcal_new ze_new zf_new zk_new = nutrient intake)
mi register imputed height weight
mi register regular znr pnr description2 gcal_new ze_new zf_new zk_new gender

*STEP 3: Checking the imputation model, i.variable=indicator variable
regress height i.description2 i.gender
rvfplot, yline(0)
regress weight i.description2 i.gender
rvfplot, yline(0)

*STEP 4: Imputation
mi impute regress weight description2 gender, add(5) rseed(1234)
mi impute regress height description2 gender, add(5) rseed(1234)
My questions are:
1. The main outcome of the analysis is binary (yes/no for being a false reporter). This outcome variable depends on a person's height and weight. I am not sure whether I should calculate the outcome after imputation, or calculate it before imputation and replace it afterwards. I am also not sure how to do the replacement if I have several imputations.
2. I used linear regression for continuous variables (regress) as the imputation method. Will it be a problem if I then do multilevel modelling (using MLwiN) on the complete dataset?
3. Is it possible to check whether the imputation was done correctly (for a continuous variable)?
4. All the imputed values for both height and weight are generated under the original file. Is it possible to match the imputed values?
5. Lastly, I came across REALCOM impute. Correct me if I am wrong: it is a package that can be installed in Stata so that other software can be used to impute the data. As I understand it, imputation can also be done successfully without REALCOM.
Thank you in advance.