  • Variables chosen for multiple imputation

    Dear Statalist,

    I have a question about choosing variables for multiple imputation. For example, I have missing values for smoking, and I'd like to investigate the relationship between smoking and cancer, controlling for age and sex in the regression. There are also some variables that I'd like to adjust for, e.g., education and occupation. If I want to obtain a crude OR adjusted only for age and sex, and an adjusted OR that also adjusts for education and occupation, should I include all of these variables when imputing smoking, even for the logistic regression that gives the crude OR? Or should I include only age and sex, because those are the only variables I will adjust for in the crude OR? Thank you!

    Yue

  • #2
    You should use all variables that are relevant to the prediction of the missing values when you impute, regardless of whether those same variables all appear in the subsequent analyses. You can create a single multiply imputed data set using all relevant variables. Then you can use it for whatever analyses you like afterwards.
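
    The workflow described above could look roughly like the following. This is a minimal sketch, not a definitive implementation. It assumes hypothetical variable names (cancer, smoking, age, sex, education, occupation), that smoking is coded 0/1, and that every variable other than smoking is fully observed. Including the outcome (cancer) in the imputation model is a common recommendation rather than something stated in this thread.

    ```stata
    * Sketch only: all variable names below are assumptions, not taken from the original post
    mi set mlong
    mi register imputed smoking
    mi register regular cancer age sex education occupation

    * Impute smoking once, using every variable relevant to predicting it
    * (assumes smoking is binary; the seed and number of imputations are arbitrary)
    mi impute logit smoking cancer age sex i.education i.occupation, add(20) rseed(1234)

    * "Crude" OR: adjusted for age and sex only
    mi estimate, or: logistic cancer i.smoking age sex

    * Adjusted OR: additionally adjusted for education and occupation
    mi estimate, or: logistic cancer i.smoking age sex i.education i.occupation
    ```

    Both models are fitted to the same multiply imputed data set, so the imputation step does not need to be repeated for each analysis.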

    • #3
      Originally posted by Clyde Schechter
      You should use all variables that are relevant to the prediction of the missing values when you impute, regardless of whether those same variables all appear in the subsequent analyses. You can create a single multiply imputed data set using all relevant variables. Then you can use it for whatever analyses you like afterwards.
      OK! Thank you very much, Clyde!

      • #4
        Dear all,

        I have a problem when running multiple imputation with chained equations.
        After running

        Code:
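        * Declare the mi storage style and inspect the missing-data patterns
        mi set mlong
        mi misstable summarize
        mi misstable patterns
        mi misstable nested

        * Register the variables to be imputed and the complete predictors
        mi register imputed hta_0 smoking education hdl_0 estimated_ldl_0
        mi register regular sex age diabetes PA energy alcoholg_0

        * Chained equations: logit, multinomial logit (augmented), and linear regression
        mi impute chained (logit) hta_0 (mlogit, augment) smoking education (regress) hdl_0 estimated_ldl_0 = age sex diabetes PA energy alcoholg_0, add(20) rseed(1234)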
        I obtain the following warning: the sets of predictors of the imputation model vary across imputations or iterations.

        Can I ignore this warning? Do I have to change the variables used for the imputation model?

        Smoking and education are categorical variables with three categories each. I use the "augment" option because of the perfect prediction issue.


        Thank you.


        Nerea

