  • Using a variable for multiple imputation without imputing it

    Hi, I have perhaps a simple question: is it possible to include in an mi model, say, var1 (which has missing values) to help impute var2, but without actually imputing var1 itself? I.e., I'd want to use var1 to help fill in the missing values for var2 without filling in any missing values for var1.

    As an example:


    Code:
    webuse mheart5, clear
    mi set mlong
    mi register imputed age bmi attack smokes hsgrad female
    mi impute chained (regress) age = attack smokes hsgrad female, add(5) rseed(100)
    The variable bmi is also in the dataset and has missing values. Is there a way to use it when imputing age without imputing bmi itself?
    Last edited by Anne Todd; 05 Oct 2022, 14:06. Reason: Fixed weird formatting in code
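
    A minimal sketch of one way to try this, assuming bmi is simply placed among the predictors of a univariate imputation model for age. This is an illustration rather than a recommendation: per the mi impute documentation, imputation stops with an error when it would produce missing imputed values (which can happen when a predictor has missing values) unless force is specified, and with force, age is left missing wherever it cannot be predicted because bmi is missing.

    ```stata
    webuse mheart5, clear
    mi set mlong
    * only age is registered as imputed; bmi is treated as a regular predictor
    mi register imputed age
    mi register regular bmi attack smokes hsgrad female
    * force lets mi impute proceed even if some imputed values remain missing
    mi impute regress age bmi attack smokes hsgrad female, add(5) rseed(100) force
    * count the age values that are still missing in each dataset (m = 0, 1, ..., 5)
    mi xeq: count if missing(age)
    ```

    Any observations that remain incomplete would need to be dealt with before analysis, which is part of why this setup is rarely attractive.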

  • #2
    Why would you want to do this?



    • #3
      At risk of sounding very basic, I'm working on a project where someone is doing some modeling and basically asked for a few variables to be imputed, but to leave another variable not-imputed (race, for various reasons). So I could just leave race out of the model altogether, of course, but I figure if it has information to be useful in imputing the other variables, then perhaps it would be good to include?

      Editing to add an additional related question to make sure I understand what's going on with mi impute in the first place: it will use non-missing values from all variables in the model to impute missing observations, correct? So in this example...

      Code:
      mi impute chained (regress) age bmi = attack smokes hsgrad female, add(5) rseed(100)
      ...non-missing values of age are being used to help impute missing values of bmi, and non-missing values of bmi are being used to help impute missing values of age (in addition to all the vars on the right side of the = which have no missings, of course)?
      Last edited by Anne Todd; 05 Oct 2022, 15:44.



      • #4
        You do not go into the various reasons you -- or someone -- have for not wanting to impute missing values in one variable but still using that variable as a predictor to impute missing values in another. There might well be some misunderstanding regarding the substantive, statistical, and technical concepts underlying multiple imputations. Without more information, we cannot really tell.

        If you -- or someone -- decide not to use the imputed values of a variable in the analyses, that might be reasonable. For example, not using the imputed values of the outcome variable might be reasonable if the imputation model did not include auxiliary variables. You would still impute all missing values to preserve the correlations among the variables. If one variable helps predict missing values in another, the reverse holds as well; that is how correlation works.

        An alternative approach might be to exclude observations with missing values on race altogether. Whether that is reasonable depends, among other aspects, on the assumptions about the unobserved values of race.
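
        The "impute everything, then decide which imputed values to use" idea could be sketched roughly as follows, using the mheart5 example from #1 with bmi standing in for race. The indicator bmi_observed and the logit analysis model are assumptions made for illustration only; whether restricting the analysis sample this way is appropriate depends on the missing-data assumptions discussed above.

        ```stata
        webuse mheart5, clear
        * flag, before imputing, the observations where bmi was actually observed
        generate byte bmi_observed = !missing(bmi)
        mi set mlong
        mi register imputed age bmi
        mi register regular attack smokes hsgrad female bmi_observed
        * impute age and bmi together so that each informs the other
        mi impute chained (regress) age bmi = attack smokes hsgrad female, add(5) rseed(100)
        * analysis that leaves out observations whose bmi value was imputed
        mi estimate: logit attack smokes age bmi female hsgrad if bmi_observed
        ```

        Because bmi_observed is registered as regular, it is identical in every imputed dataset, so the estimation sample does not change across imputations.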



        • #5
          Thanks daniel klein, I'd thought about excluding observations with missing values on race, but as you mention, we suspect that non-response to race is likely not random.




          • #6
            Originally posted by Anne Todd
            [mi impute chained] will use non-missing values from all variables in the model to impute missing observations, correct?
            Yes. And, it will use "imputed" values of variables to impute missing values in other variables; it's an iterative process. I put imputed into quotation marks because, strictly speaking, only the predicted values (those drawn from the posterior) at the last iteration are stored as (final) imputed values.
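
            One way to check which variables enter each conditional model is the dryrun option of mi impute chained, which reports the conditional specifications without imputing any data. A minimal sketch, assuming the mheart5 setup from #1 with only age and bmi registered as imputed:

            ```stata
            webuse mheart5, clear
            mi set mlong
            mi register imputed age bmi
            mi register regular attack smokes hsgrad female
            * dryrun lists the conditional models; nothing is imputed
            mi impute chained (regress) age bmi = attack smokes hsgrad female, add(5) rseed(100) dryrun
            ```

            The listing should show bmi among the predictors of age and age among the predictors of bmi, alongside the complete variables.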


            Originally posted by Anne Todd
            So in this example...

            Code:
            mi impute chained (regress) age bmi = attack smokes hsgrad female, add(5) rseed(100)
            ...non-missing values of age are being used to help impute missing values of bmi, and non-missing values of bmi are being used to help impute missing values of age (in addition to all the vars on the right side of the = which have no missings, of course)?
            Again, yes, but not only the non-missing values. After the first iteration/initialization, there are no longer any missing values in age or bmi. In the following iterations (10, by default), all values, including the previously "imputed" ones, are used in the conditional models.
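
            The iterations can be made visible with the burnin() and savetrace() options of mi impute chained. In the sketch below, burnin(10) only writes out the default, and the file name impstats is an arbitrary choice for illustration:

            ```stata
            webuse mheart5, clear
            mi set mlong
            mi register imputed age bmi
            mi register regular attack smokes hsgrad female
            * burnin(10) is the default number of iterations, stated explicitly here;
            * savetrace() saves summaries of the imputed values at each iteration
            mi impute chained (regress) age bmi = attack smokes hsgrad female, ///
                add(5) rseed(100) burnin(10) savetrace(impstats, replace)
            * note: this replaces the mi data in memory with the trace file
            use impstats, clear
            describe
            ```

            Plotting the saved summaries against the iteration number, separately by imputation, is a common way to judge whether the chains have settled.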


            I recommend reading the manual entry "Multivariate imputation when a missing-data pattern is monotone" in mi impute monotone first, then moving on to the relevant sections in mi impute chained.


            Originally posted by Anne Todd
            we suspect that non-response to race is likely not random.
            Not even conditional on other covariates? If not, this is a serious problem and the best solution is not clear.
            Last edited by daniel klein; 06 Oct 2022, 01:32. Reason: tried to link to the section in the manual; could not do it



            • #7
              Thanks daniel klein, very helpful. I will read the manual entry you provided.
              Also, for anyone who may stumble across this in the future, I found this short video very useful for building intuition about MICE: https://www.youtube.com/watch?v=zX-pacwVyvU.
