  • Imputation conditional on an imputed variable

    I have a data set that I am trying to perform multiple imputation with chained equations on, using -mi impute chained-. The data come from a questionnaire that includes something called the "CAGE." It's a four-item yes-no scale about drinking (alcohol). Before those four items, there is a "screener" that asks whether you drink alcohol at all. If you say no, you are supposed to skip the four items. If you say yes, you are supposed to answer them. Finally, the four items get scored by summing them. If a person answers no to the "screener," we are supposed to set all four items (and the score) to 0. (A score >= 2 is considered indicative of problem drinking.)
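
    For concreteness, the scoring rule could be sketched roughly as below. This assumes all five variables are coded 0/1 with 1 = yes; cage_score and cage_problem are hypothetical names used only for illustration.

    ```stata
    * Assumption: cage_screen and cage1-cage4 are coded 0/1 (1 = yes)
    * Skip pattern: a "no" on the screener sets all four items to 0
    foreach v in cage1 cage2 cage3 cage4 {
        replace `v' = 0 if cage_screen == 0
    }

    * Score = sum of the four items (missing if any item is missing)
    generate cage_score = cage1 + cage2 + cage3 + cage4

    * Flag scores of 2 or more; the if qualifier keeps a missing score
    * from being counted as >= 2
    generate byte cage_problem = cage_score >= 2 if !missing(cage_score)
    ```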

    There are several dozen observations in the data set (overall N about 4,000) in which the "screener" was not answered and neither were any of the four items. While I personally think that the notion that these data can be considered missing at random is pretty ludicrous, considering the stigmatized content, my collaborators are intent on forging ahead with MI.

    The problem I have is that when I try:

    Code:
    mi register imputed cage_screen cage1 cage2 cage3 cage4
    
    mi impute chained (logit, augment) cage_screen ///
        (logit, augment conditional(if cage_screen == 1)) cage1 cage2 cage3 cage4 ///
        other imputations... = regular variables, force
    Stata tells me that I can't use an imputed variable in specifying the conditional. And the manual does indeed say that's the case--you can only condition on a non-imputed variable.

    But I need to impute the missing values of cage_screen and cage1-cage4, and it does not make sense to impute cage1-cage4 unless the imputed value of cage_screen == 1. (It's a bit like imputing answers to questions about pregnancy in a male. What do you do if gender is itself an imputed variable?)

    How do I accomplish this? I appreciate any suggestions.

  • #2
    No solution, just some thoughts.

    Have you ever considered conditioning the imputation of the screening item on the four imputed cage values (or the score)? What makes you trust the imputed value for the screening item more than those of the four cage variables?

    To elaborate a little, consider the case where we impute conditional on the values of an observed variable. We do so because we can be certain that imputed values would not make sense for some observations. In your situation, the variable we want to condition on is itself imputed, meaning that we cannot be certain about its "true" value. But if we do not know the true value, how can we know whether imputing the other variables makes sense or not?

    If you really want to do this, impute all values and replace the cage variables with 0 after the imputation for cases in which the imputed screening value is 0.
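
    A minimal sketch of that last suggestion, assuming the data are already -mi set- and the five variables are registered as imputed. The predictors x1 and x2, the number of imputations, the seed, and the name cage_score are placeholders, not taken from the original post.

    ```stata
    * Impute the screener and all four items unconditionally
    * (x1 x2, add(20), and rseed(12345) are hypothetical choices)
    mi impute chained (logit, augment) cage_screen cage1 cage2 cage3 cage4 ///
        = x1 x2, add(20) rseed(12345)

    * Enforce the skip pattern in m = 0 and in every imputed data set
    foreach v in cage1 cage2 cage3 cage4 {
        mi xeq: replace `v' = 0 if cage_screen == 0
    }

    * Build the score as a passive variable so it is computed
    * separately within each imputation
    mi passive: generate cage_score = cage1 + cage2 + cage3 + cage4
    ```

    Note that the imputation model itself still ignores the skip pattern here; only the completed data are made consistent with it afterwards.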

    Best
    Daniel


    • #3
      I believe that the "conditional" option to "ice" (Royston et al., several articles in the Stata Journal) will handle this. The help file actually gives a gender/pregnancy example like the one in your parenthetical. Use "search ice" to find it (I know you know that, but others might not).


      • #4
        Has a solution been found since this message was posted? I have a somewhat similar problem with imputing smoking status together with the number of cigarettes smoked: I end up with an average number of cigarettes that is not quite zero among non-smokers. I am therefore trying to impute the number of cigarettes only among smokers (observed or imputed).
