
  • MI: question about mi estimate output

    Hi,

    I created an MI dataset using mi impute chained. I then recreated multiple scales within the imputed dataset, from individual variables that had been imputed, using mi passive.

    Now, I would like to analyze these data. The first time I run a model in this dataset (see example below), I get some notes at the top, about values in the imputations being updated to match values in m=0. I have never come across this note before. What does this mean/why would this happen? Is this something to be concerned about?

    The 5 variables in the note were all recreated within the MI dataset from other variables, but they also were not the only variables that I recreated within the MI dataset.

    Thanks,
    Robin

    Code:
    mi estimate: regress ae_sex_mi b2.sexbase
    (3800 values of passive variable ae_sum_mi_z in m>0 updated to match values in m=0)
    (3800 values of passive variable ae_sex_mi_z in m>0 updated to match values in m=0)
    (3800 values of passive variable ae_inhibit_mi_z in m>0 updated to match values in m=0)
    (3800 values of passive variable cse_sum_mi_z in m>0 updated to match values in m=0)
    (3800 values of passive variable sri_sum_mi_z in m>0 updated to match values in m=0)

    Multiple-imputation estimates                   Imputations       =         25
    Linear regression                               Number of obs     =        205
                                                    Average RVI       =     0.0003
                                                    Largest FMI       =     0.0006
                                                    Complete DF       =        203
    DF adjustment:   Small sample                   DF:     min       =     200.92
                                                            avg       =     200.95
                                                            max       =     200.98
    Model F test:       Equal FMI                   F(   1,  201.0)   =       5.68
    Within VCE type:          OLS                   Prob > F          =     0.0181

    ------------------------------------------------------------------------------
       ae_sex_mi |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             sex |
         Female  |  -3.106659    1.30376    -2.38   0.018    -5.677463    -.535855
           _cons |   19.16413   .8494671    22.56   0.000     17.48912    20.83914
    ------------------------------------------------------------------------------

  • #2
    When you change the values of imputed variables (e.g., by mean-centering variables in each dataset, or by creating indices in each dataset), Stata will not allow the transformed values in each dataset to differ from their observed counterparts in the non-imputed dataset. Stata calls variables whose values do differ in this way super-varying (see [MI], p. 385).

    In your situation, this is definitely something to be concerned about, because it probably means that your scales no longer contain the values that you want. One solution to the problem is to unregister the variables after imputation. That way, Stata's mi suite has no clue what the values in these variables are supposed to be. Be sure to store the imputations in flong or flongsep style.
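
    A minimal sketch of that suggestion, assuming the mi data are already in memory and using the five variable names from the note above. It has not been run against the poster's data, and any values that were already overwritten by the update would have to be recreated afterwards.

    ```stata
    * Sketch only: assumes an mi dataset is already in memory
    mi convert flong, clear      // full long style can hold super-varying variables
    * Unregister the z-score variables so mi stops forcing them to match m=0
    mi unregister ae_sum_mi_z ae_sex_mi_z ae_inhibit_mi_z cse_sum_mi_z sri_sum_mi_z
    mi describe                  // check which variables are still registered
    mi varying                   // reports any super-varying variables
    ```

    If mi varying lists the z-scores as super-varying and unregistered, that is the intended state here, not an error.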

    Best
    Daniel



    • #3
      Hi Daniel,

      Thanks - this makes sense. These were variables (z-scores) that I had created within the imputed dataset. So, the values would be expected to vary across imputations even for the non-missing observations.

      I converted my dataset to flong, and created the z-scores using mi xeq: egen instead this time. It seems to have worked, as I no longer get the note.
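
      For anyone wanting to reproduce this, a rough sketch of the steps described. It assumes ae_sex_mi is the raw scale score; the new variable name ae_sex_z2 is hypothetical, chosen so that it does not clash with the existing passive variable.

      ```stata
      * Sketch only: ae_sex_z2 is a made-up name for illustration
      mi convert flong, clear                   // switch to full long style
      mi xeq: egen ae_sex_z2 = std(ae_sex_mi)   // egen runs separately in m=0, 1, ..., M
      mi estimate: regress ae_sex_z2 b2.sexbase // z-score as outcome, for illustration
      ```

      Because mi xeq runs the command once per dataset, each imputation is standardized with its own mean and standard deviation, and the new variable stays unregistered, so mi does not reset it to the m=0 values.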

      Thanks again,
      Robin



      • #4
        So, to follow on from Robin's question: which dataset does "m=0" refer to here, the unimputed data or the first imputed dataset? Do we also need to include the unimputed data in order to run -mi estimate-, or do we just use the imputed datasets with however many imputations were made?

        Thank you very much,
        Sifan



        • #5
          In general, m=0 is the original, unimputed dataset. You should be using -mi estimate- to get your estimates, and it will take care of which "datasets" to use. This sounds as though you have not actually asked the question you are really interested in; you are more likely to get helpful answers if you ask the question of actual interest.
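
          To see the distinction in practice, a small sketch assuming an mi dataset with at least one imputation is in memory; the variable names are borrowed from the first post purely for illustration.

          ```stata
          * Sketch only: variable names are taken from the first post
          mi query                           // reports the mi style and the number of imputations
          mi xeq 0: summarize ae_sex_mi      // m=0: original data, missing values still missing
          mi xeq 1: summarize ae_sex_mi      // m=1: first imputed dataset
          mi estimate: regress ae_sex_mi b2.sexbase   // fits the model in m=1,...,M and pools
          ```

          The m=0 data stay in the mi dataset alongside the imputations, but the pooled estimates come from the imputed datasets only.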
