  • Multiple imputation with auxiliary variables

    Greetings,

    I'm running Stata 15.1 on OSX. I have three ordinal variables relating to public spending preferences that are all strongly correlated. One of them, spending, is missing for around 103 observations, though. I'd like to impute these missing values, but I'm not quite sure what I'm doing. Here's what I've done thus far:
    Code:
    mi set mlong
    mi register imputed spending
    mi impute ologit spending revchildcare revhealthspend, add(10) rseed(1234)
    I should note that I wasn't sure how many imputations to run, so I simply selected 10. I'm also unsure about the rseed #.
    Assuming I did everything correctly up to this point, what are the next steps? How do I generate the final imputed variable (which should have 1200 observations)? Thanks in advance for your help!

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double spending byte(revchildcare revhealthspend)
    7 7 7
    6 6 6
    2 2 2
    6 7 7
    1 1 1
    1 2 1
    7 7 7
    3 3 4
    1 1 1
    6 6 6
    2 4 1
    3 2 3
    5 4 6
    2 4 4
    3 5 4
    1 2 3
    6 7 7
    4 5 4
    1 1 1
    6 6 6
    5 5 6
    2 6 5
    7 5 6
    7 6 7
    5 5 5
    5 6 5
    2 4 4
    6 6 7
    7 7 7
    5 6 6
    . 7 7
    7 6 7
    1 4 4
    2 4 4
    5 5 5
    6 7 6
    7 5 5
    4 5 5
    3 4 4
    2 4 4
    6 4 6
    4 5 6
    3 3 5
    7 7 7
    6 6 6
    1 4 4
    5 4 4
    5 4 4
    7 6 7
    3 2 4
    5 4 4
    6 7 7
    3 4 4
    4 1 5
    3 4 4
    2 5 4
    2 4 4
    7 7 7
    3 6 3
    4 1 7
    1 1 2
    5 4 5
    7 6 7
    3 5 4
    7 7 7
    5 7 7
    6 6 6
    7 7 7
    1 3 3
    1 1 2
    3 6 6
    7 7 7
    6 7 7
    3 6 2
    4 6 5
    2 4 4
    4 5 6
    1 4 1
    1 4 4
    . 4 1
    3 4 4
    2 3 4
    6 6 6
    2 6 4
    4 1 1
    2 3 1
    7 7 7
    . 3 3
    3 1 1
    5 4 6
    7 7 7
    7 4 7
    1 1 1
    5 6 7
    7 6 6
    4 6 7
    1 2 2
    2 3 3
    4 5 6
    3 1 2
    end
    label values spending srv_spend
    label values revchildcare revchildcare
    label def revchildcare 1 "Decrease a great deal", modify
    label def revchildcare 2 "Decrease moderately", modify
    label def revchildcare 3 "Decrease a little", modify
    label def revchildcare 4 "No change", modify
    label def revchildcare 5 "Increase a little", modify
    label def revchildcare 6 "Increase moderately", modify
    label def revchildcare 7 "Increase a great deal", modify
    label values revhealthspend revhealthspend
    label def revhealthspend 1 "Decrease a great deal", modify
    label def revhealthspend 2 "Decrease moderately", modify
    label def revhealthspend 3 "Decrease a little", modify
    label def revhealthspend 4 "No change", modify
    label def revhealthspend 5 "Increase a little", modify
    label def revhealthspend 6 "Increase moderately", modify
    label def revhealthspend 7 "Increase a great deal", modify

  • #2
    You've already computed the imputed variable. In the -mlong- layout you have your original data, and then appended to that you will see that there are additional observations (which have _mi_m > 0) containing the imputed values of spending. You are ready to do whatever analysis it is you are planning. See -help mi estimate- for your next steps.
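
    To make the -mlong- layout concrete, here is a minimal sketch. It assumes a simulated stand-in for the poster's data (300 made-up observations, roughly 10% of spending set to missing), and the -ologit- model passed to -mi estimate- is purely illustrative, not the poster's planned analysis.

    ```stata
    * Simulated stand-in for the poster's data (all values are made up)
    clear
    set seed 1234
    set obs 300
    generate byte revchildcare   = ceil(7*runiform())
    generate byte revhealthspend = ceil(7*runiform())
    generate byte spending = min(7, max(1, round((revchildcare + revhealthspend)/2 + rnormal())))
    replace spending = . if runiform() < 0.10

    * Same three steps as in post #1
    mi set mlong
    mi register imputed spending
    mi impute ologit spending revchildcare revhealthspend, add(10) rseed(1234)

    * _mi_m == 0 is the original data; _mi_m == 1..10 are the appended rows,
    * one copy of each incomplete observation per imputation
    tabulate _mi_m
    mi describe

    * Pooled analysis across the imputations (illustrative model only)
    mi estimate: ologit revhealthspend spending revchildcare
    ```

    Note that _N is now larger than the original 300, because the imputed copies are stored as extra observations. There is no single "final" imputed variable to generate; -mi estimate- combines the results across the imputations.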

    As for whether 10 imputations will prove to be enough, it really depends on the fraction of missing information, a statistic that is related to, but not the same as, the fraction of observations with missing values on the variables being imputed. When you run whatever analysis you do run with -mi estimate-, part of the output will be a statistic labeled FMI. It will be a number between 0 and 1. The number of imputations you need should be approximately 100 times the FMI number. If you don't have enough, re-do the imputation to create more and then re-do the analysis.
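
    The rule of thumb above could be checked along these lines. This is only a sketch on simulated stand-in data (all values made up) with an illustrative analysis model. It relies on the -vartable- option of -mi estimate- to show the FMI per coefficient and on the stored scalar e(fmi_max_mi) for the largest FMI. It also tops up the existing imputations with add() rather than redoing them from scratch, which is my assumption about the simplest route, and it uses a different seed for the second batch so the new draws do not simply repeat the first ones.

    ```stata
    * Simulated stand-in for the poster's data (all values are made up)
    clear
    set seed 1234
    set obs 300
    generate byte revchildcare   = ceil(7*runiform())
    generate byte revhealthspend = ceil(7*runiform())
    generate byte spending = min(7, max(1, round((revchildcare + revhealthspend)/2 + rnormal())))
    replace spending = . if runiform() < 0.10

    mi set mlong
    mi register imputed spending
    mi impute ologit spending revchildcare revhealthspend, add(10) rseed(1234)

    * vartable adds a table of variance information, including FMI
    mi estimate, vartable: ologit revhealthspend spending revchildcare

    * Rule of thumb from this post: M should be roughly 100 * FMI
    local needed = ceil(100 * e(fmi_max_mi))
    display "Largest FMI = " e(fmi_max_mi)
    display "Suggested number of imputations: `needed'"

    * Add imputations only if the current number falls short
    mi query
    local have = r(M)
    if `needed' > `have' {
        local extra = `needed' - `have'
        mi impute ologit spending revchildcare revhealthspend, add(`extra') rseed(5678)
    }
    ```

    After adding imputations, the analysis has to be rerun with -mi estimate- so that the pooled results use all of them.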

    As for the seed, any number will do. The purpose is reproducibility more than anything else.


    • #3
      Thanks Clyde. Do I need to run the -mi estimate- command each time a model uses the imputed variable or is it merely for diagnostic checks?


      • #4
        When I remove the mi estimate command and run the -ologit- command my observation count is roughly 300 more than it should be. Can you explain?


        • #5
          It is simply wrong to run -ologit- without the -mi estimate- prefix using the data with the imputations. You just get nonsense results. Once you have multiply imputed data, everything from that point on requires -mi estimate-. If you want to revert to the original unimputed data, run -mi extract 0-, and then you are back to the original data, the imputations being discarded. If you need to run some specific command(s) on the original data without the imputations, and don't want to lose the imputed data, you can prefix the specific command(s) with -mi xeq 0:-.
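
          The three commands mentioned above might be used as follows. This is a sketch on simulated stand-in data (all values made up), and the -ologit- model is illustrative only. It also shows why a plain -ologit- reports too many observations: in -mlong- style the imputed copies are stored as extra rows, and a command without the -mi estimate- prefix treats them as if they were additional respondents.

          ```stata
          * Simulated stand-in for the poster's data (all values are made up)
          clear
          set seed 1234
          set obs 300
          generate byte revchildcare   = ceil(7*runiform())
          generate byte revhealthspend = ceil(7*runiform())
          generate byte spending = min(7, max(1, round((revchildcare + revhealthspend)/2 + rnormal())))
          replace spending = . if runiform() < 0.10

          mi set mlong
          mi register imputed spending
          mi impute ologit spending revchildcare revhealthspend, add(10) rseed(1234)

          * Total rows now exceed 300: original data plus imputed copies
          count

          * Correct: pooled estimates across the 10 imputations
          mi estimate: ologit revhealthspend spending revchildcare

          * Run one command on the original data only (m = 0);
          * the imputations stay in memory
          mi xeq 0: tabulate spending, missing

          * Revert to the original data, discarding the imputations
          * (save the mi dataset first if you want to keep it)
          mi extract 0, clear
          count
          ```

          After -mi extract 0, clear- the data are no longer -mi set-, so ordinary commands apply again, with the missing values of spending back in place.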

          Multiple imputation is complicated. It seems you are trying to use it without having laid the groundwork by reading up on it. You will get yourself in all kinds of difficulties that way. You must stop what you are doing and read the "Intro substantive" and "Intro" chapters of the MI volume of the PDF documentation that comes with your Stata. Then read the chapters on the specific commands you will be using. You will not be able to be productive with multiple imputation without learning that material.


          • #6
            I agree with Clyde, you really need to do some reading first. Some other sources besides the manual:

            https://www.ssc.wisc.edu/sscc/pubs/stata_mi_intro.htm

            https://stats.idre.ucla.edu/stata/se...stata_pt1_new/

            https://www3.nd.edu/~rwilliam/xsoc73994/MD02.pdf



            -------------------------------------------
            Richard Williams, Notre Dame Dept of Sociology
            StataNow Version: 19.5 MP (2 processor)

            EMAIL: [email protected]
            WWW: https://www3.nd.edu/~rwilliam
