
  • #16
    Russell, I apologize for misreading the statement on the Wisconsin web page and also for misunderstanding Marilena's original question.

    Steve
    Last edited by Steve Samuels; 31 Jul 2014, 11:16.
    Steve Samuels
    Statistical Consulting

    Stata 14.2



    • #17
      Dear all,

      Thank you for your replies!

      I would like to ask, however, why you think that adjusting for the survey design is not useful. In my case the survey oversampled ethnic minorities and disadvantaged groups, and the pweights cover only attrition, not item non-response, which is what is causing my missing data.

      As for your suggestion, do you mean separating the data by strata and PSU and then running mi impute chained?

      Thank you for your help.



      • #18
        Hi Marilena,

        This issue of weighting an imputation model is not one I've looked into in any depth, so I'll leave the question of whether or not to do it to others. But if you want to run separate imputation models for each combination of strata and PSU (which is one of several suggestions I've seen for how to incorporate them into an imputation model) all you need to do is add by(strata PSU) to the end of your mi impute chained command.
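
        A minimal sketch of what that could look like. The variable names (bmi, smoker, age, female) and the settings (20 imputations, the seed) are hypothetical placeholders, not taken from Marilena's data; strata and PSU stand for whatever the design variables are called:

        ```stata
        * Hypothetical names: bmi and smoker have missing values; age and female are complete.
        mi set wide
        mi register imputed bmi smoker
        mi register regular age female strata PSU

        * by() fits a separate chained-equations model within each strata-by-PSU cell
        mi impute chained (regress) bmi (logit) smoker = age i.female, ///
            add(20) rseed(12345) by(strata PSU)
        ```

        Cells with few observations may not support the imputation model, in which case coarser grouping variables may be needed.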

        Russell Dimond
        Statistical Computing Specialist
        Social Science Computing Cooperative
        University of Wisconsin-Madison



        • #19
          Thank you, Russell.



          • #20

            I had a theoretical reason for thinking that sample weighting of the imputation process was wrong: that prediction was intended for the particular sample, not for the population. But upon looking at the literature, I find I was wrong.


            MI variance estimators can be biased if survey weights are not used in the imputation model and sampling is "informative". The situation is worst for domain analyses if the domain definition is not also a predictor in the model. (A sampling domain is a non-stratum subgroup for which separate analyses are required; the Stata term is subpopulation.) This problem was first demonstrated by Kott (1995) for the case of estimating a domain mean. See the Introduction to Reist and Larsen (2012) for a brief summary.

            So, weights should be incorporated into the imputation model. However, weighting the model (e.g. weight option in mi impute), the solution for Kott's simplified problem, does not appear to be the best approach. Rather, the recommendation of Carpenter (2011) and others is to use the weights, first grouped, as main effect predictors and as components of interaction terms. A preferable alternative, if available, is to incorporate into the model other variables that determine the weights. For example, in the Georgia Reproductive Health Survey (Serbanescu, 2011), selection probabilities differed by geographical stratum and by number of females in the household eligible for the survey. These factors could enter directly into an imputation model.
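
            As a rough sketch of the grouped-weights approach, assuming a sampling weight named pw and hypothetical analysis variables (bmi, smoker, age, female); the choice of five groups and of age as the interacting variable is arbitrary and only for illustration:

            ```stata
            * Hypothetical names: pw is the sampling weight; bmi and smoker have missing values.
            xtile wgrp = pw, nquantiles(5)      // group the weights into quintiles

            mi set wide
            mi register imputed bmi smoker
            mi register regular age female wgrp

            * Weight group enters as a main effect and in an interaction with age
            mi impute chained (regress) bmi (logit) smoker = i.wgrp##c.age i.female, ///
                add(20) rseed(12345)
            ```

            If the variables that determine the weights are available, they could replace i.wgrp in the predictor list.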

            One approach to implementing Russell's suggestion is based on Reiter et al. (2006), who state:

            "In some surveys the design may be so complicated that it is impractical to include dummy variables for every cluster. In these cases, imputers can simplify the model for the design variables, for example collapsing cluster categories or including proxy variables (e.g., cluster size) that are related to the outcome of interest."

            Thus, you could separately impute in subgroups formed by these variables.
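
            One way this might be set up, as a sketch only. The names psu, region (standing for some coarse, collapsed grouping of the design cells), bmi, smoker, age, and female are all hypothetical, and the sampled cluster size is used here as a stand-in for a cluster-size proxy:

            ```stata
            * Hypothetical names throughout; region is an assumed collapsed grouping variable.
            bysort psu: generate psu_size = _N   // number of sampled observations in each PSU

            mi set wide
            mi register imputed bmi smoker
            mi register regular age female psu_size region

            * Cluster size enters as a predictor; imputation runs separately within each region
            mi impute chained (regress) bmi (logit) smoker = age i.female c.psu_size, ///
                add(20) rseed(12345) by(region)
            ```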

            Steve


            References:

            Carpenter, James R. 2011. Multiple imputation with survey weights: a bad idea?
            http://www.ccsr.ac.uk/qmss/seminars/...Carpenter.pdf.


            Kim, Jae Kwang, J Michael Brick, Wayne A Fuller, and Graham Kalton. 2006. On the bias of the multiple-imputation variance estimator in survey sampling. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68, no. 3: 509-521.
            http://jkim.public.iastate.edu/2006_JRSSB.pdf


            Kott, PS. 1995. A paradox of multiple imputation. Proceedings of the Section on Survey Research Methods 384-389.
            http://www.amstat.org/sections/srms/...s/1995_064.pdf


            Reist, BM, and Larsen, MD. 2012. Post-Imputation Calibration Under Rubin’s Multiple Imputation Variance Estimator. Section on Survey Research Methods, Joint Statistical Meeting 3924-3934.
            https://www.amstat.org/sections/srms/proceedings/y2012/files/304603_73257.pdf

            Reiter, Jerome P, Trivellore E Raghunathan, and Satkartar K Kinney. 2006. The importance of modeling the sampling design in multiple imputation for missing data. Survey Methodology 32, no. 2: 143.
            http://publications.gc.ca/collection...02.pdf#page=29

            Serbanescu, F, V Egnatashvili, A Ruiz, D Suchdev, and M Goodwin. 2011. Reproductive Health Survey Georgia, 2010 Summary Report. Division of Reproductive Health, Centers for Disease Control and Prevention (DRH/CDC), Atlanta, Georgia, USA.
            Last edited by Steve Samuels; 06 Aug 2014, 08:00.
            Steve Samuels
            Statistical Consulting

            Stata 14.2



            • #21
              I need help getting a repeated-imputation inference (RII) program to work with the Survey of Consumer Finances. The code I have is:
              rii , imp(Y): regress X1 X2 X3 X4 X101, robust

