  • How to use imputed analysis weight to svyset survey data

    I am analyzing survey data using Stata 12. In the multivariable analysis, the rate of item-missing data across the variables exceeds 10%, so I did multiple imputation using chained equations. Among the imputed variables was the 'analysis weight'. However, when I then try to declare the survey design on the multiply imputed data, Stata gives me this error message:
    variable WEIGHT registered as passive. Registered and passive variables may not be used as the basis for mi svyset.
    When I 'mi unregister' the weight variable, all imputed values of the weight variable are lost. How may I use the imputed weight variable in declaring the survey design of my dataset?
    Last edited by Ayalew Astatkie; 13 Sep 2014, 03:04.

  • #2

    It's unusual to have a missing sampling weight. Also, computation of sampling weights should not ordinarily require any information from the respondent. I can think of three immediate examples, aside from destruction of study forms, where weights might be missing.

    • A single household member is to be selected by random sampling from among eligible members of a household. If information on the number of eligible residents is missing, it's impossible to compute the sampling weight. Imputation of household size might be possible.

    • In one study, dwellings at the final sampling stage were selected systematically with random starts, but the number of starts and the sampling intervals were not recorded. The probability of selection in each village had to be estimated from external estimates of the number of households in the village, and these estimates did not always agree.

    • In the US National Health and Nutrition Examination Survey (NHANES), some lab tests are done on a subsample. Analyses of this subsample require a special weight, which is missing for people who did not get those tests.

    To fully address your problem, I'd need to know how the analysis (= probability?) weights came to be missing in your study. What was the sampling design? How were weights to be calculated? Were sampling weights adjusted for non-response? Were they revised so that sample estimates matched known population totals for some characteristics? (Possible methods: post-stratification, "raking", "calibration".)


    As Maarten Buis recently wrote in another thread: "The use of full names (first and last) has a long tradition on this list. We believe that this has helped maintain a friendly and professional atmosphere on this list. This is the reason that the FAQ asks everybody to sign on using their full name. You can ask to change your login name using the Contact Us button at the bottom right."

    I ask that you make this change.

    Steve


    Steven J Samuels
    Consultant in Statistics
    18 Cantine's Island
    Saugerties NY 12477 USA
    Voice: 1- 845-246-0774
    Fax : 1- 206-202-478
    [email protected]



    Steve Samuels
    Statistical Consulting
    [email protected]

    Stata 14.2



    • #3
      I used a two-stage stratified cluster sampling technique. In calculating the analysis weight, I considered the selection probabilities in the first and second stages of sampling. I also applied a post-stratification weight based on the sex of the survey respondents. The missing weight values arose for two reasons: 1) missing information on the 'stratum' and 'cluster' of some respondents in the first stage of sampling; 2) missing information on the post-stratification variable (i.e., SEX) for some respondents.



      • #4
        Thank you for re-registering with your full name, Ayalew. Welcome to Statalist!


        The only easy solution to your problem that I can think of: generate a new variable equal to the imputed one and use that.

        I have doubts about the validity of imputing weights, but I'll let others chime in on this question.

        You have a more serious concern: the missing PSU and stratum information, which are needed for mi svyset. What you might do depends on why PSU and stratum are missing. How did this happen?

        The conservative approach, which accepts the largest standard errors: create a new PSU consisting of all those who are missing PSU/stratum, and a new stratum to contain that PSU. Then, in your mi svyset command, include the option singleunit(centered).
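
        A minimal sketch of that conservative approach, assuming the design variables are numeric with integer codes and are named psu and stratum, that the weight variable passed to mi svyset is called wt2, and that 9999 is a code not already used by either design variable (all of these names and codes are hypothetical):

        ```stata
        * Hypothetical names: psu, stratum, wt2; 9999 is assumed to be an unused code.
        * Flag respondents whose design information is missing.
        generate byte design_miss = missing(psu) | missing(stratum)

        * Put all flagged respondents in one new PSU inside one new stratum.
        generate double psu2     = cond(design_miss, 9999, psu)
        generate double stratum2 = cond(design_miss, 9999, stratum)

        * Let mi check that the data are still consistent after adding variables.
        mi update

        * The new stratum holds a single PSU, so tell Stata how to handle it.
        mi svyset psu2 [pweight = wt2], strata(stratum2) singleunit(centered)
        ```

        New variables are created here, rather than overwriting psu and stratum, so the original design codes stay available for checking.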


        A couple of thoughts.

        1. If you impute the final analysis weight, then the post-stratification weight totals will no longer match the population totals. The distortion will be minor if the percentage with unknown sex is small.

        2. A better approach might be to impute sex; then post-stratify sampling weights separately for each imputation replicate. See also: http://www.stata.com/statalist/archi.../msg00850.html.

        Steve


        Last edited by Steve Samuels; 16 Sep 2014, 21:23.


        • #5
          I am not sure whether this is right, but this is how I got Stata to accept my imputed analysis weight in mi svyset. First, I generated a new weight variable equal to the imputed analysis weight using mi passive: generate. Then I used mi unregister to 'unregister' the new weight variable, declared the survey design using mi svyset, and re-registered the new variable as passive. After that, I was able to run mi estimate with the survey design taken into account.
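
          The steps described above might look like the following. This is only a sketch: WEIGHT is the imputed weight named in the error message, while wt2, psu, stratum, outcome, x1 and x2 are hypothetical names, and the logistic model is just a placeholder for whatever analysis is being run.

          ```stata
          * Hypothetical names: wt2, psu, stratum, outcome, x1, x2.
          * 1. Copy the imputed weight into a new passive variable.
          mi passive: generate double wt2 = WEIGHT

          * 2. Unregister the copy so that mi svyset will accept it.
          mi unregister wt2

          * 3. Declare the survey design using the copy as the weight.
          mi svyset psu [pweight = wt2], strata(stratum)

          * 4. Register the copy as passive again.
          mi register passive wt2

          * 5. Fit the model with the survey design taken into account.
          mi estimate: svy: logistic outcome x1 x2
          ```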

          With regard to the missing PSU and stratum information, only three (about 0.24%) of the survey respondents have missing values. I felt that doesn't affect my results much. I tried to impute them but couldn't succeed because the imputation models couldn't converge.

          On the post-stratification variable SEX, 29 (about 2.2%) of the survey respondents have missing values. After imputation, the proportional composition of the sample by the post-stratification variable did not change: 74% males and 26% females both before and after imputation.
          Last edited by Ayalew Astatkie; 17 Sep 2014, 09:30.



          • #6
            I think you've come up with an admirable solution. And I agree that with such a small percentage of affected observations, you need not be concerned about the missing design variables.
            Steve Samuels