How to use imputed analysis weight to svyset survey data

Ayalew Astatkie

Join Date: Sep 2014

Posts: 9
#1

13 Sep 2014, 02:57

I am analyzing survey data using Stata 12. In the multivariable analysis, the combined item-missing data rate across the different variables exceeds 10%. Hence, I did multiple imputation using chained equations. Among the imputed variables was the 'analysis weight'. However, when I then try to declare the survey design on the multiply imputed data, Stata gives me this error message:

variable WEIGHT registered as passive. Registered and passive variables may not be used as the basis for mi svyset.

When I 'mi unregister' the weight variable, all imputed values of the weight variable are lost. How may I use the imputed weight variable in declaring the survey design of my dataset?

Last edited by Ayalew Astatkie; 13 Sep 2014, 03:04.
Tags: analysis weight, multiple imputation, survey data
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#2

13 Sep 2014, 20:18

It's unusual to have a missing sampling weight. Also, computation of sampling weights should not ordinarily require any information from the respondent. I can think of three immediate examples, aside from destruction of study forms, where weights might be missing.

• A single household member is to be selected by random sampling from among eligible members of a household. If information on the number of eligible residents is missing, it's impossible to compute the sampling weight. Imputation of household size might be possible.

• In one study, dwellings at the final sampling stage were selected systematically with random starts, but the number of starts and the sampling intervals were not recorded. The probability of selection in each village had to be estimated from external estimates of the number of households in the village, and these estimates did not always agree.

• In the US National Health and Nutrition Examination Survey (NHANES), some lab tests are done on a subsample. Analyses of this subsample require a special weight, which is missing for people who did not get those tests.

To fully address your problem, I'd need to know how the analysis (= probability?) weights came to be missing in your study. What was the sampling design? How were the weights to be calculated? Were sampling weights adjusted for non-response? Were they revised so that sample estimates matched known population totals for some characteristics? (Possible methods: post-stratification, "raking", "calibration".)

As Maarten Buis recently wrote in another thread: "The use of full names (first and last) has a long tradition on this list. We believe that this has helped maintain a friendly and professional atmosphere on this list. This is the reason that the FAQ asks everybody to sign on using their full name. You can ask to change your login name using the Contact Us button at the bottom right."

I ask that you make this change.

Steve

Steven J Samuels
Consultant in Statistics
18 Cantine's Island
Saugerties NY 12477 USA
Voice: 1- 845-246-0774
Fax : 1- 206-202-478
[email protected]

Steve Samuels
Statistical Consulting
[email protected]

Stata 14.2
Ayalew Astatkie

Join Date: Sep 2014

Posts: 9
#3

14 Sep 2014, 03:35

I used a two-stage stratified cluster sampling technique. In calculating the analysis weight, I considered the selection probabilities in the first and second stages of sampling. I also applied a post-stratification weight based on the sex of the survey respondents. The missing weight values arose for two reasons: 1) missing information on the 'stratum' and 'cluster' of some respondents in the first stage of sampling; 2) missing information on the post-stratification variable (i.e., SEX) for some respondents.
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#4

16 Sep 2014, 21:21

Thank you for re-registering with your full name, Ayalew. Welcome to Statalist!

The only easy solution to your problem that I can think of: generate a new variable equal to the imputed one and use that.

I have doubts about the validity of imputing weights, but I'll let others chime in on this question.

You have a more serious concern: the missing PSU and stratum information, which are needed for mi svyset. What you might do depends on why PSU and stratum are missing. How did this happen?

A conservative approach, which accepts the maximum standard errors: create a new PSU consisting of all those who are missing PSU/stratum, and a new stratum to contain that PSU. Then, in your mi svyset command, include the option singleunit(centered).
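A minimal sketch of that recoding, for illustration only. It assumes numeric design variables named psu and stratum and a weight named wt_svy (all hypothetical names), and that the recoding is done on the original data before imputation:

```stata
* Sketch only: psu, stratum and wt_svy are hypothetical variable names.
* Flag respondents whose design information is missing.
generate byte nodesign = missing(psu) | missing(stratum)

* Put all flagged respondents into one new stratum ...
summarize stratum, meanonly
replace stratum = r(max) + 1 if nodesign

* ... containing one new PSU.
summarize psu, meanonly
replace psu = r(max) + 1 if nodesign

* The new stratum holds a single PSU, so ask for the centered
* single-unit variance treatment when declaring the design.
mi svyset psu [pweight = wt_svy], strata(stratum) singleunit(centered)
```

With singleunit(centered), the lone PSU's contribution to the variance is centered at the grand mean rather than at its own stratum mean.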

A couple of thoughts.

1. If you impute the final analysis weight, then the post-stratification weight totals will no longer match the population totals. The distortion will be minor if the percentage with unknown sex is small.

2. A better approach might be to impute sex; then post-stratify sampling weights separately for each imputation replicate. See also: http://www.stata.com/statalist/archi.../msg00850.html.
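One way the second suggestion might be sketched, with heavy caveats. It assumes the data are in flong style (so every imputation is a full copy of the data indexed by _mi_m), that sex has been imputed and is coded 0/1, that basewt is a complete design weight before post-stratification, and that the population totals 7400 and 2600 are placeholders. All of these names and numbers are hypothetical:

```stata
* Sketch only: basewt, sex, popN, sumw, pswt and the totals are hypothetical.
mi convert flong, clear

* Known population total for each post-stratum (placeholder values).
generate double popN = cond(sex == 1, 7400, 2600) if !missing(sex)

* Sum of design weights within each post-stratum, separately per imputation.
bysort _mi_m sex: egen double sumw = total(basewt)

* Post-stratified weight: scale so weighted counts match popN in every imputation.
generate double pswt = basewt * popN / sumw
```

In the original data (_mi_m == 0) respondents with missing sex get a missing pswt, which is expected. How the resulting weight should be registered and passed to mi svyset is not covered here.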

Steve

Last edited by Steve Samuels; 16 Sep 2014, 21:23.

Ayalew Astatkie

Join Date: Sep 2014

Posts: 9
#5

17 Sep 2014, 09:20

I am not sure if this is right, but the following steps made Stata accept my imputed analysis weight in mi svyset. First, I generated a new weight variable equal to the imputed analysis weight using mi passive: generate. Then I used mi unregister to unregister the new weight variable, declared the survey design using mi svyset, and re-registered the new variable as passive. After that, I was able to run mi estimate with the survey design taken into account.
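The sequence described above might look roughly like the following. This is a sketch, not the exact commands used: WEIGHT is the imputed weight named in the error message, while wt_svy, psu, stratum, outcome, x1 and x2 are hypothetical names, and the logistic model is only a placeholder.

```stata
* Sketch only: every name except WEIGHT is hypothetical.
* Copy the imputed weight into a new passive variable in every imputation.
mi passive: generate double wt_svy = WEIGHT

* mi svyset refuses variables registered as imputed or passive, so unregister the copy.
mi unregister wt_svy

* Declare the design using the copied weight.
mi svyset psu [pweight = wt_svy], strata(stratum)

* Restore the passive registration, as described in the post.
mi register passive wt_svy

* Design-based estimation combined across imputations.
mi estimate: svy: logistic outcome x1 x2
```

It is worth checking after the mi unregister step (for example with mi describe and a summary of wt_svy by imputation) that the copied values still differ across imputations.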

With regard to the missing PSU and stratum information, only three (about 0.24%) of the survey respondents have missing values. I felt that this would not affect my results much. I tried to impute them but did not succeed because the imputation models failed to converge.

On the post-stratification variable SEX, 29 (about 2.2%) of the survey respondents have missing values. After imputation, the composition of the sample by the post-stratification variable did not change: 74% males and 26% females both before and after imputation.

Last edited by Ayalew Astatkie; 17 Sep 2014, 09:30.
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#6

17 Sep 2014, 16:19

I think you've come up with an admirable solution. And I agree that with such a small percentage of affected observations, you need not be concerned about the missing design variables.

