Hi everyone,
I am using a public use file for some analyses of survey data, and I am trying to replicate the SUDAAN instructions for sample weights as part of my work using Stata.
Would anyone familiar with both packages be able to comment on whether the commands below are equivalent?
Many thanks,
Emma
From Stata manual (SVY page 9):
svyset su1 [pweight=pw], strata(strata) fpc(fpc1)
Tentative Stata syntax:
use [file name]
svyset PUFPOPFAC [pweight=FACFNWT], strata(PUFSTRATA)
SUDAAN instructions:
Nest and Weight Variables
Derived using estimation specifications | PUFSTRATA | Sample design variable (SUDAAN NEST variable) | 1-12 | 2302 | 100.0 | All
Derived using estimation specifications | PUFPOPFAC | Sample design variable (SUDAAN TOTCNT variable) | 611-9825 | 2302 | 100.0 | All
Derived using estimation specifications | FACFNWT | Weight for facility estimates (SAMPLE WEIGHT) | 1-35 | 2302 | 100.0 | All
Technical Notes on Nesting and Weight Variables
The data collected are obtained through a complex, multistage sample design that involves stratification, clustering and oversampling of specific subgroups. The final weights provided for analytic purposes have been adjusted in several ways to yield valid national estimates for residential care facilities in the U.S.

Researchers are reminded that the use of standard statistical procedures that are based on the assumption that data are generated via simple random sampling (SRS) generally will produce incorrect estimates of variances and standard errors when used to analyze data from the NSRCF. The clustering protocols that are used in the multistage selection of the NSRCF sample require other analytic procedures, as described below. Researchers who apply SRS techniques to NSRCF data generally will produce standard error estimates that are, on average, too small, and are likely to produce results that are subject to excessive Type I error.

The nesting variable or sampling stage in the Facility PUF is PUFSTRATA. The sampling weight which represents each observation's contribution in the estimation of the current resident population is FACFNWT. In the statistical software, SUDAAN, the analyst can use the design option WOR for sampling without replacement. When using SUDAAN for WOR, the variable for the TOTCNT statement in SUDAAN is PUFPOPFAC.

Below are the SUDAAN statements for the NEST and TOTCNT statements for the Facility PUF with WOR. The example uses the crosstab procedure to illustrate.

PROC CROSSTAB data= [file name] DESIGN=WOR;
NEST PUFSTRATUM / MISSUNIT;
TOTCNT PUFPOPFAC;
WEIGHT FACFNWT;
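
For comparison, one possible Stata translation of the SUDAAN statements quoted above is sketched here. It rests on three assumptions that are not confirmed by the documentation: that each row of the Facility PUF is itself a sampled unit (hence `_n` in the PSU position), that SUDAAN's TOTCNT variable plays the role of Stata's fpc() option, and that the stratum variable is PUFSTRATA as listed in the variable table (the quoted NEST statement spells it PUFSTRATUM). The file name is left as a placeholder, as in the original.

```stata
* Sketch only: assumes one row per sampled facility and that
* PUFPOPFAC holds the stratum population count (SUDAAN TOTCNT).
use "[file name]", clear

* _n declares each observation as its own sampling unit;
* fpc() supplies the finite population correction implied by DESIGN=WOR.
svyset _n [pweight=FACFNWT], strata(PUFSTRATA) fpc(PUFPOPFAC)

* Inspect the declared design (strata and units per stratum)
svydescribe
```

Under this reading, the tentative syntax differs in one respect: it places PUFPOPFAC in the PSU position rather than in fpc(), so Stata would treat all facilities sharing a population count as a single cluster and would apply no finite population correction.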