Thanks to Kit Baum, a new command, wgtdistrim, by Sebastian Lang and myself, is now available from the SSC.
wgtdistrim implements Potter's (1990) weight distribution approach to trim extreme sampling weights. The basic idea is that the sampling weights are assumed to follow a beta distribution. The parameters of the distribution are estimated from the moments of the observed sampling weights and the resulting quantiles are used as cut-off points for extreme sampling weights. The process is repeated a specified number of times (10 by default) or until no sampling weights are more extreme than the specified quantiles.
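As a rough illustration of the moment-matching step, here are the standard method-of-moments estimators for a beta distribution. This is only a sketch: it assumes the weights have already been rescaled to values x_i in (0,1), and it does not reproduce the actual transformation in Potter (1990) or the exact computations inside wgtdistrim. The symbols x_i, p, and c_U are introduced here for illustration.

```latex
% Assumption: x_i in (0,1) are the sampling weights after some rescaling;
% p is the upper tail probability, e.g. p = 0.01 for upper(.01).
\begin{align*}
\bar{x} &= \frac{1}{n}\sum_{i=1}^{n} x_i, &
s^2 &= \frac{1}{n-1}\sum_{i=1}^{n} (x_i-\bar{x})^2, \\
\hat{\alpha} &= \bar{x}\left(\frac{\bar{x}(1-\bar{x})}{s^2}-1\right), &
\hat{\beta} &= (1-\bar{x})\left(\frac{\bar{x}(1-\bar{x})}{s^2}-1\right), \\
c_U &= F^{-1}_{\mathrm{Beta}(\hat{\alpha},\hat{\beta})}(1-p).
\end{align*}
```

The estimators are only defined when s^2 < \bar{x}(1-\bar{x}). Under this sketch, weights above c_U would be treated as extreme and trimmed, after which the moments are recomputed and the step is repeated.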
Here is an example that trims the top 1 percent of the sampling weights in the nhanes2f dataset:
Code:
. webuse nhanes2f

. wgtdistrim finalwgt , generate(double pw_t) upper(.01)
Iteration 0:  min = 2000      max = 79634     rel. diff = .
Iteration 1:  min = 2011.591  max = 38739.22  rel. diff = .5161157
Iteration 2:  min = 2012.945  max = 37414.94  rel. diff = .0341834
Iteration 3:  min = 2013.067  max = 37316.73  rel. diff = .0026248
Iteration 4:  min = 2013.078  max = 37308.04  rel. diff = .0002329
Iteration 5:  min = 2013.079  max = 37307.28  rel. diff = .0000206
Iteration 6:  min = 2013.079  max = 37307.21  rel. diff = 1.82e-06
Iteration 7:  min = 2013.079  max = 37307.2   rel. diff = 1.60e-07
Iteration 8:  min = 2013.079  max = 37307.2   rel. diff = 1.41e-08
Iteration 9:  min = 2013.079  max = 37307.2   rel. diff = 1.25e-09
Iteration 10: min = 2013.079  max = 37307.2   rel. diff = 1.10e-10

. summarize finalwgt pw_t

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
    finalwgt |     10,337    11320.85    7304.457       2000      79634
        pw_t |     10,337    11320.85    6999.602   2013.079    37307.2
Stata 16.1 or newer is required.
References:
Potter, F. J. 1990. A study of procedures to identify and trim extreme sampling weights. Proceedings of the Survey Research Methods Section of the American Statistical Association, 225--230.
http://www.asasrms.org/Proceedings/papers/1990_034.pdf