sum, d and histogram with svy:

Karl Yesler

Join Date: Jul 2014

Posts: 103
#1


24 Sep 2019, 22:25

Hello Statalist,

I would like to run descriptive statistics on some model variables, ahead of modelling, in a dataset that uses a complex survey design with probability weights. In particular, I would like to see the skewness, kurtosis, and quantiles of a variable, along with a histogram. Normally, I would use

Code:

sum y, d

Code:

histogram y, normal

What are the equivalent commands after svyset?

BTW (I hope it's OK to ask a related question): when I plot the histogram using fweight (because pweight is not available), I find that the total area under the normal density curve can be much greater or less than that of the actual distribution. This does not make sense to me.

I am using Stata SE x64 ver 13.1 on Win 7 x64 with 8 GB of RAM.
David Radwin

Join Date: Mar 2014

Posts: 368
#2

25 Sep 2019, 18:16

That's because pweights and fweights are very different. From Stata's help:

fweights, or frequency weights, are weights that indicate the number of duplicated observations. pweights, or sampling weights, are weights that denote the inverse of the probability that the observation is included because of the sampling design.

Code:

help weights

For summarize, you can just use aweights in place of pweights. In this application, they yield equivalent results (but beware that this is not true for all applications).
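A minimal sketch of the aweight approach, assuming Stata's nhanes2 example dataset stands in for your data (it is fetched with webuse, so it needs an internet connection). The variables weight and finalwgt belong to that dataset; substitute your own analysis variable and sampling weight:

```stata
* Assumption: nhanes2 is only an illustrative stand-in for the survey data
webuse nhanes2, clear

* Weighted skewness, kurtosis, and percentiles, using the sampling
* weight as an aweight (summarize does not accept pweights)
summarize weight [aweight = finalwgt], detail
```

This gives weighted descriptive statistics only; it does not produce design-based standard errors.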

histogram doesn't accept aweights or pweights, but you might try kdensity y, normal instead.
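As a rough sketch of the kdensity suggestion, again assuming the nhanes2 example dataset and its variables weight and finalwgt as placeholders for your own:

```stata
* Assumption: nhanes2 is only an illustrative stand-in for the survey data
webuse nhanes2, clear

* Weighted kernel density estimate with a normal density overlaid;
* kdensity accepts aweights, unlike histogram
kdensity weight [aweight = finalwgt], normal
```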

David Radwin
Senior Researcher, California Competes
californiacompetes.org
Pronouns: He/Him
NJ JAIN

Join Date: Mar 2019

Posts: 6
#3

04 Nov 2019, 01:05

Hi, I am analyzing IHDS survey data from https://www.icpsr.umich.edu/icpsrweb/DSDR/studies/36151 and https://www.icpsr.umich.edu/icpsrweb...de.html#table3. The definition for WT variables is:

WT: Sample weight for the household; most useful and usually used in almost all analyses
FWT: Integer weight (truncated from WT) for routines that require integer weight

Stata's help gives these definitions of weights:

pweights, or sampling weights, are weights that denote the inverse of the probability that the observation is included because of the sampling design.
fweights, or frequency weights, are weights that indicate the number of duplicated observations.

Should I use FWT for stset? If not, how do I convert FWT to pweights?

Would it be correct to specify "svyset IDHH [pweight = FWT], strata(DISTRICT)"?

Any guidance is very much appreciated. Thanks.