
  • sum, d and histogram with svy:

    Hello Statalist,

    I would like to run some descriptives of some model variables for subsequent modelling in a dataset that uses complex survey design with probability weights. In particular, I would like to see the skewness, kurtosis, and quantiles of a variable along with a histogram of the results. Normally, I would use

    Code:
    sum y, d

    Code:
    histogram y, normal
    What are the equivalent commands after svyset?

    BTW (I hope it's OK to ask a related question): when I plot the histogram using fweight (because pweight is not available), I find that the total area under the normal density curve can be much greater or less than that of the actual distribution. This does not make sense to me.


    [Attached image: hist_113_n_w_1.png]

    I am using Stata/SE 13.1 (64-bit) on Windows 7 x64 with 8 GB of RAM.

  • #2
    That's because pweights and fweights are very different. From Stata's help:

    fweights, or frequency weights, are weights that indicate the number of duplicated observations. pweights, or sampling weights, are weights that denote the inverse of the probability that the observation is included because of the sampling design.
    Code:
    help weights
    For summarize, you can just use aweights instead of pweights. In this application, they yield equivalent results (but beware that this is not true for all applications).

    histogram doesn't accept aweights or pweights, but you might try kdensity y, normal instead.
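
    As a rough sketch of the suggestion above, the example below uses Stata's shipped auto dataset so that it runs as-is. The weight w is made up purely for illustration (the auto data have no survey weights), so substitute your own variable and sampling weight. Keep in mind that this gives weighted point estimates only, not design-based standard errors.

    Code:
    * illustration only: invent a hypothetical weight w
    sysuse auto, clear
    set seed 12345
    generate double w = 1 + runiform()

    * weighted skewness, kurtosis, and percentiles
    summarize price [aweight = w], detail

    * weighted quantiles; _pctile accepts pweights directly
    _pctile price [pweight = w], percentiles(25 50 75)
    display r(r1) " " r(r2) " " r(r3)

    * weighted kernel density estimate with a normal overlay
    kdensity price [aweight = w], normal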
    David Radwin
    Senior Researcher, California Competes
    californiacompetes.org
    Pronouns: He/Him



    • #3
      Hi, I am analyzing IHDS survey data from https://www.icpsr.umich.edu/icpsrweb/DSDR/studies/36151 and https://www.icpsr.umich.edu/icpsrweb...de.html#table3. The definitions of the weight variables are:

      WT: Sample weight for the household; most useful and usually used in almost all analyses
      FWT: Integer weight (truncated from WT) for routines that require integer weight

      Stata's help gives these definitions of weights:

      pweights, or sampling weights, are weights that denote the inverse of the probability that the observation is included because of the sampling design.
      fweights, or frequency weights, are weights that indicate the number of duplicated observations.

      Should I use FWT for stset? If not, how do I convert FWT to pweights?

      Would it be correct to use "svyset IDHH [pweight = FWT], strata(DISTRICT)"?

      Any guidance is very much appreciated. Thanks.
