  • Non integer weights + summary statistics for population survey data

    Hi all,

    I am new to the world of Stata (and fairly new to epidemiology/biostats). I have been tasked with analysing 10 years of data from a population survey looking at nutrition. Specifically, I am comparing trends in consumption (volume) of sugar-sweetened beverages over 10 years across different subgroups of our survey population. I was progressing okay until I discovered that our weighting system involves non-integer weights (raked weighting). I am unable to use fweight or pweight with non-integer weights. aweight does not appear to be the correct weight to use for survey data, and gives incorrect output (I think). This leaves me with iweight, which appears to generate correct weighted results for frequency tables (when compared to SPSS output). Reading around the issue, though, iweight does not seem designed for this process, and is not going to be suitable for use with summary statistics.

    How should I approach survey data with non-integer weights? I can't find a clear answer on this from searching the forums.

    (I will need to calculate mean consumption for each year for each subgroup, assess trends over time, and use a test of trend to check statistical significance. I have around 150 subgroups.)

    Thanks in advance

  • #2
    I'm not a frequent user of weights, but I'd note that the -regress- command permits all manner of weights as well as the -svy- prefix. The regression-predicted value for each combination of year and subpopulation would give its mean. Rather than a "test for trend," I'd create confidence intervals for each combination of year and subpopulation. The -margins- command would be useful for generating the means and CIs. So, I'd suggest something like this, using a crude example based on the auto data:

    Code:
    sysuse auto, clear
    // Fake data
    set seed 44677
    gen sweet = price                        // stand-in for consumption volume
    egen subpop = cut(mpg), group(3)         // three fake subpopulations
    gen year = 2000 + ceil(runiform() * 3)   // fake survey years 2001-2003
    tab2 year subpop
    gen YourWeight = runiform()              // fake non-integer weights
    //
    // Two lines would do it.
    regress sweet i.subpop##i.year [iweight = YourWeight]
    margins i.subpop##i.year
    This may not be exactly what you want, but should offer a start. The subpop#year listing would give you means and CIs that I think are relevant. Note that the business of i. and ## involves use of Stata's factor variable notation, about which see -help fvvarlist-. The -means- command is an alternative here, but I think -regress- and -margins- will require less messing around with code.
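
    A possible alternative, offered as a sketch rather than as part of the suggestion above: if the raked weights can reasonably be treated as sampling (probability) weights, the -svy- prefix mentioned earlier can be combined with the -mean- command to get weighted means and CIs for each subpop-by-year cell. pweights do not have to be integers. The fake data and the name YourWeight are the same illustrative assumptions as in the example above. A real survey would normally also declare its PSUs and strata in -svyset-, which this sketch leaves out.

    ```stata
    sysuse auto, clear
    // Fake data (all variable names are illustrative)
    set seed 44677
    gen sweet = price
    egen subpop = cut(mpg), group(3)
    gen year = 2000 + ceil(runiform() * 3)
    gen YourWeight = runiform()      // non-integer stand-in for raked weights
    // Declare the weights once; no PSU or strata are declared in this sketch
    svyset [pweight = YourWeight]
    // Weighted mean and CI of sweet for each subpop-by-year cell
    svy: mean sweet, over(subpop year)
    ```

    Whether pweight is the right weight type depends on how the raked weights were constructed, so it is worth checking the survey documentation before relying on the standard errors.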
    Last edited by Mike Lacy; 17 Jun 2019, 09:16. Reason: Put in "sweet" variable.
