  • Adjusting survey data for age and sex

    I adjusted my survey data by age group using the 2010 US Census, but now I want to adjust by sex at the same time. How would I go about creating the new standard weights adjusted for both age and sex?

    For an example of how I calculated the age-adjusted weight variable, I will include my code below. I used an age variable with 7 age groups. I also included the survey command I used to calculate prevalence estimates of my outcome variable applying the standardized weights.

    gen std_wgt10 = 0.220724 if age7cat==0
    replace std_wgt10 = 0.171133 if age7cat==1
    replace std_wgt10 = 0.185875 if age7cat==2
    replace std_wgt10 = 0.178898 if age7cat==3
    replace std_wgt10 = 0.124713 if age7cat==4
    replace std_wgt10 = 0.070752 if age7cat==5
    replace std_wgt10 = 0.047905 if age7cat==6

    svy: mean cms, stdize(age7cat) stdweight(std_wgt10)


  • #2
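    The -stdize()- option accepts only a single variable, so you have to create a combined age-sex variable and corresponding weights for it. This requires that you already have a weight variable for sex adjustment; I'll call it std_wgt_sex. Note that multiplying the two marginal weights, as in the code, treats the age and sex distributions of the standard population as independent.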

    Code:
    egen age_sex = group(age7cat sex)
    gen combined_weight = std_wgt10*std_wgt_sex
    svy: mean cms, stdize(age_sex) stdweight(combined_weight)
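
    A minimal sketch of how the combined weights could be built and checked on toy data. It assumes sex is coded 0 = male and 1 = female (hypothetical coding) and uses approximate 2010 Census sex shares of 0.492 and 0.508, which should be verified against the census tables. If the census gives the joint age-by-sex distribution, those 14 proportions can be entered directly instead of the product.

    ```stata
    * Toy data: one row per age-by-sex cell (7 age groups x 2 sexes)
    clear
    set obs 14
    gen byte age7cat = mod(_n - 1, 7)
    gen byte sex     = _n > 7            // assumed coding: 0 = male, 1 = female

    * Age shares taken from the original post
    local shares 0.220724 0.171133 0.185875 0.178898 0.124713 0.070752 0.047905
    gen double std_wgt10 = .
    forvalues k = 0/6 {
        local s : word `=`k'+1' of `shares'
        replace std_wgt10 = `s' if age7cat == `k'
    }

    * Sex shares: approximate values, to be checked against the census
    gen double std_wgt_sex = cond(sex == 0, 0.492, 0.508)

    * One stratum identifier and one weight per age-by-sex cell
    egen age_sex = group(age7cat sex)
    gen double combined_weight = std_wgt10 * std_wgt_sex

    * The cell weights should add up to 1 across the 14 cells
    quietly summarize combined_weight
    display "sum of cell weights = " r(sum)

    * On the real, svyset data one would then run:
    * svy: mean cms, stdize(age_sex) stdweight(combined_weight)
    ```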


    • #3
      Since Laura has census data, she can create the weights directly:

      Code:
      //===================== "create" example data
      // you obviously don't have to do this step, as you already have the data
      // "create" a (silly) census
      clear all
      set seed 583275935
      set obs 1000
      gen female = runiform() < .5
      gen agecat = ceil(10*runiform())*10
      tempfile census sample
      save `census'
      
      // draw a sample
      bsample 100
      save `sample'
      
      //========================== create weights
      use `census', clear
      
      // number of nonmissing observations in census
      count if !missing(female, agecat)
      tempname ncensus nsample
      scalar `ncensus' = r(N)
      
      // create a table of the number of observations per age and gender
      contract female agecat, freq(census_freq)
      tempfile table
      save `table'
      
      use `sample'
      // count the number of observations per age and gender in sample
      bys female agecat : gen obs_freq = _N if !missing(female,agecat)
      
      // count number of nonmissing observations in sample
      count if !missing(female, agecat)
      scalar `nsample' = r(N)
      
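      // merge in the counts from the census; drop census cells that
      // do not occur in the sample, as they would only add empty rows
      merge m:1 female agecat using `table', keep(master match)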
      
      // create weights
      gen w = census_freq/obs_freq * `nsample'/`ncensus'
      ---------------------------------
      Maarten L. Buis
      University of Konstanz
      Department of history and sociology
      box 40
      78457 Konstanz
      Germany
      http://www.maartenbuis.nl
      ---------------------------------
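
      One way to check weights of this kind is to confirm that the weighted cell shares in the sample reproduce the census cell shares. The following is a self-contained sketch on simulated data, not the code above: the seed, the three coarse age groups, and the sizes 1000 and 200 are all made up for illustration.

      ```stata
      * Simulated "census" with 2 x 3 cells (illustrative only)
      clear all
      set seed 12345
      set obs 1000
      gen byte female = runiform() < .5
      gen byte agecat = ceil(3*runiform())
      tempfile census cells
      save `census'

      * Census cell counts
      contract female agecat, freq(census_freq)
      save `cells'

      * Draw a sample of 200 and attach the census counts
      use `census', clear
      bsample 200
      bysort female agecat: gen obs_freq = _N
      merge m:1 female agecat using `cells', keep(match) nogenerate

      * Same weight formula as in the post: census count over sample count,
      * rescaled by sample size over census size
      gen double w = (census_freq/obs_freq) * (200/1000)

      * Weighted share of each cell in the sample versus its census share
      bysort female agecat: egen double wsum = total(w)
      gen double share_w      = wsum/200
      gen double share_census = census_freq/1000
      assert abs(share_w - share_census) < 1e-8
      ```

      If the assert passes silently, the weights bring every observed age-by-sex cell back to its census share.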


      • #4
        Thank you Clyde and Maarten! I wasn't sure how to correctly combine the age and sex variables. Now I am able to complete the analysis.


        • #5
          Hi everyone,

          The command -svy: mean cms, stdize(age7cat) stdweight(std_wgt10)- reports the age-adjusted mean, but how can I obtain the percentiles and the SD in the same example?

          Thanks, best, John
          Last edited by John Lure; 13 Aug 2018, 16:15.


          • #6
            Welcome to Statalist, John. To improve your chances of a good answer, read FAQ 12 and do everything it says. To make your questions readable, paste code and results from your log or screen into the forum editor between the code delimiters [CODE] and [/CODE]. See the answers above for examples.

            Here you refer to an unidentified example and ask for percentiles and SDs, but of what? Use dataex, described in the FAQ, to provide an extract of data with just enough variables to illustrate your problem, or use one of the datasets available from within Stata.
            Code:
             help datasets
            I would also suggest that you not attach your questions to old threads unless they are follow-ups on that thread's topic. Here you are apparently asking about a completely different topic.
            Last edited by Steve Samuels; 13 Aug 2018, 23:11.
            Steve Samuels
            Statistical Consulting

            Stata 14.2


            • #7
              Ah, in the light of morning, I see that the statistics you want are for the variable "cms". To get the estimated population standard deviation and percentiles, you'll need to create a post-stratification weight to feed into svyset. The code below uses Nick Winter's survwgt package (SSC) and its poststratify subcommand to do this. After svysetting the data with the new weight, svy: mean is run, followed by estat sd. Selected percentiles are found with the _pctile command (documented with pctile) and with summarize, detail using aweights. I have Stata 14; some of this may be unnecessary in Stata 15.

              Code:
              .  set seed 400912
              
              .  use http://www.stata-press.com/data/r14/hbp, clear
              
              .  egen group = group(age race sex) if inlist(year, 1990, 1992)
              (675 missing values generated)
              
              .  gen selwt = 100+ runiform()*1000
              
              .  gen bp = 100+rnormal(0,5)
              
              .  by group, sort: gen stdw = _N
              
              .  svyset _n [pw = selwt]
              
                    pweight: selwt
                        VCE: linearized
                Single unit: missing
                   Strata 1: <one>
                       SU 1: <observations>
                      FPC 1: <zero>
              
              . /* Standardized Mean */
              .  svy: mean bp,  stdize(group) stdweight(stdw)
              (running mean on estimation sample)
              
              Survey: Mean estimation
              
              Number of strata =       1        Number of obs   =        455
              Number of PSUs   =     455        Population size =  273,072.4
              N. of std strata =      24        Design df       =        454
              
              --------------------------------------------------------------
                           |             Linearized
                           |       Mean   Std. Err.     [95% Conf. Interval]
              -------------+------------------------------------------------
                        bp |   99.90081   .2600828       99.3897    100.4119
              --------------------------------------------------------------
              
              .
              . /* Use survwgt package to compute post-stratified weights */
              .
              . cap ssc install survwgt, replace
              
              . survwgt poststratify  selwt, by(group) totvar(stdw) gen(adjwt)
              
              .  svyset _n [pw = adjwt]
              
                    pweight: adjwt
                        VCE: linearized
                Single unit: missing
                   Strata 1: <one>
                       SU 1: <observations>
                      FPC 1: <zero>
              
              .  svy: mean bp
              (running mean on estimation sample)
              
              Survey: Mean estimation
              
              Number of strata =       1        Number of obs   =      1,130
              Number of PSUs   =   1,130        Population size =      1,130
                                                Design df       =      1,129
              
              --------------------------------------------------------------
                           |             Linearized
                           |       Mean   Std. Err.     [95% Conf. Interval]
              -------------+------------------------------------------------
                        bp |   99.79771   .1727626      99.45874    100.1367
              --------------------------------------------------------------
              
              .  estat sd
              
              -------------------------------------
                           |       Mean   Std. Dev.
              -------------+-----------------------
                        bp |   99.79771    5.191779
              -------------------------------------
              
              .  _pctile  bp [pw = adjwt], percentiles(5 10 25 50 75 90)
              
              .  return list
              
              scalars:
                               r(r1) =  91.30558013916016
                               r(r2) =  93.28947448730469
                               r(r3) =  96.23242950439453
                               r(r4) =  99.72535705566406
                               r(r5) =  103.0582885742188
                               r(r6) =  106.5712661743164
              
              .  summarize bp [aw =adjwt], det
              
                                           bp
              -------------------------------------------------------------
                    Percentiles      Smallest
               1%     87.58885       83.87115
               5%     91.30558         85.007
              10%     93.28947       85.21439       Obs               1,130
              25%     96.23243       85.26249       Sum of Wgt.       1,130
              
              50%     99.72536                      Mean           99.79771
                                      Largest       Std. Dev.      5.191779
              75%     103.0583       113.7785
              90%     106.5713       113.7878       Variance       26.95456
              95%     108.4101       114.5298       Skewness       .0106384
              99%      112.118       117.9779       Kurtosis       3.039936
              Last edited by Steve Samuels; 14 Aug 2018, 10:07.
              Steve Samuels
              Statistical Consulting

              Stata 14.2
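
              As I understand it, survwgt poststratify rescales the weights so that, within each by() group, they sum to the total given in totvar(). The sketch below does that ratio adjustment by hand, which avoids the SSC dependency, and then repeats the SD and percentile steps. The auto dataset, the random selection weights, and the population totals 7000 and 3000 are made up for illustration.

              ```stata
              sysuse auto, clear
              set seed 2018

              * Made-up selection weights and made-up population totals
              * for two post-strata (foreign = 0 or 1)
              gen double selwt    = 100 + runiform()*1000
              gen double poptotal = cond(foreign, 3000, 7000)

              * Ratio adjustment: scale the weights in each post-stratum
              * so that they sum to the population total of that stratum
              bysort foreign: egen double sumw = total(selwt)
              gen double adjwt = selwt * poptotal / sumw

              * Mean and estimated population SD with the adjusted weight
              svyset _n [pw = adjwt]
              svy: mean mpg
              estat sd

              * Weighted quartiles, returned in r(r1), r(r2), r(r3)
              _pctile mpg [pw = adjwt], percentiles(25 50 75)
              display "quartiles: " r(r1) "  " r(r2) "  " r(r3)
              ```

              The point estimates of the percentiles are weighted correctly this way, but _pctile does not report design-based standard errors for them.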


              • #8
                Hi Steve, thank you very much for your advice; I'll follow it when using this forum. And a big thank you for your answer, which totally solved my problem. Best regards, John.
