
  • How Should I Weight My Data

    Hello Everyone,

    I recently collected my own data, oversampling minorities so that I can do subgroup comparisons. What is the specific command for weighting the data so that analyses of the total sample reflect the national population? Specifically, I have 500 Whites, 500 Blacks, 500 Asians, and 500 Latinos. I would like to weight the sample to 68 percent White, 12 percent Black, 5 percent Asian, and 14 percent Latino. How should I do that?

    Thank you in advance.

    Best,
    Kevin

  • #2
    Can anybody help me with this? I have never dealt with this issue before, and the tutorials I have read are really confusing.



    • #3
      What you are asking for is poststratification weighting. You will need a variable that designates the post-strata (in your case, a race variable) and another variable that gives the total population for each race (not the counts in your sample, but the numbers in the target population). Those variables go in the -poststrata()- and -postweight()- options of the -svyset- command. Then you use the -svy:- prefix with your analysis commands. There is a simple worked example in the poststratification section of the online manual (it starts on page 54 of the [SVY] PDF manual if you are using the current version of Stata).

      By the way, something is wrong with your race percentages as they do not add up to 100%. In any case, you won't be using the percentages for this: you will have to scale them up to counts for the population you are trying to standardize to.
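
      A minimal sketch of that setup, not a definitive implementation. Everything in it is an assumption made for illustration: the data are simulated, the variable names group, popsize, and y are hypothetical, the target population is taken to be 1,000,000, and the Asian share is set to 6% only so that the shares sum to 100%.

      ```stata
      clear
      set obs 2000
      * Simulated sample: 500 per group (1 = White, 2 = Black, 3 = Asian, 4 = Latino)
      gen group = ceil(_n/500)
      * Hypothetical outcome variable, included only so the example runs
      set seed 12345
      gen y = rnormal()
      * Assumed population counts per post-stratum (counts, not percentages)
      gen popsize = .
      replace popsize = 680000 if group == 1
      replace popsize = 120000 if group == 2
      replace popsize = 60000  if group == 3
      replace popsize = 140000 if group == 4
      * Declare the poststratification, then estimate with the svy: prefix
      svyset _n, poststrata(group) postweight(popsize)
      svy: mean y
      ```

      With real data you would replace the simulated variables with your own race variable and outcome, and take the population counts from a source such as census figures.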



      • #4
        In the example below, I assume that Asians are 6% of the population so that the percentages add up to 100. I also assume that the total population of the country is 1 million.
        Code:
        * Start from an empty dataset with 2,000 observations, 500 per group
        clear
        set obs 2000
        gen group = 1
        replace group = 2 in 501/1000
        replace group = 3 in 1001/1500
        replace group = 4 in 1501/2000
        lab def group 1 "White" 2 "Black" 3 "Asian" 4 "Latino"
        lab val group group
        * weight = population share * total population / sample size of the group
        gen weight = .68*1000000/500 if group == 1
        replace weight = .12*1000000/500 if group == 2
        replace weight = .06*1000000/500 if group == 3
        replace weight = .14*1000000/500 if group == 4
        With these weights we get the frequencies below.
        Code:
        . tab group
        
              group |      Freq.     Percent        Cum.
        ------------+-----------------------------------
              White |        500       25.00       25.00
              Black |        500       25.00       50.00
              Asian |        500       25.00       75.00
             Latino |        500       25.00      100.00
        ------------+-----------------------------------
              Total |      2,000      100.00
        
        . tab group [fw=weight]
        
              group |      Freq.     Percent        Cum.
        ------------+-----------------------------------
              White |    680,000       68.00       68.00
              Black |    120,000       12.00       80.00
              Asian |     60,000        6.00       86.00
             Latino |    140,000       14.00      100.00
        ------------+-----------------------------------
              Total |  1,000,000      100.00
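
        The same arithmetic can be written so that the group sample sizes are read from the data instead of being typed in as 500. The sketch below rests on the same assumptions as the example (population of 1 million, Asians at 6%) and adds a hypothetical outcome variable y plus the helper variables share and n_sample, none of which appear in the original question. Frequency weights work for the -tabulate- check because the weights happen to be integers; for estimation, sampling weights are normally declared as pweights, which is what the last two lines do.

        ```stata
        clear
        set obs 2000
        * Simulated sample: 500 per group (1 = White, 2 = Black, 3 = Asian, 4 = Latino)
        gen group = ceil(_n/500)
        * Hypothetical outcome variable, included only so the example runs
        set seed 12345
        gen y = rnormal()
        * Assumed population shares (Asian share set to 6% so they sum to 1)
        gen share = .68 if group == 1
        replace share = .12 if group == 2
        replace share = .06 if group == 3
        replace share = .14 if group == 4
        * Sample size of each group, counted from the data
        bysort group: gen n_sample = _N
        * weight = population share * assumed total population / group sample size
        gen weight = share*1000000/n_sample
        * Declare the weights as sampling weights and estimate a weighted mean
        svyset [pweight = weight]
        svy: mean y
        ```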
        Last edited by Friedrich Huebler; 18 Apr 2017, 23:15. Reason: Edit: math simplified.



        • #5
          Thank you both!!!
