  • Weighted sample

    Hi everyone! I need some advice. I ran a survey and collected data. What is the problem if I do not use svyset?
    My data are not balanced against the national population. Is there a way to fix this issue? I created variables that hold the weights for gender, education and age, but I am not able to use them simultaneously in my ordered logit analysis.

    The command I used is:
    ologit salary sex [aweight= sex_weight] age education, vce(robust)
    but if I add another weight, for instance [aweight=age_weight], it does not work.

    Any idea?

    Thank you very much for your help!

    Andrea

  • #2
    Hi Andrea,

    I am not sure whether your first question is asking (1) why use svyset instead of specifying the weight directly in each command, or (2) why use survey weights at all, and what the ramifications are of not weighting the data according to the complex sampling structure.

    To the first point, svyset is a convenience: you declare the survey design once and then simply prefix any eligible command with svy:, and it will give you the properly weighted results. There are lots of logistical reasons to use svyset and svy, but you can achieve the weighting simply by specifying the weight for any eligible command.
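
    As a minimal sketch of the two routes (final_wt is a hypothetical name for a single survey weight variable; the outcome and covariates are taken from the command in the original post):
    Code:
    * Route 1: pass the weight directly to the command
    ologit salary sex age education [pweight = final_wt]

    * Route 2: declare the weight once, then prefix commands with svy:
    svyset [pweight = final_wt]
    svy: ologit salary sex age education

    With only a weight declared, both routes should give the same point estimates.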

    To the second point, unweighted results describe the observed sample, not the population you may want it to represent after proper re-weighting. It depends on your goal. Some would say that if causality or associations are your goal, weighting likely won't matter. If you actually want to report results for the population for which the weights were derived, you need to weight the analysis properly.

    As for why you can't use multiple weights, this just isn't how Stata handles weights: a command accepts only one weight variable. You need to create a weight that jointly describes the weight for every stratum defined by your weighting variables. In other words, if you want to weight by sex (2 levels) and education (let's say 4 levels), you will need a single weight variable that defines weights for the 8 unique sex-education strata. That single weight variable is then used in the analysis. This, of course, extends to including age or any other variables for which weights were derived.
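
    One way to build such a joint weight can be sketched as follows. This is only an illustration: cell, n_cell, samp_share and joint_wt are hypothetical names, and pop_share is assumed to be a variable you have already merged in that holds the known population proportion of each sex-education cell (for example from census tables).
    Code:
    * One identifier per sex-education stratum (8 cells in the example above)
    egen cell = group(sex education)

    * Sample count and sample proportion in each stratum
    bysort cell: gen n_cell = _N
    gen samp_share = n_cell / _N

    * Joint weight: population share divided by sample share (pop_share is assumed)
    gen joint_wt = pop_share / samp_share

    * Use the single joint weight in the model
    ologit salary sex age education [pweight = joint_wt]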



    • #3
      Thank you very much! My question was the first one!
      Talking about multiple weights: how can I create these weights in Stata? It is not simply the product of each weight, is it?
      Thank you very much!!



      • #4
        Yes, the new combined weights are the product of the individual weights. Is there a reason you are using analytic weights? Is this the correct weighting choice for you? I would think based on your description that you want to use sampling or probability weights (pweight).
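
        A minimal sketch of the product, using sex_weight and age_weight from the original post and assuming the education weight is stored in a variable called education_weight (a hypothetical name, as is combined_wt):
        Code:
        * Combine the separate weights into one variable
        gen combined_wt = sex_weight * age_weight * education_weight

        * Use the single combined weight as a probability weight
        ologit salary sex age education [pweight = combined_wt]

        Note that multiplying separate weights treats the weighting variables as independent of each other; if they are strongly related, weights defined on the joint cells described in #2 may be the safer choice.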

        See this link for further discussion: https://www.cpc.unc.edu/research/tools/data_analysis/statatutorial/sample_surveys/weight_syntax

        If your weights are the probabilities of sampling, or the inverse of the sampling probability (sampling weights), you will want to use the pweight option.
        Code:
        * If the variable holds the sampling probability (prob), invert it in the weight expression
        regress y x1 x2 x3 [pw = 1/prob]

        * Or store the sampling weight (1 / prob) in a variable first
        gen samp_wt = 1 / prob
        regress y x1 x2 x3 [pw = samp_wt]



        • #5
          Just to add to Matt's response in #2 about svyset: standard errors for survey samples are based on variation between PSUs within strata. If you have PSUs as well as weights and ignore svyset, all your standard errors will be wrong.
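
          A minimal sketch of declaring the full design, assuming hypothetical variables psu_id (primary sampling unit), stratum_id (stratum) and final_wt (sampling weight):
          Code:
          * Declare PSUs, weights and strata once
          svyset psu_id [pweight = final_wt], strata(stratum_id)

          * Check the declared design (strata and PSU counts)
          svydescribe

          * Standard errors now reflect the PSUs and strata
          svy: ologit salary sex age education
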
          Steve Samuels
          Statistical Consulting

          Stata 14.2
