  • Weighting of European Social Survey data in Stata

    Greetings, I'm new to this forum and relatively new to Stata.

    I am working with the European Social Survey round 1 (2002) in Stata. This data set was not originally intended for use in Stata, so I am struggling with the weighting. I will be combining data from countries and referring to average values, so I understand (from the ESS guide on weighting) that I will need to combine both the “design weight” [DWEIGHT] and the population size weight [PWEIGHT] into a new weight variable.

    What I don’t understand: the Stata user guide lists four kinds of weights: fweights, aweights, pweights, and iweights. When I type the command with the square bracket [weightword=exp], which weightword should I use?

    Thanks very much in advance

    Examples of the type of analyses I am interested in running:
    tab dscrgrp ilglpst [weightword=exp], chi2 expected row

    logit ilglpst eduyrs dscrgrp polintr [weightword=exp]

  • #2
    In the ESS, dweight corrects for within-country sampling features, while pweight rescales countries to account for their different sample-to-population ratios. If you want a pooled analysis (where, say, Germany will have a much bigger influence than Ireland, which will in turn have a much bigger influence than Luxembourg), I would experiment with the product of the two weight variables as an "analytic" weight: . gen wgt=dweight*pweight . (command) [aw=wgt] However, if you're bringing country in as an explanatory factor, I'd want to think carefully about what the weighting means, and perhaps use just dweight.


    • #3
      That got a little garbled: somewhere hard line-breaks are being stripped.

      Code:
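      * Pooled analysis: combine the design and population-size weights
      gen wgt = dweight*pweight
      reg trstprl age i.gndr [aw=wgt]
      * Country as an explanatory factor: design weight only
      encode cntry, gen(country)
      reg trstprl c.age##i.country i.gndr [aw=dweight]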


      • #4
        Here is a rather complicated explanation: http://www.europeansocialsurvey.org/...hting_data.pdf It sounds to me like, if multiple countries are involved, then the wgt variable computed by Brendan would be used. I am not sure why it would be aweights rather than pweights though. As Brendan says, there may be circumstances when, in comparing countries, only dweights should be used. Even if only analyzing one country, I think Brendan's wgt variable could be used. Anyway, my inclination is Brendan's wgt with pweights, but I have never worked with the data before so some expert may know otherwise.
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        Stata Version: 17.0 MP (2 processor)

        EMAIL: [email protected]
        WWW: https://www3.nd.edu/~rwilliam
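
To make the aweight/pweight distinction concrete, here is a minimal sketch using the variables already named in this thread. It assumes the ESS round 1 file is in memory, that wgt has not been created yet, and that ilglpst is coded 0/1 (logit treats 0 as the negative outcome and any other nonmissing value as positive). As far as I know from the Stata documentation, regress accepts both weight types, whereas logit accepts pweights but not aweights.

```stata
* Assumption: dweight and pweight are the ESS design and population-size weights
* Skip this line if wgt already exists in the dataset
gen double wgt = dweight*pweight

* Analytic weights: conventional (model-based) standard errors
regress trstprl age i.gndr [aw=wgt]

* Sampling weights: same coefficients, robust standard errors
regress trstprl age i.gndr [pw=wgt]

* logit with sampling weights (assumes ilglpst is coded 0/1)
logit ilglpst eduyrs dscrgrp polintr [pw=wgt]
```

The two regress calls should return identical coefficients; only the standard errors differ, because pweights imply robust variance estimation.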


        • #5
          Incidentally all of my line breaks are getting stripped too. At least when typing on my iPad.


          • #6
            The pweight variable is constant within country, so there is no point in including it for single-country analyses. Re aweight vs pweight in the Stata command, Richard is right; I use aweight out of bad habit.
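
One way to check this on your own copy of the data, assuming cntry identifies the country as in the code in #3:

```stata
* If pweight is constant within country, min and max coincide in every row
tabstat pweight, by(cntry) statistics(min max)
```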


            • #7
              Thanks everyone! I will try using [pweight=wgt] after generating the new wgt variable.


              • #8
                Originally posted by Brendan Halpin
                The pweight variable is constant within country, so there is no point in including it for single-country analyses. Re aweight vs pweight in the Stata command, Richard is right; I use aweight out of bad habit.
                What I meant is that, when only analyzing one country, dweight and the above computed wgt give the same results (I think). Since wgt should be used when more than one country is being analyzed, it seems to me you might as well always use wgt.
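
This can be checked directly. A sketch, assuming wgt has been generated as dweight*pweight and that cntry holds two-letter country codes ("IE" is only an illustration):

```stata
* Single-country model with the design weight alone
regress trstprl age i.gndr [pw=dweight] if cntry == "IE"

* Same model with the combined weight; pweight is a constant within country
regress trstprl age i.gndr [pw=wgt] if cntry == "IE"
```

Rescaling sampling weights by a constant leaves the coefficients and robust standard errors unchanged, so the two outputs should match apart from the reported sum of weights.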

                The other question is whether, in this case, it is better to add the [pw=wgt] option to every command or just to svyset the data. I think I will start a separate thread on that.
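
For reference, the svyset route looks roughly like this. It is only a sketch: it declares the weight and nothing else, since no PSU or strata variables have been mentioned in this thread, and it assumes wgt exists and ilglpst is coded 0/1.

```stata
* Declare the sampling weight once (no PSU or strata information assumed)
svyset [pw=wgt]

* The svy: prefix then applies the declared weight to each command
svy: logit ilglpst eduyrs dscrgrp polintr
svy: tabulate dscrgrp ilglpst, row pearson
```

svy: tabulate is also one way to get a weighted two-way table with a design-based test, since plain tabulate does not accept pweights.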
