Weighting of European Social Survey data in Stata

Rob Mowry

Join Date: Apr 2014

Posts: 2
#1

23 Apr 2014, 19:16

Greetings, I'm new to this forum and relatively new to Stata.

I am working with the European Social Survey round 1 (2002) in Stata. This data set was not originally intended for use in Stata, so I am struggling with the weighting. I will be combining data from countries and referring to average values, so I understand (from the ESS guide on weighting) that I will need to combine both the “design weight” [DWEIGHT] and the population size weight [PWEIGHT] into a new weight variable.

What I don’t understand: the Stata user guide lists four kinds of weights: fweights, pweights, aweights, and iweights. When I type a command with the square-bracket syntax [weightword=exp], which weightword should I use?

Thanks very much in advance

Examples of the type of analyses I am interested in running:
tab dscrgrp ilglpst [weightword=exp], chi2 expected row

logit ilglpst eduyrs dscrgrp polintr [weightword=exp]
Brendan Halpin

Join Date: Mar 2014

Posts: 152
#2

24 Apr 2014, 01:55

In the ESS, dweight adjusts for within-country sampling features, while pweight scales countries to account for different sample/population ratios. If you want a pooled analysis (where, say, Germany will have a much bigger influence than Ireland, which will in turn have a much bigger influence than Luxembourg), I would experiment with the product of the two weight variables as an "analytical" weight: . gen wgt=dweight*pweight . (command) [aw=wgt] However, if you're bringing country in as an explanatory factor, I'd want to think carefully about what the weighting means, and perhaps use just dweight.
Brendan Halpin

Join Date: Mar 2014

Posts: 152
#3

24 Apr 2014, 05:20

That got a little garbled: somewhere hard line-breaks are being stripped.

Code:

gen wgt = dweight*pweight
reg trstprl age i.gndr [aw=wgt]
encode cntry, gen(country)
reg trstprl c.age##i.country i.gndr [aw=dweight]
Richard Williams

Join Date: Apr 2014

Posts: 5043
#4

24 Apr 2014, 06:34

Here is a rather complicated explanation: http://www.europeansocialsurvey.org/...hting_data.pdf It sounds to me like, if multiple countries are involved, then the wgt variable computed by Brendan would be used. I am not sure why it would be aweights rather than pweights though. As Brendan says, there may be circumstances when, in comparing countries, only dweights should be used. Even if only analyzing one country, I think Brendan's wgt variable could be used. Anyway, my inclination is Brendan's wgt with pweights, but I have never worked with the data before so some expert may know otherwise.
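To make the aweight versus pweight difference concrete, here is a minimal sketch on made-up data. The variable names imitate the ESS ones, but the values are random and the two-"country" layout is purely an assumption for illustration. The point estimates should agree under both weight types; what changes is that pweight makes Stata report robust standard errors.

```stata
* Toy data only: names mimic ESS variables, values are simulated.
clear
set seed 12345
set obs 200
generate age     = 18 + floor(60*runiform())
generate trstprl = floor(11*runiform())
generate dweight = 0.5 + runiform()
generate pweight = cond(_n <= 100, 0.3, 2.1)   // constant within each toy "country"
generate wgt     = dweight*pweight

* Analytic weights: conventional standard errors
regress trstprl age [aweight=wgt]
matrix b_aw  = e(b)
scalar se_aw = _se[age]

* Sampling weights: same coefficients, robust standard errors
regress trstprl age [pweight=wgt]
matrix b_pw  = e(b)
scalar se_pw = _se[age]

display "relative difference in coefficients: " mreldif(b_aw, b_pw)
display "se(age) with aweight: " se_aw "   with pweight: " se_pw
```

If the coefficients match and only the standard errors move, that is the practical consequence of choosing one weightword over the other here.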

-------------------------------------------
Richard Williams
Professor Emeritus of Sociology
University of Notre Dame
StataNow Version: 19.5 MP (2 processor)
WWW: https://academicweb.nd.edu/~rwilliam/
Richard Williams

Join Date: Apr 2014

Posts: 5043
#5

24 Apr 2014, 06:35

Incidentally all of my line breaks are getting stripped too. At least when typing on my iPad.

Brendan Halpin

Join Date: Mar 2014

Posts: 152
#6

24 Apr 2014, 06:43

The pweight variable is constant within country, so there is no point in including it for single-country analyses. Re aweight vs pweight in the Stata command, Richard is right; I use aweight out of bad habit.
Rob Mowry

Join Date: Apr 2014

Posts: 2
#7

24 Apr 2014, 07:22

Thanks everyone! I will try using [pweight=wgt] after generating the new wgt variable.
Richard Williams

Join Date: Apr 2014

Posts: 5043
#8

24 Apr 2014, 09:59

Originally posted by Brendan Halpin:

The pweight variable is constant within country, so there is no point in including it for single-country analyses. Re aweight vs pweight in the Stata command, Richard is right; I use aweight out of bad habit.

What I meant is that, when only analyzing one country, dweight and the wgt variable computed above give the same results (I think). Since wgt should be used when more than one country is being analyzed, it seems to me you might as well always use wgt.
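A quick way to check the single-country claim is a sketch like the following, again on simulated data rather than the real ESS file. The value 0.45 for pweight is an arbitrary stand-in. Because pweight is constant within a country, wgt is just dweight times a constant, and a constant rescaling of sampling weights should leave both the coefficients and the robust variance matrix unchanged. I use regress so the comparison is not blurred by iterative convergence.

```stata
* Toy single-country data: names mimic ESS variables, values are simulated.
clear
set seed 2002
set obs 150
generate eduyrs  = 8 + floor(10*runiform())
generate trstprl = floor(11*runiform())
generate dweight = 0.5 + runiform()
generate pweight = 0.45                  // one country, so a single value
generate wgt     = dweight*pweight

regress trstprl eduyrs [pweight=dweight]
matrix b_d = e(b)
matrix V_d = e(V)

regress trstprl eduyrs [pweight=wgt]
matrix b_w = e(b)
matrix V_w = e(V)

* Both differences should be zero up to rounding error
display "coefficients: " mreldif(b_d, b_w) "   variances: " mreldif(V_d, V_w)
```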

The other question is whether, in this case, it is better to add the [pw=wgt] option to every command or just to svyset the data. I think I will start a separate thread on that.
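For reference, the two approaches look roughly like this. It is a sketch on simulated data, and it declares only the weight: no PSU or strata variables are set, which is an assumption made for simplicity and not a statement about the actual ESS design. One practical point in favor of svyset: as far as I know, plain tabulate does not accept pweights, whereas svy: tabulate works from the declared design.

```stata
* Toy two-country data: names mimic ESS variables, values are simulated.
clear
set seed 42
set obs 400
generate byte country = 1 + (_n > 200)
generate eduyrs       = 8 + floor(10*runiform())
generate byte dscrgrp = runiform() < 0.3
generate byte ilglpst = runiform() < invlogit(-1 + 0.05*eduyrs + 0.5*dscrgrp)
generate dweight = 0.5 + runiform()
generate pweight = cond(country == 1, 0.3, 2.1)
generate wgt     = dweight*pweight

* Option 1: repeat the weight on every estimation command
logit ilglpst eduyrs dscrgrp [pweight=wgt]
matrix b_pw = e(b)

* Option 2: declare the weight once, then prefix commands with svy:
svyset [pweight=wgt]
svy: logit ilglpst eduyrs dscrgrp
matrix b_svy = e(b)

display "relative difference in coefficients: " mreldif(b_pw, b_svy)

* Weighted two-way table with a design-based Pearson test
svy: tabulate dscrgrp ilglpst, row pearson
```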
