Weighting for representative sample with ESS data

Tarjei W. Havneraas

Join Date: Nov 2016

Posts: 136
#1

Weighting for representative sample with ESS data

26 Sep 2017, 18:24

Hi

I am following a weighting procedure to create a representative sample for multilevel analysis in Mehmetoglu & Jakobsen (2016) Applied Statistics Using Stata.

In short, my data set consists of observations for 29 countries over seven ESS rounds (pooled data) and I am using a two-level model (individual and contextual (country) level).

To weight the data correctly I follow the book's example for two-level analysis (conveniently, it also uses ESS data). Two weights are needed: a design weight (used when individuals within a given country do not all have the same chance of being selected for the survey) and a sample weight (to account for sample sizes that vary between countries). The latter is not in the ESS data set and must be constructed.

The book uses the following approach:

We assume that each unit (country) should have the same number of respondents: we need to adjust for this. The mean N is 48,487/25 = 1939.48. We now divide the mean by the N for each country and get the value for persons from that country on our sample weight:

Code:

quietly levelsof cntry
local numctry: word count `r(levels)’
quietly count if missing(cntry)
local cntpercntry = (_N-r(N))/`numctry’
bys cntry: gen sample_weight = `cntpercntry’ / _N if !missing(cntry)

We then multiply the design weight with our new sample weight:

Code:

gen designsample=dweight*sample_weight

By using designsample we have a representative sample from each country, and each Pole, Norwegian, and Russian counts equally in our model. It is also possible to use the same procedure with a weight that takes into account population size (e.g., the fact that there are more Germans in Europe than there are Danes), and the ESS has its own pre-coded variable for this purpose.
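A commented version of the book's steps, typed with Stata's own macro quotes (backtick to open, straight apostrophe to close), might look like the sketch below. It assumes the ESS variables cntry and dweight are in memory. The closing tabstat line is an added check and not part of the book's code: the sample weights within each country should sum to the mean country N.

```stata
* Sketch only: assumes cntry (country identifier) and dweight (ESS design
* weight) exist in the data set in memory.

* Count the distinct countries
quietly levelsof cntry
local numctry : word count `r(levels)'

* Mean number of respondents per country, ignoring rows with no country
quietly count if missing(cntry)
local cntpercntry = (_N - r(N)) / `numctry'

* Sample weight: mean country N divided by this country's own N
* (under bysort, _N is the number of observations in the current country)
bysort cntry: generate sample_weight = `cntpercntry' / _N if !missing(cntry)

* Combined weight used in the model
generate designsample = dweight * sample_weight

* Added check (not in the book): each country's sample weights should
* sum to the same value, the mean country N
tabstat sample_weight, by(cntry) statistics(sum)
```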

When I run the syntax I receive invalid syntax r(198) after

Code:

local cntpercntry = (_N-r(N))/`numctry'

I assume this is because I have not defined _N-r(N) properly, and I wonder if anyone on Statalist can help me solve this problem.

Best
Tarjei W. Havneraas

Last edited by Tarjei W. Havneraas; 26 Sep 2017, 18:42.
Clyde Schechter

Join Date: Apr 2014

Posts: 30119
#2

26 Sep 2017, 18:36

No, the problem is not with _N-r(N). The problem is with

Code:

`numctry’

That final quote character is incorrect. It should be the simple, straight, vertical quote character that is just to the right of the semicolon (on a standard US keyboard). The curved or slanted quote character at the end is not correct. (The slanted opening quote at the beginning of numctry is correct.)
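For reference, a local macro is dereferenced with the backtick (the key to the left of 1 on a standard US keyboard) before the name and the plain apostrophe after it. A minimal sketch, using a made-up value of 25:

```stata
* Hypothetical value, for illustration only
local numctry 25

* Opening quote is the backtick, closing quote is the straight apostrophe
display `numctry'
```

Word processors and some web editors silently replace the apostrophe with a curly quote, which Stata does not accept as the closing delimiter.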
Tarjei W. Havneraas

Join Date: Nov 2016

Posts: 136
#3

26 Sep 2017, 18:45

I might have misspelled the last code. I have the correct character in my syntax and I corrected the post. Can the problem be that I need to construct _N, r, or N (or all of them)?
Clyde Schechter

Join Date: Apr 2014

Posts: 30119
#4

26 Sep 2017, 18:52

There is no other error in the code you posted. I ran your code, changing nothing but that, using a dataset I have to provide a variable cntry. It ran without any errors noted:

Code:

. clear*
. use oecd_countries
. rename country cntry
. quietly levelsof cntry
. local numctry: word count `r(levels)’
. quietly count if missing(cntry)
. local cntpercntry = (_N-r(N))/`numctry'
. bys cntry: gen sample_weight = `cntpercntry' / _N if !missing(cntry)

That said, in light of #3, my guess is that what you posted in #1 is not the actual code you ran. You've already indicated that you "may have mistyped" that last character. Well, maybe you mistyped some others as well.

When you are asking for help with code here you should NEVER retype your code to show--for precisely this reason. You should copy it from the Results window, or your do-file, or your log file, and paste it into the Forum editor. That assures that what you show is exactly what you ran. There is no such thing as an insignificant detail: every little thing matters in programming.

Last edited by Clyde Schechter; 26 Sep 2017, 18:55.
Tarjei W. Havneraas

Join Date: Nov 2016

Posts: 136
#5

27 Sep 2017, 05:16

Thank you for your reply. I copied and pasted the code from my do-file but mistyped the latter code, which I will not do again. The code you provided worked fine for me as well, so I compared the syntax to identify the problem. Turns out the problem was here:

Code:

quietly levelsof cntry
local numcrty: word count `r(levels)'
quietly count if missing(cntry)
local cntpercntry = (_N-r(N))/`numctry'
bys cntry: gen sample_weight = `cntpercntry' / _N if !missing(cntry)
gen designsample=dweight*sample_weight

Instead of typing

Code:

`r(levels)’

I typed

Code:

`r(levels)'
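Separately from the quote character, the code as shown in this post defines the macro as numcrty but references it as numctry. Stata expands an undefined local macro to nothing, so that mismatch alone would leave the division without a denominator, which seems consistent with the invalid syntax error reported in #1. A minimal sketch with made-up values, assuming numctry has not been defined earlier in the same do-file or session:

```stata
* Hypothetical values, for illustration only.
* The macro is defined as numcrty (letters transposed) ...
local numcrty 25

* ... so numctry was never defined and expands to nothing
display "[`numctry']"

* The division is then left without a denominator; capture noisily shows
* the error message without stopping the do-file
capture noisily local cntpercntry = 100/`numctry'

* A nonzero return code here indicates that the line failed
display _rc
```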
Chiara Allegri

Join Date: Jan 2018

Posts: 1
#6

08 Jan 2018, 04:14

Hi Tarjei,
I am also working on ESS pooled dataset.

You mentioned that "sample weight is not in the ESS data set and must be constructed".

However, on the ESS file called "Weighting European Social Survey Data", I found:
"PWEIGHT: These weights correct for the fact that most countries taking part in the ESS have different population sizes but similar sample sizes"

Is this not the sample weight you are referring to?
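On the distinction raised here: as the passage quoted in #1 notes, the ESS ships a pre-coded weight for population size, which serves a different purpose from the equal-country sample weight built in #1. A minimal sketch of the two combinations, assuming the ESS variables dweight and pweight and the sample_weight variable from #1 are present. The new variable names w_equal and w_pop are made up for illustration:

```stata
* Sketch only: w_equal and w_pop are hypothetical names.

* Each country counts equally (the approach quoted in #1)
generate double w_equal = dweight * sample_weight

* Each country counts in proportion to its population size
generate double w_pop = dweight * pweight
```

Which of the two is appropriate depends on whether the analysis should treat countries as equal units or reflect their population sizes.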
