Hi
I am following the weighting procedure in Mehmetoglu & Jakobsen (2016), Applied Statistics Using Stata, to create a representative sample for multilevel analysis.
In short, my data set consists of observations for 29 countries over seven ESS rounds (pooled data), and I am using a two-level model with an individual level and a contextual (country) level.
To weight the data correctly, I follow the book's example for two-level data analysis (conveniently, it uses ESS data). Two weights are needed: a design weight (used when it is not possible for all individuals in a given country to have the same chance of being selected for the survey) and a sample weight (to account for varying sample sizes between countries). The latter is not in the ESS data set and must be constructed.
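The book uses the following approach (the passage and its code are reproduced at the end of this post).
When I run the syntax, I receive "invalid syntax" r(198) after the line that defines the local macro cntpercntry (the last code block in this post).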
I assume this is because I have not defined _N-r(N) properly, and I wonder if anyone on Statalist can help me solve this problem.
Best
Tarjei W. Havneraas
We assume that each unit (country) should have the same number of respondents: we need to adjust for this. The mean N is 48,487/25 = 1939.48. We now divide the mean by the N for each country and get the value for persons from that country on our sample weight:
We then multiply the design weight with our new sample weight:
By using designsample we have a representative sample from each country, and each Pole, Norwegian, and Russian counts equally in our model. It is also possible to use the same procedure with a weight that takes into account population size (e.g., the fact that there are more Germans in Europe than there are Danes), and the ESS has its own pre-coded variable for this purpose.
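To make the arithmetic in the quoted passage concrete, here is a minimal sketch on made-up toy data (three hypothetical countries, nine respondents). It is my own illustration, not the book's code: the names tag, n_cntry, mean_n, n_c and check are invented for the example, and it uses scalars and egen in place of the book's local macros.

```stata
* Toy data (hypothetical): three countries with unequal sample sizes
clear
input str2 cntry dweight
"NO" 1.0
"NO" 0.8
"PL" 1.2
"PL" 1.0
"PL" 0.9
"PL" 1.1
"RU" 1.0
"RU" 1.0
"RU" 1.0
end

* Number of countries: tag one observation per country and count the tags
egen byte tag = tag(cntry)
quietly count if tag
scalar n_cntry = r(N)

* Mean number of respondents per country (observations with non-missing cntry)
quietly count if !missing(cntry)
scalar mean_n = r(N) / scalar(n_cntry)

* Respondents per country, then sample weight = mean N / country N
bysort cntry: gen long n_c = _N if !missing(cntry)
gen double sample_weight = scalar(mean_n) / n_c

* Combined weight: design weight times sample weight
gen double designsample = dweight * sample_weight

* Check: within each country the sample weights should sum to the mean N
bysort cntry: egen double check = total(sample_weight)
list cntry n_c sample_weight designsample check, sepby(cntry)
```

With nine respondents in three countries the mean N is 3, so the sample weight is 1.5 for NO (2 respondents), 0.75 for PL (4 respondents) and 1 for RU (3 respondents), and the check column equals 3 in every country.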
Code:
quietly levelsof cntry
local numctry: word count `r(levels)'
quietly count if missing(cntry)
local cntpercntry = (_N-r(N))/`numctry'
bys cntry: gen sample_weight = `cntpercntry' / _N if !missing(cntry)
Code:
gen designsample=dweight*sample_weight
Code:
local cntpercntry = (_N-r(N))/`numctry'