Correct weighting with grouped data in the CPS

Cortnie Shupe


14 Feb 2017, 18:10

I am using Stata 14.1 and running regressions on CPS data that I have collapsed into weighted state-year averages. My question concerns the correct weighting: to calculate average partial effects, I want to account not only for within-state over- or undersampling of certain groups, but also for the different population counts across states.

To simplify, say I want to run a regression of log wages on the minimum wage. The minimum wage variable comes from an external source and is not weighted. It is a state-by-year variable that I have merged into my dataset.

I want to run the regression on state-year grouped data and therefore create weighted means with the following command:

collapse (mean) lwages [aw=earnwt], by(state year)

where earnwt is the individual weight variable to be used for analyses of wages, calculated as 1/(probability of being in the sample), and lwages is the individual log wage.

Then I merge in my minimum wage:

merge 1:1 state year using "${data}\mw.dta", keepusing(logmw)

where logmw is the log minimum wage, which varies by state and year, and I run the following regression:

reg lwages logmw i.state i.year, cluster(state) robust

My question: the log wages are weighted averages that take within-state sampling differences into account. It seems, though, that I would still need to account for the different population sizes across states in order to obtain consistent average partial effects. Would it suffice to create a second weight equal to the number of observations per state (called "population" below) and then run the regression with frequency weights, as in the example below? Or would I again need to account for the within-state weights, for instance by using the average earnwt in each state-year cell multiplied by the number of observations in that cell?

reg lwages logmw i.state i.year [fw=population], cluster(state) robust
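
To make the second alternative concrete, here is a rough sketch of what I have in mind. I am not sure it is the right approach. It assumes the individual-level data are in memory, and the names one, sumwt and nobs are ones I made up for illustration. Since the average earnwt in a cell times the number of observations in that cell is just the sum of earnwt in the cell, I keep that sum when collapsing. Because it is not an integer, I use analytic weights here instead of frequency weights:

```stata
* Sketch only: assumes individual-level CPS data with lwages, earnwt, state, year.
* one, sumwt and nobs are hypothetical names introduced for this illustration.
keep if !missing(lwages, earnwt)
gen byte one = 1

* Weighted cell mean of log wages, plus two candidate cell-level weights:
*   sumwt = unweighted sum of earnwt in the cell (= mean earnwt x cell size)
*   nobs  = number of sampled individuals in the cell
collapse (mean) lwages (rawsum) sumwt = earnwt nobs = one [aw=earnwt], by(state year)

merge 1:1 state year using "${data}\mw.dta", keepusing(logmw)

* Analytic weights accept non-integer values, unlike frequency weights
reg lwages logmw i.state i.year [aw=sumwt], vce(cluster state)
```

Replacing sumwt with nobs in the last line would correspond to weighting by the raw number of observations per cell instead.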

Many thanks in advance if someone could give some guidance on this.