Hello,
I'm working on wage gaps using a dataset with weights. I am following code by Mr. Cox from "https://www.statalist.org/forums/forum/general-stata-discussion/general/1479555-problem-using-bysort-egen-mean-and-weight-together".
sort year
bysort year: egen double den = total(weight) if GroupA==1
bysort year: egen double weightedwageGroupA = total(hourlywage*weight) if GroupA==1
replace weightedwageGroupA = weightedwageGroupA/den if GroupA==1
drop den
sort year
bysort year: egen double den = total(weight) if GroupB==1
bysort year: egen double weightedwageGroupB = total(hourlywage*weight) if GroupB==1
replace weightedwageGroupB = weightedwageGroupB/den if GroupB==1
drop den
// summarize shows that the code above worked: both variables have a mean, sd, ...
gen wagegap = (weightedwageGroupA / weightedwageGroupB) -1
"(300,000 missing values generated)"
There has to be a logic error somewhere. "weight" is a probability weight.
I was able to work around this problem, but the workaround only gives me the mean wage gap, without a standard deviation:
egen meanweightedwageGroupA = mean(weightedwageGroupA)
egen meanweightedwageGroupB = mean(weightedwageGroupB)
gen wagegap = (meanweightedwageGroupA / meanweightedwageGroupB) - 1
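A possible explanation for the missing values: because of the if qualifiers, weightedwageGroupA is filled in only on GroupA rows and weightedwageGroupB only on GroupB rows, so no observation has both and the ratio is missing everywhere. Below is a minimal sketch of one way to put both yearly weighted means on every observation of a year. It assumes GroupA and GroupB are 0/1 indicators; the names numA, denA, numB, denB and wagegap_year are hypothetical, and it is untested on my data.

```stata
* Sketch only: spread each group's yearly weighted mean over all observations.
* Assumes GroupA and GroupB are 0/1 indicators; new variable names are hypothetical.
* cond() returns missing outside the group, and egen total() treats missing as 0.
bysort year: egen double numA = total(cond(GroupA == 1, hourlywage * weight, .))
bysort year: egen double denA = total(cond(GroupA == 1 & !missing(hourlywage), weight, .))
bysort year: egen double numB = total(cond(GroupB == 1, hourlywage * weight, .))
bysort year: egen double denB = total(cond(GroupB == 1 & !missing(hourlywage), weight, .))

* Ratio of weighted means: constant within a year, varying across years
gen double wagegap_year = (numA / denA) / (numB / denB) - 1
drop numA denA numB denB
```

With this layout the gap takes one value per year, so any spread would be across years rather than across individuals.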
Thank you for your help in advance!
Best, Aron Mueller.