survey data (pweight)

Anshul Anand

Join Date: May 2015

Posts: 113
#1

29 May 2015, 08:15

Hello everyone. I made some tables showing the top ten occupations by gender according to 1) absolute number and 2) percentage. The code for women (women coded as 1, men as 0) was:

Code:

egen n_female = total(gender==1), by(occupation)
egen p_female = mean(gender), by(occupation)
egen tag = tag(occupation)

* 1) top ten by absolute number
gsort -tag -n_female
gen order = _n
su gender if gender==1, meanonly
gen ppcumtotal_female = 100 * sum(n_female) / r(N) // shows the cumulative percentage of women from all women
tabdisp order in 1/10, c(occupation n_female ppcumtotal_female p_female)

* 2) top ten by percentage
gsort -tag -p_female
replace order = _n
tabdisp order in 1/10, c(occupation p_female n_female)

The problem is that I realized today that I have to weight the survey data with pweights. The weight variable is called "weight" and has a different value for each observation, so I don't have to calculate the weights myself. I probably have to change something in this code, but I don't really know what to do.

I would be very thankful for some help.

Last edited by Anshul Anand; 29 May 2015, 08:35.
Anshul Anand

Join Date: May 2015

Posts: 113
#2

29 May 2015, 14:41

I think the main problem is that "egen" cannot be used with the "svy" prefix. How can this problem be solved?
Comment
Stephen Jenkins

Join Date: Apr 2014

Posts: 1435
#3

29 May 2015, 15:19

There's nothing in your code above that requires svy as far as I can see. (You are not calculating SEs.) For calculating a total, above you coded

Code:

egen n_female = total (gender==1), by(occupation)

To get a weighted total, I think you could change it to

Code:

egen n_female = total ( weight*(gender==1) ), by(occupation)

with similar tricks elsewhere.
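As a quick sanity check of this trick, here is a minimal sketch on the auto data. The stand-ins are assumptions made only for illustration and are not from the question: "trunk" plays the weight, "foreign" plays gender and "rep78" plays occupation. Within a group, the egen total of weight*indicator should equal the plain sum of the weights over the group's "women".

```stata
sysuse auto, clear
gen wt = trunk                     // stand-in sampling weight (assumption)

* weighted total of "women" (foreign==1) within each "occupation" (rep78)
egen n_female = total(wt*(foreign==1)), by(rep78)

* cross-check one group: sum the weights directly over foreign cars with rep78==4
su wt if foreign==1 & rep78==4, meanonly
assert n_female == r(sum) if rep78==4
```

If the assert passes silently, the two ways of computing the weighted total agree for that group.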
Comment
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#4

30 May 2015, 19:29

To continue Stephen's code:

Code:

egen n_pop = total(weight*(gender!=.)), by(occupation)
gen p_female = n_female/n_pop
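
One way to convince yourself that this ratio is the weighted proportion is to compare it with a weighted mean from summarize. The sketch below uses the auto data with assumed stand-ins ("trunk" as weight, "foreign" as gender, "rep78" as occupation), none of which come from the original question. summarize does not accept pweights, so aweights are used here; they give the same point estimate of the mean.

```stata
sysuse auto, clear
gen wt = trunk                     // stand-in sampling weight (assumption)

egen n_female = total(wt*(foreign==1)), by(rep78)
egen n_pop    = total(wt*(foreign!=.)), by(rep78)
gen  p_female = n_female/n_pop

* the weighted mean of the 0/1 indicator in one group should match p_female
su foreign [aw=wt] if rep78==4, meanonly
assert reldif(p_female, r(mean)) < 1e-6 if rep78==4
```

The comparison uses reldif() with a small tolerance because p_female is stored as a float.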

Steve Samuels
Statistical Consulting

Stata 14.2
Comment
Anshul Anand

Join Date: May 2015

Posts: 113
#5

01 Jun 2015, 10:45

Thanks a lot for the help! One question: how should I deal with the cumulative percentage? Like this:

Code:

svy: su gender if gender==1, meanonly
gen pccumtotal_female = 100 * sum(n_female) / r(N)

where n_female is the weighted total?
Comment
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#6

01 Jun 2015, 15:54

I have no idea what your problem is. Stephen and I created the weighted variables that you asked for, so that you could use them in your original code. Now you 1) add a svy command that has no connection to anything; and 2) you ask how to "deal" with a variable that presumably was working with the non-weighted variables (since you had already created the table). Please read FAQ Section 12.

Steve Samuels
Statistical Consulting

Stata 14.2
Comment

Steve Samuels

Join Date: Mar 2014

Posts: 1786
#7

03 Jun 2015, 20:59

I think I see your problem, but if you want only the top 10 occupations, you need to compute the cumulative proportions starting with the largest. Here's code based on the auto data. Here "rep78" plays the role of occupation and "foreign" is your "female". Set a macro "k" for the number of categories you want. In the example I take k = 2.

Code:

local k = 2 /* number of top-ranked categories; substitute 10 in your problem */
sysuse auto, clear

egen tag = tag(rep78)
gen  wt = trunk  // sample weight

/* Create weighted totals and means */
egen ttot = total(wt*foreign), by(rep78)
egen tn   = total(wt*(foreign!=.)), by(rep78)
gen  fmean = ttot/tn

tempfile t1
save `t1', replace

keep if tag
keep  rep78 ttot fmean

/* top totals: highest to smallest */
gsort -ttot, gen(rankt)
/* cumulative total percentages, starting with largest */
gen cumpct = sum(ttot)
replace cumpct= 100*cumpct/cumpct[_N]

list  rankt ttot cumpct rep78

/* top means: highest to smallest */
gsort  -fmean, gen(rankm)
list rankm fmean rep78  if rankm<=`k'

/* Merge back with original data */
merge 1:m rep78 using `t1'
keep if _merge==3
drop _merge

/* sample analyses */

tabdisp rankt if rankt<=`k', c(rep78  cumpct ttot)

svyset _n [pweight = wt]
svy, subpop(if  rankm<=`k'):  reg price trunk i.rep78

Steve Samuels
Statistical Consulting

Stata 14.2

Comment

Anshul Anand

Join Date: May 2015

Posts: 113
#8

05 Jun 2015, 02:31

Thank you very much Sir!!
Comment
