I received the following private query, edited:
Dear Sir:
I got your e-mail address from Statalist. May I ask you one question? I found your explanation about sampling weights on Statalist. It was as follows:
Each person in a selected household shares the selection probability of the household and gets the same sampling weight.
I have one observation per household, with sampling weights at the household level, and I have to report the results at the individual level (specifically, per adult equivalent expenditure per annum), but I only have household total expenditure. I've found some suggestions to multiply the sampling weights by household size in order to analyze household expenditure data at the individual level. May I know whether your suggestion above is in line with these other suggestions?
I am confused about whether I should directly use the sampling weights given (which I believe to be household weights) with the svy command, or whether I should use individual weights / population weights.
It's an interesting question. I replied that I would answer if it were asked on Statalist, but I've decided not to wait. (The post referred to was at: http://www.stata.com/statalist/archi.../msg01516.html). The bottom-line answer is: use the household weights, and use svy: ratio.
One can create a new "individual" weight = hh weight x no. adults, then use svy: mean. However, this is unnecessary and undesirable. The following code shows the two approaches.
Code:
clear
sysuse auto, clear

/* Set up household data with "turn" as the expenditure variable */
gen hhwt = trunk
gen hhid = substr(make,1,2)
egen hhexp = total(turn), by(hhid)    /* household adult expenditures */
egen hhsize = count(turn), by(hhid)   /* number of adults */
bys hhid: keep if _n==1               /* one observation per hh */

/* svyset */
svyset hhid [pweight = hhwt]

/* Estimate average expenditure per person */
svy: ratio av_adult_exp: hhexp/hhsize

Number of strata =       1        Number of obs      =      23
Number of PSUs   =      23        Population size    =     976
                                  Design df          =      22

--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
av_adult_exp |   40.20799   .7347049      38.68431    41.73168
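Now a version that uses a weight equal to HH weight x HH size:
Code:
/* Four statements if you want to revise the weight */
gen av_adult_exp = hhexp/hhsize    /* HH average */
gen new_wt = hhwt*hhsize
svyset hhid [pweight = new_wt]
svy: mean av_adult_exp
The results are the same as those from svy: ratio.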
The second version is not only unnecessary but also undesirable. First, it requires the creation of two extra variables and one extra svyset statement. Second, the analyst will have to explain that the new weight is not a real per-person weight. (If the study did have incomes for individuals, each individual would get the household weight, as I said in the earlier post.)
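For readers who want to see why the two approaches agree, here is a short sketch of the algebra. The symbols are my own notation, not anything from the survey documentation: w_h is the household weight, n_h the number of adults and X_h the total expenditure of household h. The two-household numbers are made up purely for illustration.

```latex
% Ratio estimator with household weights (svy: ratio hhexp/hhsize)
\hat{R} \;=\; \frac{\sum_h w_h X_h}{\sum_h w_h n_h}

% Weighted mean of per-adult expenditure with new weight w_h n_h (svy: mean)
\bar{y}_{\text{new}}
  \;=\; \frac{\sum_h (w_h n_h)\,\dfrac{X_h}{n_h}}{\sum_h w_h n_h}
  \;=\; \frac{\sum_h w_h X_h}{\sum_h w_h n_h}
  \;=\; \hat{R}

% Illustration with two hypothetical households:
%   h=1: w=10, n=2, X=100      h=2: w=20, n=4, X=120
\hat{R} \;=\; \frac{10\cdot 100 + 20\cdot 120}{10\cdot 2 + 20\cdot 4}
        \;=\; \frac{3400}{100} \;=\; 34

\bar{y}_{\text{new}} \;=\; \frac{20\cdot 50 + 80\cdot 30}{20 + 80}
        \;=\; \frac{3400}{100} \;=\; 34

% By contrast, averaging X_h/n_h with the unmodified household weights
% gives a mean per household, not per adult:
\frac{10\cdot 50 + 20\cdot 30}{10 + 20} \;=\; \frac{1100}{30} \;\approx\; 36.67
```

The n_h in the new weight simply cancels the n_h in the denominator of the per-adult variable, so the point estimates are algebraically identical. Because the weighted mean is itself a ratio of two weighted totals, I would expect the linearized standard errors to agree as well, which is consistent with the results reported above.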