Dear Stata users,
I have panel data for 290 cities belonging to 21 regions. I have city-level data on cost shares and population size. I want to compute the regional average cost share, weighting each city's cost share by its population size. The data look something like this:
city | cost_share | pop_city | region | year
   1 |         10 |   100000 | NE     |    1
   2 |         15 |    10000 | NE     |    2
   3 |          5 |     5000 | SE     |    1
   4 |          8 |     6000 | SE     |    2
I have read the collapse manual entry (http://www.stata.com/manuals13/dcollapse.pdf), but I am not sure I understand it correctly. It says (p. 6) that "Weight normalization affects only the sum, count, sd, semean, and sebinomial statistics." On p. 7, example 4 shows a weighted mean in a setting similar to mine:
. collapse (mean) age income (median) medage=age medinc=income (rawsum) pop
> [aweight=pop], by(region)
Is it possible to do what I want using the following code?
collapse (mean) cost_share [aweight=pop_city], by(region year)
It works in the sense that it gives me a region-level variable whose values differ from what I get when specifying
collapse (mean) cost_share, by(region year)
But I would just like to check with someone that it is actually performing what I am looking for.
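To make clear what I mean by a population-weighted average: for each region and year I want sum(pop_city * cost_share) / sum(pop_city), taken over the cities in that region. A minimal sketch of how this could be computed by hand for comparison with the collapse output, assuming the variable names shown above (wcost and wmean_manual are just names I made up for illustration):

```stata
* Manual population-weighted mean by region and year (sketch for comparison)
preserve
* keep only observations that can enter a weighted mean
drop if missing(cost_share, pop_city)
* numerator term: each city's cost share times its population
generate double wcost = cost_share * pop_city
* plain (unweighted) sums of the numerator terms and of the weights
collapse (sum) wcost pop_city, by(region year)
* weighted mean = sum of weighted values / sum of weights
generate double wmean_manual = wcost / pop_city
list region year wmean_manual, sepby(region)
restore
```

If the aweight version does what I hope, wmean_manual should match the cost_share values it returns, up to rounding.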
Best regards,
Hanna L