Weighted average using Collapse command

Hanna Lindstrom

Join Date: Apr 2017

Posts: 25
#1

10 Aug 2017, 00:59

Dear Stata users,

I have panel data for 290 cities belonging to 21 regions, with city-level data on cost shares and population size. I want to compute the regional average cost share, weighting each city's cost share by its population. The data look something like this:

city  cost_share  pop_city  region  year
1     10          100000    NE      1
2     15          10000     NE      2
3     5           5000      SE      1
4     8           6000      SE      2
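
For reference, the quantity being asked for can be written out explicitly. The notation here is introduced only for illustration and is not part of the original post: c_i is a city's cost_share, p_i is its pop_city, and (r,t) is one region-year cell. In the four sample rows each region-year cell holds a single city, so the weighted and unweighted means coincide there. The second line therefore assumes, hypothetically, that cities 1 and 2 fall in the same cell.

```latex
\[
\bar{c}_{r,t} \;=\; \frac{\sum_{i \in (r,t)} p_i \, c_i}{\sum_{i \in (r,t)} p_i}
\]
% Hypothetical cell containing cities 1 and 2 (cost shares 10 and 15,
% populations 100000 and 10000):
\[
\frac{10 \cdot 100000 + 15 \cdot 10000}{100000 + 10000}
  \;=\; \frac{1150000}{110000} \;\approx\; 10.45
\]
```

The unweighted mean of the same two cities would be 12.5, so the larger city pulls the weighted figure toward its own value of 10.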

I have read the collapse manual entry (http://www.stata.com/manuals13/dcollapse.pdf), but I am not sure I understand it correctly. It says (p. 6) that "Weight normalization affects only the sum, count, sd, semean, and sebinomial statistics." On p. 7, example 4 shows a weighted mean in a setting similar to mine:

. collapse (mean) age income (median) medage=age medinc=income (rawsum) pop [aweight=pop], by(region)

Is it possible to do what I want using the following code?

collapse (mean) cost_share [aweight=pop_city], by(region year)

It works in the sense that it gives me a region-level variable that differs from the one I get when specifying

collapse (mean) cost_share, by(region year)

But I would like to check with someone that it is actually computing the population-weighted mean I am looking for.

Best regards,
Hanna L
Jesse Wursten

Join Date: Jan 2016

Posts: 915
#2

10 Aug 2017, 03:20

In cases like these, I always create a small dummy dataset with easy-to-calculate values and test whether I get the result I want.

Code:

clear
set obs 10
gen x = _n
gen weight = runiformint(0,1)
sum x [aweight = weight]
collapse (mean) x [aweight = weight]
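
The same kind of check can be sketched with the variable names from the question. This is a minimal illustration under stated assumptions, not a definitive test: the toy rows are modified from the original post so that each region-year cell contains two cities, and the helper variables num, den and wmean_manual are hypothetical names introduced here.

```stata
clear
* Toy data: all cities placed in year 1 (an assumption for illustration)
input city cost_share pop_city str2 region year
1 10 100000 "NE" 1
2 15  10000 "NE" 1
3  5   5000 "SE" 1
4  8   6000 "SE" 1
end

* Manual population-weighted mean per region-year: sum(w*x) / sum(w)
bysort region year: egen double num = total(cost_share * pop_city)
bysort region year: egen double den = total(pop_city)
generate double wmean_manual = num / den

* collapse with aweights; wmean_manual is constant within each cell,
* so its (weighted) mean simply carries the manual value through
collapse (mean) cost_share wmean_manual [aweight = pop_city], by(region year)

* The two columns should agree up to floating-point error
assert reldif(cost_share, wmean_manual) < 1e-6
list
```

If the assert passes silently, the aweighted mean from collapse matches the hand-computed ratio for every region-year cell in the toy data.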
Hanna Lindstrom

Join Date: Apr 2017

Posts: 25
#3

10 Aug 2017, 08:19

Dear Jesse,
Thanks for your advice! I ran both the collapse and the egen code, respectively, on a smaller data set, and I saw straight away how they work.

Cheers,
-Hanna