Count frequencies of (binary) variable and merge these to original dataset

Castor Comploj

Join Date: Mar 2021

Posts: 92
#1

16 Jun 2022, 01:56

The following provides one solution to the problem, but it seems unnecessarily complicated to me. I would appreciate it if fellow Stata users could suggest a simplification.
The dataset is the following:
ID  communityID  treat
1   1            1
2   1            1
3   1            0
4   1            1
5   2            0
6   2            1

I count the frequency of positive treatment (treat==1) per community and add it to the original dataset as a new variable, which is constant within each communityID.

preserve
contract communityID treat, zero nomiss
drop if treat==0 // IN ORDER TO HAVE m:1 MERGE
save treat.dta, replace
restore
merge m:1 communityID using treat.dta

The contracted data (the output of contract, before the treat==0 rows are dropped) contain the collapsed information:
communityID  treat  _freq
1            1      3
1            0      1
2            1      1
2            0      1

What is the best way of solving this problem without saving and merging data?
Ulrich Wohak

Join Date: Jun 2016

Posts: 25
#2

16 Jun 2022, 02:01

You could just create a counter and then collapse your data set, for example:

Code:

gen _freq = 1
collapse (sum) _freq, by(communityID treat)
Castor Comploj

Join Date: Mar 2021

Posts: 92
#3

16 Jun 2022, 03:02

This does not solve the same problem. I apologize if I was unclear above.

My intention is to have the count of treat==1 (the number of treated individuals), for each communityID, as a constant value (for each communityID, no matter if treat==0 or treat==1) in a new column of the original dataset.

I do not want a collapsed dataset.
Ulrich Wohak

Join Date: Jun 2016

Posts: 25
#4

16 Jun 2022, 03:08

I'm a bit confused. Is the second data set you shared your desired output or not? It would be helpful if you could show us what the desired data set should look like.
Maybe this will work for you:

Code:

bysort communityID: egen _freq = total(treat)

or similarly

Code:

egen _freq = total(treat), by(communityID)
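
To make this concrete, here is a minimal sketch that rebuilds the six example observations from post #1 and applies the bysort/egen approach. The variable name n_treated is only an illustrative choice (not from the original posts), and the sketch assumes treat is coded strictly 0/1, so that the sum of treat equals the number of treated individuals.

Code:

* rebuild the example data from post #1
clear
input ID communityID treat
1 1 1
2 1 1
3 1 0
4 1 1
5 2 0
6 2 1
end

* sum of treat within each community = number of treated individuals;
* the result is repeated on every row of that community
* (n_treated is a hypothetical name chosen for this sketch)
bysort communityID: egen n_treated = total(treat)

* all six rows are kept; nothing is collapsed, saved, or merged
list, sepby(communityID)

With these data, n_treated is 3 on every row of communityID 1 and 1 on every row of communityID 2, including the rows where treat==0. Note that egen's total() treats missing values as zero, so observations with missing treat would simply not add to the count.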
Castor Comploj

Join Date: Mar 2021

Posts: 92
#5

17 Jun 2022, 03:22

The first solves it.
