Count Frequency of two observations are the same and equal to a certain value

Jinruii Pan

Join Date: Mar 2025

Posts: 3
#1

21 Mar 2025, 11:23

Hello, I am trying to count how often two people who are paired up both choose the number 6 in q.
In the example below, subject 73 and subject 116 are paired up for 10 rounds. I generated the variable y to indicate that they are paired up. I would like to find the number of times they both chose 6 in variable q, as they do, for example, in rounds 2, 3, 7, 8, and 9. How can I generate a dummy variable indicating that this is the case, please?

Many thanks!

*Example generated by -dataex-. For more info, type help dataex
clear
input float id byte(round id_in_group_cournot q) int y
73 1 1 7 1
116 1 2 6 1
73 2 1 6 2
116 2 2 6 2
73 3 1 6 3
116 3 2 6 3
73 4 1 6 4
116 4 2 7 4
73 5 1 7 5
116 5 2 6 5
73 6 1 6 6
116 6 2 7 6
73 7 1 6 7
116 7 2 6 7
73 8 1 6 8
116 8 2 6 8
73 9 1 6 9
116 9 2 6 9
73 10 1 7 10
116 10 2 8 10
end
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#2

21 Mar 2025, 12:07

This is one of those unusual situations where a wide layout would work better.

Code:

by round (id), sort: assert _N == 2
by round (id): gen seq = _n
ds seq round, not
reshape wide `r(varlist)', i(round) j(seq)
by id1 id2 (round), sort: egen wanted = total(q1 == 6 & q2 == 6)

Note: I assume in this code that when you move to another pair of ids, the variable round does not restart at 1 but continues in sequence*. If that is not the case, the first -assert- command will terminate execution with an error message. In that case, do not proceed, as the code will give incorrect results.

*Or perhaps the variable y does that--in your example y is always the same as round. If that's the case, then use y instead of round throughout the above code. If you have no variable that identifies the pairings (as round appears to in the example), you need to create one. If you want help doing that, please post back with an example that contains several different pairings and represents the various ways in which the data switches from one pairing to another.

Last edited by Clyde Schechter; 21 Mar 2025, 12:24.
Jinruii Pan

Join Date: Mar 2025

Posts: 3
#3

21 Mar 2025, 13:19

Thank you for your reply. Yes, I do have a variable that identifies the pairings. It was y in my previous example; I have renamed it "pair_id" in the example below. There are 10 rounds in total. The example below has 40 pairs.

* Example generated by -dataex-. For more info, type help dataex
clear
input float id byte(round id_in_pair q) int pair_id
73 1 1 7 1
116 1 2 6 1
73 2 1 6 2
116 2 2 6 2
73 3 1 6 3
116 3 2 6 3
73 4 1 6 4
116 4 2 7 4
73 5 1 7 5
116 5 2 6 5
73 6 1 6 6
116 6 2 7 6
73 7 1 6 7
116 7 2 6 7
73 8 1 6 8
116 8 2 6 8
73 9 1 6 9
116 9 2 6 9
73 10 1 7 10
116 10 2 8 10
59 1 2 12 11
106 1 1 8 11
59 2 2 8 12
106 2 1 6 12
59 3 2 8 13
106 3 1 9 13
59 4 2 9 14
106 4 1 7 14
59 5 2 8 15
106 5 1 12 15
59 6 2 12 16
106 6 1 0 16
59 7 2 8 17
106 7 1 12 17
59 8 2 9 18
106 8 1 12 18
59 9 2 12 19
106 9 1 12 19
59 10 2 9 20
106 10 1 12 20
40 1 1 6 21
99 1 2 9 21
40 2 1 6 22
99 2 2 9 22
40 3 1 12 23
99 3 2 8 23
40 4 1 8 24
99 4 2 7 24
40 5 1 6 25
99 5 2 9 25
40 6 1 6 26
99 6 2 8 26
40 7 1 9 27
99 7 2 9 27
40 8 1 9 28
99 8 2 8 28
40 9 1 9 29
99 9 2 9 29
40 10 1 12 30
99 10 2 8 30
24 1 1 12 31
36 1 2 6 31
24 2 1 12 32
36 2 2 6 32
24 3 1 9 33
36 3 2 12 33
24 4 1 8 34
36 4 2 12 34
24 5 1 12 35
36 5 2 12 35
24 6 1 9 36
36 6 2 12 36
24 7 1 8 37
36 7 2 12 37
24 8 1 8 38
36 8 2 12 38
24 9 1 8 39
36 9 2 9 39
24 10 1 12 40
36 10 2 12 40
end
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#4

21 Mar 2025, 14:03

OK, the variable pair_id is, I think, misnamed, because the same pair is associated with many different values of pair_id. Nevertheless, it does serve the purpose called for in #2. With this data in hand, it is just a simple modification of the earlier code:

Code:

by pair_id (round), sort: assert round[1] == round[_N]
by pair_id (id), sort: assert _N == 2
ds pair_id id_in_pair round , not
reshape wide `r(varlist)', i(pair_id) j(id_in_pair)
by id1 id2 (round), sort: egen wanted = total(q1 == 6 & q2 == 6)
sort pair_id round
order id2, after(id1)
order round, after(pair_id)

Again, if either of the -assert- commands at the beginning halts with an error message, then the data are not suitable for this code and you should not proceed. In that case, post back with new example data that reproduces whatever error you are encountering. (The first command verifies that round is consistent within pair_id, and the second verifies that each pair_id is truly a pair, with exactly two observations belonging to it.)

One other caution when running this code. Because the -reshape- command draws on `r(varlist)', you cannot run this code one line at a time. You should run it uninterrupted from top to bottom. At the very least, it is critical that there be no interruption between the -ds pair_id..., not- command and the -reshape ...- command.
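
For illustration only, here is a minimal sketch of a variant that does not rely on `r(varlist)' at all. It assumes the data contain exactly the five variables shown in the example in #3 (id, round, id_in_pair, q, pair_id), so that id and q are the only variables that need to be reshaped; with additional variables in the real dataset, that list would have to be extended. The per-round dummy both_6 is a name introduced here for the sketch.

```stata
* Assumes the example data from #3 are in memory, with variables
* id, round, id_in_pair, q, and pair_id (and no others).
by pair_id (round), sort: assert round[1] == round[_N]
by pair_id (id), sort: assert _N == 2

* id and q are the only variables that vary within pair_id,
* so they are named explicitly instead of using r(varlist)
reshape wide id q, i(pair_id) j(id_in_pair)

* per-round dummy: 1 if both members chose 6 in that round
gen byte both_6 = (q1 == 6 & q2 == 6)

* number of rounds in which this pair of ids both chose 6
by id1 id2 (round), sort: egen wanted = total(both_6)
sort pair_id round
```

Because the variables are spelled out, these commands can be run one at a time, at the cost of having to edit the -reshape- line whenever the set of variables changes.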
Hemanshu Kumar

Join Date: Mar 2015

Posts: 1411
#5

21 Mar 2025, 23:04

You should also be able to do it without reshaping, simply with

Code:

bys pair_id: gen byte both_6 = (q[1] == 6 & q[2] == 6)
egen wanted = total(both_6), by(id)
drop both_6

I am assuming here that any individual id appears in only one pairing (not pair_id but in a pair with only the same other id). This is true in your extract, but let me know if this is not the case in your overall dataset. One way to check that is to see if the assertion in the following code holds:

Code:

egen id_low = min(id), by(pair_id)
egen id_hi = max(id), by(pair_id)
bys id_low (id_hi): assert id_hi[1] == id_hi[_N]
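
As a quick sanity check, the no-reshape approach can be tried on a small subset of the example data from #3 (the first three rounds of ids 73 and 116, and the first two rounds of ids 59 and 106). This is only a sketch on a hand-picked extract, not the full dataset; both_6 is kept here, rather than dropped, so that it can serve as the per-round dummy.

```stata
clear
input float id byte(round id_in_pair q) int pair_id
73 1 1 7 1
116 1 2 6 1
73 2 1 6 2
116 2 2 6 2
73 3 1 6 3
116 3 2 6 3
59 1 2 12 11
106 1 1 8 11
59 2 2 8 12
106 2 1 6 12
end

* per-round dummy: 1 when both observations in the pair_id have q == 6
bysort pair_id: gen byte both_6 = (q[1] == 6 & q[2] == 6)

* number of rounds in which each id and its partner both chose 6
egen wanted = total(both_6), by(id)

* in this extract, 73 and 116 both chose 6 in rounds 2 and 3 only
assert wanted == 2 if inlist(id, 73, 116)
assert wanted == 0 if inlist(id, 59, 106)

list, sepby(pair_id)
```

The condition on q[1] and q[2] is symmetric, so the order of the two observations within pair_id does not matter.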

Last edited by Hemanshu Kumar; 21 Mar 2025, 23:36.