  • Count how often two observations are the same and equal to a certain value

    Hello, I am trying to count how often two people who are paired up both choose the number 6 in q.
    In the example below, subject 73 and subject 116 are paired up for 10 rounds. I generated the variable y to indicate that they are paired up. I would like to find the number of times they both chose 6 in variable q, as they did in rounds 2, 3, 7, 8 and 9. How can I generate a dummy variable indicating this, please?

    Many thanks!

    *Example generated by -dataex-. For more info, type help dataex
    clear
    input float id byte(round id_in_group_cournot q) int y
    73 1 1 7 1
    116 1 2 6 1
    73 2 1 6 2
    116 2 2 6 2
    73 3 1 6 3
    116 3 2 6 3
    73 4 1 6 4
    116 4 2 7 4
    73 5 1 7 5
    116 5 2 6 5
    73 6 1 6 6
    116 6 2 7 6
    73 7 1 6 7
    116 7 2 6 7
    73 8 1 6 8
    116 8 2 6 8
    73 9 1 6 9
    116 9 2 6 9
    73 10 1 7 10
    116 10 2 8 10
    end


  • #2
    This is one of those unusual situations where a wide layout would work better.
    Code:
    by round (id), sort: assert _N == 2
    by round (id): gen seq = _n
    ds seq round, not
    reshape wide `r(varlist)', i(round) j(seq)
    by id1 id2 (round), sort: egen wanted = total(q1 == 6 & q2 == 6)
    Note: This code assumes that when you move to another pair of ids, the variable round does not restart at 1 but continues in sequence*. If that is not the case, the first -assert- command will terminate execution with an error message. In that case, do not proceed, as the code will give incorrect results.

    *Or perhaps the variable y does that--in your example y is always the same as round. If that's the case, then use y instead of round throughout the above code. If you have no variable that identifies the pairings (as round appears to in the example), you need to create one. If you want help doing that, please post back with an example that contains several different pairings and represents the various ways in which the data switches from one pairing to another.
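
    The question in #1 also asks for a dummy variable marking the rounds in which both subjects chose 6. A minimal sketch of that in the wide layout, using only rounds 1 to 3 of the example in #1 and spelling out the reshape variables by hand; the variable name both6 is an illustrative choice, not something from the thread:

```stata
* Rounds 1-3 of the example in #1 (subjects 73 and 116)
clear
input float id byte(round q)
73 1 7
116 1 6
73 2 6
116 2 6
73 3 6
116 3 6
end

* One observation per round, with id1/id2 and q1/q2 side by side
by round (id), sort: gen seq = _n
reshape wide id q, i(round) j(seq)

* Per-round indicator: 1 if both members of the pair chose 6, else 0
gen byte both6 = (q1 == 6 & q2 == 6)

* Number of such rounds for each pair
by id1 id2 (round), sort: egen wanted = total(both6)
list round id1 id2 q1 q2 both6 wanted, noobs
```

    In this three-round subset, both6 is 0 in round 1 and 1 in rounds 2 and 3, so wanted is 2 for the pair.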
    Last edited by Clyde Schechter; 21 Mar 2025, 12:24.



    • #3
      Thank you for your reply. Yes, I do have a variable that identifies the pairings. It was y in my previous example; I have renamed it to "pair_id" in the example below. There are 10 rounds in total. The example below has 40 pairs.


      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float id byte(round id_in_pair q) int pair_id
      73 1 1 7 1
      116 1 2 6 1
      73 2 1 6 2
      116 2 2 6 2
      73 3 1 6 3
      116 3 2 6 3
      73 4 1 6 4
      116 4 2 7 4
      73 5 1 7 5
      116 5 2 6 5
      73 6 1 6 6
      116 6 2 7 6
      73 7 1 6 7
      116 7 2 6 7
      73 8 1 6 8
      116 8 2 6 8
      73 9 1 6 9
      116 9 2 6 9
      73 10 1 7 10
      116 10 2 8 10
      59 1 2 12 11
      106 1 1 8 11
      59 2 2 8 12
      106 2 1 6 12
      59 3 2 8 13
      106 3 1 9 13
      59 4 2 9 14
      106 4 1 7 14
      59 5 2 8 15
      106 5 1 12 15
      59 6 2 12 16
      106 6 1 0 16
      59 7 2 8 17
      106 7 1 12 17
      59 8 2 9 18
      106 8 1 12 18
      59 9 2 12 19
      106 9 1 12 19
      59 10 2 9 20
      106 10 1 12 20
      40 1 1 6 21
      99 1 2 9 21
      40 2 1 6 22
      99 2 2 9 22
      40 3 1 12 23
      99 3 2 8 23
      40 4 1 8 24
      99 4 2 7 24
      40 5 1 6 25
      99 5 2 9 25
      40 6 1 6 26
      99 6 2 8 26
      40 7 1 9 27
      99 7 2 9 27
      40 8 1 9 28
      99 8 2 8 28
      40 9 1 9 29
      99 9 2 9 29
      40 10 1 12 30
      99 10 2 8 30
      24 1 1 12 31
      36 1 2 6 31
      24 2 1 12 32
      36 2 2 6 32
      24 3 1 9 33
      36 3 2 12 33
      24 4 1 8 34
      36 4 2 12 34
      24 5 1 12 35
      36 5 2 12 35
      24 6 1 9 36
      36 6 2 12 36
      24 7 1 8 37
      36 7 2 12 37
      24 8 1 8 38
      36 8 2 12 38
      24 9 1 8 39
      36 9 2 9 39
      24 10 1 12 40
      36 10 2 12 40
      end



      • #4
        OK, the variable pair_id is, I think, misnamed, because the same pair is associated with many different values of pair_id. Nevertheless, it does serve the purpose called for in #2. With this data in hand, it is just a simple modification of the earlier code:
        Code:
        by pair_id (round), sort: assert round[1] == round[_N]
        by pair_id (id), sort: assert _N == 2
        ds pair_id id_in_pair round , not
        reshape wide `r(varlist)', i(pair_id) j(id_in_pair)
        by id1 id2 (round), sort: egen wanted = total(q1 == 6 & q2 == 6)
        sort pair_id round
        order id2, after(id1)
        order round, after(pair_id)
        Again, if either of the -assert- commands at the beginning halts with an error message, then the data are not suitable for this code and you should not proceed. In that case, post back with new example data that reproduces whatever error you are encountering. (The first command verifies that round is consistent within pair_id, and the second verifies that each pair_id is truly a pair, with exactly two observations belonging to it.)

        One other caution when running this code. Because the -reshape- command draws on `r(varlist)', you cannot run this code one line at a time. You should run it uninterrupted from top to bottom. At the very least, it is critical that there be no interruption between the -ds pair_id..., not- command and the -reshape ...- command.
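
        One way to sidestep the dependence on `r(varlist)' is to name the reshape variables explicitly. A minimal sketch, assuming the dataset holds only the variables in the example, so that id and q are the only ones that differ within pair_id; it uses the first two rounds of two of the pairs from #3:

```stata
* Rounds 1-2 of subjects 73/116 and 59/106 from the example in #3
clear
input float id byte(round id_in_pair q) int pair_id
73 1 1 7 1
116 1 2 6 1
73 2 1 6 2
116 2 2 6 2
59 1 2 12 11
106 1 1 8 11
59 2 2 8 12
106 2 1 6 12
end

* Same safety checks as in the code above
by pair_id (round), sort: assert round[1] == round[_N]
by pair_id (id), sort: assert _N == 2

* Explicit variable list: no stored results from -ds- are needed
reshape wide id q, i(pair_id) j(id_in_pair)
by id1 id2 (round), sort: egen wanted = total(q1 == 6 & q2 == 6)
sort pair_id round
```

        With a real dataset that has more variables, each variable that differs between the two members of a pair would have to be added to the list by hand, which is the convenience that -ds- provides in the original code.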



        • #5
          You should also be able to do it without reshaping, simply with

          Code:
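          * flag both observations of a pair_id when both members chose 6
          bys pair_id: gen byte both_6 = (q[1] == 6 & q[2] == 6)
          * count the flagged rounds for each subject
          egen wanted = total(both_6), by(id)
          drop both_6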
          I am assuming here that any individual id appears in only one pairing (meaning not a single pair_id, but always paired with the same other id). This is true in your extract, but let me know if this is not the case in your overall dataset. One way to check that is to see if the assertion in the following code holds:

          Code:
          egen id_low = min(id), by(pair_id)
          egen id_hi = max(id), by(pair_id)
          bys id_low (id_hi): assert id_hi[1] == id_hi[_N]
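
          The check above compares the lower id with the higher id within each pair_id, so it would not flag a subject who is the higher id in one pairing and the lower id in another. A sketch of a stricter variant, assuming exactly two observations per pair_id; the toy data and the names partner and check_rc are made up for illustration:

```stata
* Hypothetical toy data: made-up subject 100 is the higher id in one
* pairing (with 50) and the lower id in another (with 150)
clear
input float id int pair_id
50 1
100 1
100 2
150 2
end

* partner = id of the other member of the same pair_id
* (assumes exactly two observations per pair_id)
bysort pair_id (id): gen partner = cond(_n == 1, id[2], id[1])

* Every subject should have a single partner across all pair_ids;
* capture keeps a failed assertion from halting the do-file
capture bysort id (partner): assert partner[1] == partner[_N]
scalar check_rc = _rc
display "return code: " check_rc
```

          A nonzero return code here means at least one subject has more than one partner, in which case totalling by id alone would mix counts from different pairings.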
          Last edited by Hemanshu Kumar; 21 Mar 2025, 23:36.

