  • How to establish overlap of participant IDs when using the tab command

    Dear Statalist,

    I'm trying to find out how much overlap exists in the different responses to a question.
    For example, 91% do not perform an action when it is rational, while only 34% perform an action when it is rational to do so.

    How can I find out if the same participants (by their unique IDs) belong to both 91% and 34%?

    Thank you very much in advance.

  • #2
    I would appreciate a data example. I can't guess at your data structure here or your variable names.



    • #3
      Hi Nick,

      Thank you very much for your reply. Here is the data example:

      Code:
      tab statusquo rational, column
      
      +-------------------+
      | Key               |
      |-------------------|
      |     frequency     |
      | column percentage |
      +-------------------+
      
                 |       rational
       statusquo |         0          1 |     Total
      -----------+----------------------+----------
               0 |        33        466 |       499
                 |     40.74      42.99 |     42.83
      -----------+----------------------+----------
               1 |        48        618 |       666
                 |     59.26      57.01 |     57.17
      -----------+----------------------+----------
           Total |        81      1,084 |     1,165
                 |    100.00     100.00 |    100.00
      This table cross-tabulates rationality (a binary variable) against status quo (also binary; statusquo = 1 when participants confirmed their decision instead of rethinking their answer). As an illustration of how to read it: when it is rational to leave the decision as is (rational = 1, the second column), 57.01% behave rationally and remain in the status quo. When it is not rational to leave the decision as is (rational = 0), 40.74% behave rationally and do not leave the decision as is (i.e., statusquo = 0). Now my question is: how many of the same participants, as identified by their unique ID, are in both 0,0 and 1,1, or how many are in both 0,1 and 1,0?


      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input int CASE float(statusquo rational)
       888 1 1
       981 0 0
       553 0 1
       560 0 1
       471 1 1
       513 1 1
       563 1 1
      1341 1 0
       573 1 0
       524 0 1
       906 0 1
       526 0 1
      1291 0 1
       541 0 1
      1086 1 1
      1161 1 1
       667 0 1
       797 1 1
       594 0 1
       240 0 1
       915 1 1
       615 0 1
       741 0 1
      1351 0 1
       666 1 1
       244 1 1
      1295 1 1
       234 0 1
       482 0 1
      1272 1 1
       689 0 1
       462 1 1
      1165 1 0
       385 1 1
       446 0 1
       187 1 1
       395 1 1
       570 0 1
       760 1 1
       227 0 1
       168 1 1
       224 1 1
       631 1 1
      1254 1 1
      1173 1 1
       171 0 1
       970 0 1
       780 0 1
       649 0 1
       470 1 1
       586 1 1
       534 1 1
      1223 0 1
       375 1 1
       242 0 1
      1258 0 1
       776 1 1
       788 1 1
       740 1 1
       713 0 1
       974 0 1
       507 1 1
       907 1 1
       785 1 1
       232 1 1
       929 1 1
       228 1 1
       988 0 1
      1217 1 1
       500 0 0
       635 0 1
       583 1 1
      1328 1 1
       664 1 1
      1163 1 1
       318 1 1
       710 1 1
      1241 1 1
       723 0 1
       536 1 1
       196 0 1
      1207 0 1
       499 1 1
       489 1 1
       612 0 1
      1347 0 1
       670 1 1
       958 1 1
       475 0 0
       323 0 1
       724 1 1
      1175 0 1
      1255 0 1
       702 1 0
      1232 0 1
       540 0 1
       578 0 1
      1097 1 1
      1235 1 1
       365 1 1
      end
      Thank you very much for your assistance!

      Kind regards,
      Scott



      • #4
        "both in 0,0 and 1,1" and "both in 0,1 and 1,0" I interpret as answered by

        Code:
        count if statusquo == rational 
        
        count if statusquo != rational
        which should yield 651 and 514 respectively, although you'll get more if there are any missing values. Adding the condition

        Code:
        & !missing(statusquo, rational)
        will ignore missings.

        So, those counts are equivalent to adding cells on each diagonal of the table.
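
A minimal sketch of the complete commands with the missing-value condition attached. The toy data below are made up purely for illustration (they are not from the thread) and include one missing value so that the difference is visible:

```stata
* Toy data, invented for illustration only; the last row has a missing statusquo
clear
input byte(statusquo rational)
1 1
0 0
0 1
1 0
. 1
end

* Without the extra condition, the row with a missing value is counted as "not equal"
count if statusquo != rational

* With the condition, rows with any missing value are ignored
count if statusquo == rational & !missing(statusquo, rational)
count if statusquo != rational & !missing(statusquo, rational)
```

On the toy data, the unguarded count of unequal pairs is one larger than the guarded count, because a missing value compared with 1 is treated as unequal.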



        • #5
          Thank you, Nick! Are you saying that the 514 from statusquo != rational are reflected in the 651 from statusquo = rational?

          If not, may I ask how to relate this to the potential overlap of participants, i.e., how can I identify participants that are part of more than one of the four (1,1; 0,0; 1,0; 0,1) groups?

          I was hoping to get a statement like 20% (as an example) of the same individuals are in 0,1 as well as 1,0.

          Thank you so much for your assistance!



          • #6
            Sorry, but I don't understand your questions. If two values are not equal, they can't also be equal. Also, there is no overlap between the subsets (0,0), (0,1), (1,0), (1,1). So, there is no information about frequencies that isn't in the two-way table.



            • #7
              Dear Nick, Thank you for your reply. I'm sorry I did not clarify this: each participant faces four decisions, which I have laid out in the long format. Each decision is one row of data, so I have four rows per participant and, subsequently, four decisions to stay in the status-quo or not.
              Hence my supervisor's question: are some participants irrational in some decisions (staying when it is irrational to do so, or leaving when it is rational to stay) but rational in others (staying when it is rational, or leaving when it is irrational to stay)?

              I would be most grateful for your insights.
              Thank you in advance!



              • #8
                So, how is that repetition manifested in your data structure? The only other variable in #3 is CASE and in that data example a check with

                Code:
                isid CASE
                shows that each occurs just once, and there is no repetition.
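
A small sketch of how repetition would show up in a long layout. The IDs and values below are hypothetical, and the variable names nrows and onepercase are introduced only for this illustration:

```stata
* Toy long-format data with hypothetical IDs; cases 1 and 3 have repeated rows
clear
input int CASE byte statusquo
1 1
1 0
2 1
3 0
3 0
3 1
end

* isid exits with an error if CASE does not uniquely identify the observations;
* capture traps that error so that the return code can be inspected
capture isid CASE
display _rc

* Rows contributed by each case, tabulated once per case
bysort CASE : gen nrows = _N
egen onepercase = tag(CASE)
tabulate nrows if onepercase
```

Here the return code is nonzero because CASE repeats, and the tabulation shows how many cases have one, two or three rows.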



                • #9
                  Hi Nick, Thank you for your reply. I don't know what happened with my dataex example, but when I closed and reloaded the data, this is what I got:

                  CASE is the unique participant identifier. The number of rounds per participant is not always 4: initially there were four decisions per individual, but some decisions were dropped as part of data cleaning.

                  Code:
                  * Example generated by -dataex-. For more info, type help dataex
                  clear
                  input int CASE byte round float(rational statusquo)
                   79 1 1 0
                   79 2 1 0
                   79 3 1 0
                   79 4 1 0
                   80 1 1 1
                   80 2 1 1
                   97 1 0 0
                   97 3 1 0
                   97 4 0 0
                  104 1 1 0
                  104 2 0 1
                  104 3 1 0
                  104 4 1 0
                  109 1 1 0
                  109 4 1 0
                  128 1 1 0
                  128 2 1 0
                  128 3 1 0
                  128 4 1 0
                  129 1 1 0
                  129 2 0 0
                  129 3 0 0
                  129 4 0 0
                  140 1 1 0
                  140 2 1 0
                  140 3 1 0
                  140 4 1 0
                  147 2 1 0
                  147 3 0 0
                  149 1 1 0
                  149 3 1 1
                  149 4 1 1
                  151 2 1 0
                  151 4 1 0
                  155 1 1 0
                  155 2 1 0
                  155 3 0 0
                  155 4 1 0
                  160 1 0 0
                  160 2 1 0
                  160 3 1 0
                  160 4 1 0
                  162 1 1 0
                  162 2 1 0
                  162 3 1 0
                  165 1 0 0
                  165 2 1 0
                  165 4 1 0
                  171 1 1 0
                  171 2 0 0
                  171 3 1 0
                  171 4 0 0
                  172 1 1 0
                  172 2 1 0
                  172 3 1 0
                  172 4 0 0
                  174 1 1 0
                  174 2 1 0
                  174 3 0 0
                  176 1 1 0
                  176 2 0 0
                  176 3 1 0
                  176 4 1 0
                  177 1 1 0
                  177 3 1 0
                  177 4 1 0
                  179 1 1 0
                  179 2 0 0
                  179 3 0 0
                  180 1 1 0
                  180 2 1 0
                  180 3 1 1
                  180 4 1 0
                  181 1 1 0
                  181 2 1 0
                  181 3 1 0
                  181 4 1 0
                  187 1 1 0
                  187 2 1 0
                  187 4 1 0
                  189 1 1 0
                  189 2 0 0
                  189 3 1 0
                  189 4 1 0
                  191 1 0 0
                  191 2 1 0
                  191 3 1 0
                  191 4 1 0
                  193 2 0 0
                  193 3 0 0
                  193 4 1 0
                  194 1 1 0
                  194 2 1 0
                  194 3 1 0
                  194 4 1 0
                  198 1 1 0
                  198 2 1 0
                  198 3 1 0
                  198 4 1 0
                  201 1 1 0
                  end
                  I'm very grateful for your assistance!
                  Kind regards,
                  Scott



                  • #10
                    Thanks for the detail. There are at least three possibilities.

                    1. Each case's history can be represented concisely as a string profile so case 79 is 10101010 (or 11110000 if you prefer). That raises the slightly alarming prospect of 2^8 = 256 distinct profiles, although they may not all occur, and they are unlikely to be equally frequent either.

                    Constructing such a profile is routine manipulation, but the details are spelled out at https://journals.sagepub.com/doi/pdf...36867X20909698

                    2. We can count how many combinations each case chose, as a measure of inconsistency.

                    3. We can reduce the latter to staying the same or changing ever.

                    As you'll realise, there needs to be some care over what is done with cases with just 1, 2 or 3 rounds AND over whether you're counting observations or cases.

                    Some technique follows:

                    Code:
                    * Example generated by -dataex-. For more info, type help dataex
                    clear
                    input int CASE byte round float(rational statusquo)
                     79 1 1 0
                     79 2 1 0
                     79 3 1 0
                     79 4 1 0
                     80 1 1 1
                     80 2 1 1
                     97 1 0 0
                     97 3 1 0
                     97 4 0 0
                    104 1 1 0
                    104 2 0 1
                    104 3 1 0
                    104 4 1 0
                    109 1 1 0
                    109 4 1 0
                    128 1 1 0
                    128 2 1 0
                    128 3 1 0
                    128 4 1 0
                    129 1 1 0
                    129 2 0 0
                    129 3 0 0
                    129 4 0 0
                    140 1 1 0
                    140 2 1 0
                    140 3 1 0
                    140 4 1 0
                    147 2 1 0
                    147 3 0 0
                    149 1 1 0
                    149 3 1 1
                    149 4 1 1
                    151 2 1 0
                    151 4 1 0
                    155 1 1 0
                    155 2 1 0
                    155 3 0 0
                    155 4 1 0
                    160 1 0 0
                    160 2 1 0
                    160 3 1 0
                    160 4 1 0
                    162 1 1 0
                    162 2 1 0
                    162 3 1 0
                    165 1 0 0
                    165 2 1 0
                    165 4 1 0
                    171 1 1 0
                    171 2 0 0
                    171 3 1 0
                    171 4 0 0
                    172 1 1 0
                    172 2 1 0
                    172 3 1 0
                    172 4 0 0
                    174 1 1 0
                    174 2 1 0
                    174 3 0 0
                    176 1 1 0
                    176 2 0 0
                    176 3 1 0
                    176 4 1 0
                    177 1 1 0
                    177 3 1 0
                    177 4 1 0
                    179 1 1 0
                    179 2 0 0
                    179 3 0 0
                    180 1 1 0
                    180 2 1 0
                    180 3 1 1
                    180 4 1 0
                    181 1 1 0
                    181 2 1 0
                    181 3 1 0
                    181 4 1 0
                    187 1 1 0
                    187 2 1 0
                    187 4 1 0
                    189 1 1 0
                    189 2 0 0
                    189 3 1 0
                    189 4 1 0
                    191 1 0 0
                    191 2 1 0
                    191 3 1 0
                    191 4 1 0
                    193 2 0 0
                    193 3 0 0
                    193 4 1 0
                    194 1 1 0
                    194 2 1 0
                    194 3 1 0
                    194 4 1 0
                    198 1 1 0
                    198 2 1 0
                    198 3 1 0
                    198 4 1 0
                    end
                    
                    * Build each case's profile string: rational then statusquo, round by round
                    gen profile = strofreal(rational) + strofreal(statusquo) if round == 1
                    bysort CASE (round) : replace profile = profile[_n-1] + strofreal(rational) + strofreal(statusquo) if round > 1
                    by CASE : replace profile = profile[_N]

                    * Count how often each profile occurs among cases with all four rounds
                    bysort CASE : gen nrounds = _N
                    bysort profile : egen pcount = total(nrounds == 4 & round == 4)

                    graph hbar pcount if nrounds == 4 & round == 4, over(profile, sort(1) descending) ysc(alt) ytitle(Instances of profile)

                    * Count the distinct (rational, statusquo) combinations each case shows
                    egen group = group(rational statusquo)
                    egen tag = tag(CASE group)
                    bysort CASE: egen nchoices = total(tag)
                    gen changed = nchoices > 1

                    tabdisp CASE if nrounds == 4 & round == 4, c(profile nchoices changed)
                    
                    ----------------------------------------------
                         CASE |    profile    nchoices     changed
                    ----------+-----------------------------------
                           79 |   10101010           1           0
                          104 |   10011010           2           1
                          128 |   10101010           1           0
                          129 |   10000000           2           1
                          140 |   10101010           1           0
                          155 |   10100010           2           1
                          160 |   00101010           2           1
                          171 |   10001000           2           1
                          172 |   10101000           2           1
                          176 |   10001010           2           1
                          180 |   10101110           2           1
                          181 |   10101010           1           0
                          189 |   10001010           2           1
                          191 |   00101010           2           1
                          194 |   10101010           1           0
                          198 |   10101010           1           0
                    ----------------------------------------------
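
A further sketch, closer to the supervisor's question in #7: flag cases that behave rationally in some rounds and irrationally in others, and count cases rather than observations. It assumes, following the reading of the table in #3, that a decision is "rational" when statusquo equals rational. The variable names rat_choice, nrat, nobs, mixed and onepercase are introduced only for this illustration, and the data are a subset of #9:

```stata
* Subset of the data in #9 (cases 79, 80, 104 and 129 only)
clear
input int CASE byte round float(rational statusquo)
 79 1 1 0
 79 2 1 0
 79 3 1 0
 79 4 1 0
 80 1 1 1
 80 2 1 1
104 1 1 0
104 2 0 1
104 3 1 0
104 4 1 0
129 1 1 0
129 2 0 0
129 3 0 0
129 4 0 0
end

* Assumption: a decision counts as rational when statusquo equals rational
gen byte rat_choice = statusquo == rational if !missing(statusquo, rational)

* Rational decisions and usable decisions per case
bysort CASE : egen nrat = total(rat_choice)
bysort CASE : egen nobs = count(rat_choice)

* mixed is 1 if a case was rational in some rounds and irrational in others
gen byte mixed = nrat > 0 & nrat < nobs

* Tag one observation per case so that cases, not observations, are counted
egen onepercase = tag(CASE)
tabulate mixed if onepercase
tabulate nobs mixed if onepercase
```

Note that this measure differs from changed: case 104 shows two different combinations (1,0 and 0,1), but both are irrational under the assumption above, so it is not flagged as mixed. Of these four cases, only 129 is. The second tabulation splits the result by the number of rounds available, which matters for cases with fewer than four rounds.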



                    • #11
                      Thank you very much! I appreciate the detailed code!
