The following data represent two people who each reviewed the same 100 unique t-shirts (uniqueid) and indicated whether each shirt is among their 25 favourite shirts (rank1top25 for person 1 and rank2top25 for person 2).
I want to test whether the proportion of blue shirts (blue) is the same in their top 25 ranking (rank1top25 versus rank2top25).
I then want to test whether the distribution of the unique shirts (uniqueid) is the same in their top 25 ranking (rank1top25 versus rank2top25).
I am not sure how to conduct the first analysis. For the second, I thought a chi-square test might be appropriate, but I am not sure about that either.
Code:
clear
input uniqueid blue rank1top25 rank2top25
1 0 0 0
2 0 0 0
3 1 0 0
4 1 0 0
5 0 0 0
6 0 0 1
7 1 0 1
8 0 1 1
9 0 1 0
10 0 0 0
11 0 0 0
12 0 0 0
13 1 0 1
14 1 0 1
15 0 0 0
16 0 0 0
17 0 1 0
18 1 0 0
19 0 1 1
20 1 0 0
21 0 0 0
22 0 1 0
23 0 0 1
24 1 0 0
25 0 1 0
26 1 0 0
27 1 0 0
28 0 0 0
29 1 0 1
30 0 1 0
31 0 0 0
32 0 0 0
33 0 1 1
34 0 1 0
35 0 0 0
36 0 1 0
37 0 0 1
38 1 0 1
39 0 0 0
40 0 0 1
41 0 0 1
42 0 0 0
43 0 0 0
44 0 0 1
45 0 0 1
46 0 0 0
47 0 0 0
48 1 0 0
49 1 0 0
50 0 1 1
51 0 1 0
52 0 0 1
53 0 0 1
54 1 1 0
55 1 0 1
56 0 0 0
57 0 0 0
58 0 0 1
59 0 0 0
60 1 1 0
61 1 1 0
62 0 0 0
63 0 1 0
64 1 0 0
65 0 1 0
66 0 0 0
67 0 0 0
68 1 0 0
69 0 1 0
70 1 1 0
71 0 1 0
72 0 0 0
73 0 1 0
74 0 0 1
75 0 0 0
76 0 0 1
77 1 0 0
78 0 0 0
79 1 0 0
80 0 0 0
81 0 1 1
82 0 0 0
83 1 0 0
84 0 0 0
85 0 0 0
86 1 0 0
87 0 0 1
88 0 0 1
89 0 1 0
90 1 0 0
91 1 0 0
92 0 1 0
93 0 0 0
94 1 0 0
95 0 0 0
96 0 0 0
97 0 0 0
98 1 0 0
99 0 0 0
100 0 1 0
end