  • Kappa for many raters

Hello,
I have asked two groups the same question.
The question can be answered a, b, c, d, e, f, g, h, i, or j.
There are 100 unique raters in one group and 86 in the other.
In both groups 40% answered a and 40% answered b; the remaining 20% in each group answered c through j.

I would like to test whether the two groups are in agreement, so I thought of using the kappa statistic.

But how do I do that in Stata when I have many raters and only one variable?

Thank you,
    Lars

  • #2
    Hello Lars,

Please take a look at this (http://www.stata.com/manuals13/rkappa.pdf). I gather examples 6 to 9 may help you; a rough sketch of the frequency-of-ratings layout used there follows below.

    Best regards,

    Marcos
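
A minimal sketch of that layout, with made-up variable names and counts, and only four of the ten answer categories shown: one observation per question, one variable per answer category holding the number of raters in a group who chose it.

* toy frequency-format data: one row per question, one variable per
* answer category with the count of raters choosing that category
* (variable names and counts are hypothetical)
clear
input question n_a n_b n_c n_d
1 40 40 3 3
2 35 38 8 5
3 42 39 2 3
end

* kappa for nonunique raters, frequency-of-ratings syntax ([R] kappa)
kappa n_a n_b n_c n_d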



    • #3
I do not believe agreement can be measured or tested at all if there is only one rating (one question, in this particular example).

      Best
      Daniel



      • #4
You may be right, Daniel;
I have asked them several questions.
So perhaps a chi-squared test would be better (see the sketch below).
        Lars
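
For instance, if the answers from both groups are stacked in long form, one row per rater, with a group indicator and the chosen answer (variable names and values below are made up), Pearson's chi-squared test of independence could be requested like this:

* toy long-format data: one row per rater, recording group and answer
* (names and values are hypothetical)
clear
input group str1 answer
1 "a"
1 "a"
1 "b"
1 "c"
2 "a"
2 "b"
2 "b"
2 "d"
end

* cross-tabulate group against answer and request the chi-squared test
tabulate group answer, chi2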



        • #5
With more than one question, you should be able to calculate agreement. Marcos pointed to example 9 in the help. You will not get test statistics, but you can calculate kappa.

I have no clue what your research question might be, and I am all but an expert, but for a conceptually different approach, testing reliability, see Hayes and Krippendorff (2007), Krippendorff (2004), and Krippendorff (2011). You can download the implementations, kalpha or krippalpha, both from SSC; installation is sketched after the references below.

          Best
          Daniel

Hayes, Andrew F., and Krippendorff, Klaus (2007). Answering the call for a standard reliability measure for coding data. Communication Methods and Measures, 1, pp. 77-89.

Krippendorff, Klaus (2004). Reliability in content analysis: Some common misconceptions and recommendations. Human Communication Research, 30(3), pp. 411-433.

Krippendorff, Klaus (2011). Agreement and information in the reliability of coding. Communication Methods and Measures, 5(2), pp. 93-112.
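
The commands mentioned above can be installed from SSC; a minimal sketch (see each command's help file for the expected data layout and options):

* install the user-written Krippendorff's alpha implementations from SSC
ssc install kalpha
ssc install krippalpha

* the help files describe the required data layout and available options
help kalpha
help krippalpha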



          • #6
I think it may be a bit more complex than estimating a kappa coefficient. So, what is the reason for the 40% response rate to items a and b? Was the reason a planned missing design, or just low response rates? When you say you had 100 raters in one group and 86 in the other, does that mean respondents or people rating the responses? If you have 1000 respondents and each response is rated by a proportion of raters from each of the groups, was each response rated by only a single individual per group or by multiple raters in each group?

I've not heard of anyone implementing this in Stata just yet, but if your case is more like this last example, you may want to consider something like a faceted Rasch model to estimate the respondent and rater item parameters jointly. You may also want to consider inter-/intra-class correlations to examine within- and between-group correlations; a sketch using the official icc command follows below.
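
As an illustration of that last suggestion, Stata's official icc command could be run on a long-format dataset with one row per question-rater pair, provided the answer categories can sensibly be scored numerically; all names and codes below are invented:

* toy long-format data: one row per (question, rater) pair with a
* numeric score standing in for the chosen answer category
clear
input question rater score
1 1 1
1 2 1
1 3 2
2 1 2
2 2 2
2 3 3
3 1 1
3 2 2
3 3 1
end

* two-way random-effects intraclass correlation: questions as targets,
* raters as the second facet
icc score question rater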
