inter-rater reliability and icc

CJ Wilson

Join Date: Jan 2019

Posts: 7
#1

10 Jan 2019, 10:02

Hi all,

I am trying to calculate inter-rater reliability with a complicated study and data structure. Below is a (fake) example that illustrates the structure for 2 targets:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte target str1 type byte rater str3 ph byte outcome
1 "A"  3 "MP1" 90
1 "B"  3 "MP1" 78
1 "A"  3 "MP2" 46
1 "B"  3 "MP2" 30
1 "A"  5 "MP1" 20
1 "B"  5 "MP1" 45
1 "A"  5 "MP2" 23
1 "B"  5 "MP2" 12
1 "A"  7 "MP1" 20
1 "B"  7 "MP1" 45
1 "A"  7 "MP2" 23
1 "B"  7 "MP2" 12
1 "A"  9 "MP1" 20
1 "B"  9 "MP1" 45
1 "A"  9 "MP2" 23
1 "B"  9 "MP2" 12
2 "A"  9 "MP1" 98
2 "B"  9 "MP1" 99
2 "A"  9 "MP2" 34
2 "B"  9 "MP2" 23
2 "A" 10 "MP1" 67
2 "B" 10 "MP1" 79
2 "A" 10 "MP2" 90
2 "B" 10 "MP2" 45
2 "A" 11 "MP1" 24
2 "B" 11 "MP1" 34
2 "A" 11 "MP2" 23
2 "B" 11 "MP2" 34
2 "A" 12 "MP1" 52
2 "B" 12 "MP1" 14
2 "A" 12 "MP2" 12
2 "B" 12 "MP2" 12
end

Characteristics of study:
Every target is rated by 4 raters. Note, the same set of raters does not rate each target.

Each rater rates ALL the data for 2 targets.
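One way to confirm this design in the data is to count distinct raters per target and distinct targets per rater. The following is only a sketch against the example data above; the variable names pair_tag, n_raters, and n_targets are made up for illustration.

```stata
* Flag exactly one observation per target-rater pair
egen byte pair_tag = tag(target rater)

* Number of distinct raters who scored each target
egen n_raters = total(pair_tag), by(target)

* Number of distinct targets scored by each rater
egen n_targets = total(pair_tag), by(rater)

tabulate target n_raters
tabulate rater n_targets
```

In the fake example, each target should show 4 raters, and only rater 9 appears under both targets; in the full study every rater would be expected to show 2 targets.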

My analysis model is below. Specifically, there are 3 random intercepts.

mixed outcome indeps || _all: R.rater || _all: R.target || _all: R.ph

Looking at the manual entry for the -icc- command, the available models include:
one-way random-effects model: In the one-way random-effects model, each target is rated by a different set of k independent raters, who are randomly drawn from the population of raters. The target is the only random effect in this model; the effects due to raters and possibly due to rater-and-target interaction cannot be separated from random error.

two-way random-effects model: In the two-way random-effects model, each target is rated by the same set of k independent raters, who are randomly drawn from the population of raters. The random effects in this model are target and rater and possibly their interaction, although in the absence of repeated measurements for each rater on each target, the effect of an interaction cannot be separated from random error.

Questions:
1. In the two-way random-effects model, each target is rated by the same set of raters, which does not seem to be true in this case, so I can't use that. In the one-way random-effects model, each target is rated by a different set of raters. In my study, however, each rater rates ALL the data for 2 targets, so each rater's data is linked to 2 targets. So that doesn't seem quite true either. My question: is it OK to use the one-way random-effects model here? Or does the design of this study make calculating inter-rater reliability impossible or inadvisable?

2. When calculating inter-rater reliability for a study with multiple outcome variables, does one typically calculate an inter-rater reliability score for each outcome measure? Or does one typically choose a single measure?

Thank you!
Clyde Schechter

Join Date: Apr 2014

Posts: 30179
#2

10 Jan 2019, 17:38

I think this structure is not suitable for the -icc- command.

Rather, I think after running your -mixed- model you can calculate an intraclass correlation directly as the variance component at the target level divided by the total of all variance components. This would not, strictly speaking, be an inter-rater reliability, because you also have another variance component at the level of variable ph. But it is properly in the spirit of reliability: it is the proportion of variance due to the target itself, and not to extraneous factors.
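A rough sketch of how that ratio might be computed directly from the fitted model, with a delta-method standard error, is given below. It assumes that -mixed- stores each variance component as a log standard deviation, and that the equation names lns1_1_1, lns2_1_1, lns3_1_1, and lnsig_e correspond to rater, target, ph, and the residual in the order the random-effects equations were typed. Those names are an assumption and should be checked against the listing of e(b) before relying on the result.

```stata
* Run immediately after the -mixed- model from #1.
* Inspect the stored parameter names first; the names used below are ASSUMED.
matrix list e(b)

* Each stored parameter is a log standard deviation, so exp(2*b) is a variance.
* Numerator: target variance. Denominator: sum of all four components.
nlcom (icc_target: exp(2*_b[lns2_1_1:_cons]) / ///
    (exp(2*_b[lns1_1_1:_cons]) + exp(2*_b[lns2_1_1:_cons]) + ///
     exp(2*_b[lns3_1_1:_cons]) + exp(2*_b[lnsig_e:_cons])))
```

The /// line continuations require running this from a do-file rather than typing it interactively.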

If you have several measures, each would warrant its own intraclass correlation assessment.

CJ Wilson

Join Date: Jan 2019

Posts: 7
#3

11 Jan 2019, 09:43

Thank you! This is very useful. Using this formula: var(R.target) / (var(R.rater) + var(R.target) + var(R.ph) + var(Residual)) and the output below, I get a value of 0.09 for one measure. On another measure, this value comes to 0.13.

My question: assuming I have used the right formula, does this indicate poor reliability since the values are so low?

Code:

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
_all: Identity               |
                var(R.rater) |   79.88473   15.98827      53.96383    118.2564
-----------------------------+------------------------------------------------
_all: Identity               |
               var(R.target) |   59.60948     17.466      33.56659    105.8579
-----------------------------+------------------------------------------------
_all: Identity               |
                   var(R.ph) |   42.60954   8.641743      28.63338    63.40757
-----------------------------+------------------------------------------------
               var(Residual) |   505.0235   6.432765      492.5716    517.7902
------------------------------------------------------------------------------
LR test vs. linear model: chi2(3) = 3110.89               Prob > chi2 = 0.0000
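
As a check on the arithmetic, the ratio can be reproduced by hand from the estimates in the table above. This is just a sketch; the scalar names are arbitrary.

```stata
* Variance components copied from the -mixed- output above
scalar v_rater  = 79.88473
scalar v_target = 59.60948
scalar v_ph     = 42.60954
scalar v_resid  = 505.0235

* Target variance divided by the sum of all variance components
scalar v_total = v_rater + v_target + v_ph + v_resid
scalar icc_target = v_target/v_total
display "ICC (target) = " %5.3f icc_target
```

This gives about 0.087, which rounds to the 0.09 quoted above.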


Martyn Sherriff

Join Date: Mar 2014

Posts: 120
#4

11 Jan 2019, 11:00

Is iccvar:

iccvar from http://fmwww.bc.edu/RePEc/bocode/i
'ICCVAR': module to calculate intraclass correlation (ICC) after xtmixed
iccvar is a post-estimation command for xtmixed. After fitting a 2, 3, or 4 level model with a random intercept (random slopes are not supported), iccvar will calculate the intraclass correlation (ICC)

of any use?
Clyde Schechter

Join Date: Apr 2014

Posts: 30179
#5

11 Jan 2019, 13:18

Re #3: Yes, you calculated correctly, and, yes, it indicates poor inter-rater reliability. Most of the variation in rating is attributable to noise (residual variance), and a bit to the rater and a bit to ph. Very little is attributable to the target itself.
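
To see how the total variance splits across the four components, something along these lines could be run, using the estimates posted in #3. It is only an illustration; the local macro names are arbitrary.

```stata
* Variance components copied from the -mixed- output in #3
local rater    79.88473
local target   59.60948
local ph       42.60954
local residual 505.0235
local total = `rater' + `target' + `ph' + `residual'

* Percentage of total variance attributable to each component
foreach part in rater target ph residual {
    display "`part'" _col(12) %5.1f 100*``part''/`total' "%"
}
```

With these numbers the residual accounts for roughly 73.5% of the total, rater for 11.6%, target for 8.7%, and ph for 6.2%.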

Re #4: Hard to say. I'm not really familiar with this command. But reviewing the help file, I think it will not do the job here. The original question relates to a model with crossed random effects. The help file for iccvar does not say anything about the nesting/crossing/multiple membership structure of the effects, and it has no special options to allow the user to specify this. So, I'm guessing that it is designed only for use with nested models. (Perhaps it is "smart" enough to read the syntax of the -mixed- command it follows and figure this out without user guidance; I don't know.)
CJ Wilson

Join Date: Jan 2019

Posts: 7
#6

14 Jan 2019, 08:20

Many thanks Clyde - this is very helpful!