Calculating ICC

Katrina Anderson

Join Date: Aug 2019

Posts: 16
#1

08 Dec 2019, 17:03

Hi statalist,

I am trying to understand how to calculate the ICC to assess agreement between and within two raters.

I have two raters scoring facial expressions in 38 horses (10 photos per horse = 380 photos).

What is my:
1. Dependent variable
2. Target variable
3. Rater variable

Any advice towards calculation or preparation of data would be much appreciated.

Thank you,
Katrina

Clyde Schechter

Join Date: Apr 2014

Posts: 30119
#2

08 Dec 2019, 17:18

Your dependent variable is the numerical score given to a horse by a rater. The target variable is a variable that identifies horses. The rater variable is a variable that identifies raters. Each observation in your data set should correspond to a single rating of a single horse by one rater and contain all three of these variables.
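As a minimal sketch of that layout, assuming hypothetical variable names score, horse, and rater and made-up ratings (not the poster's data), Stata's built-in icc takes the dependent variable first, then the target, then the rater:

```stata
* Made-up example: 4 horses, each scored once by both raters (long format)
clear
input byte(horse rater score)
1 1 2
1 2 2
2 1 0
2 2 1
3 1 1
3 2 1
4 1 2
4 2 1
end

* Syntax: icc depvar target rater
* With a rater variable, the two-way random-effects model is the default
icc score horse rater
```

Each row is one rating of one horse by one rater, so the three roles map directly onto three columns.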

daniel klein

Join Date: Mar 2014

Posts: 3860
#3

08 Dec 2019, 23:13

For intra-rater reliability, see kappaetc (SSC or SJ) with option icc(). It requires a different data setup but handles repeated measures.

If you need more assistance, post an excerpt of your data or example data that resembles it.

Best
Daniel

Marcos Almeida

Join Date: Apr 2014
Posts: 4047
#4

09 Dec 2019, 04:58

daniel klein By the way, I recently estimated the ICC using the alpha command. I have 3 measurements of a continuous variable: 2 were done by rater 1 and the last one by rater 3.

The ICCs are quite similar, but not exactly the same. Apart from the advantage of getting CIs from the user-written kappaetc, I'm wondering whether both results are trustworthy or whether I should prefer the latter.

Code:

. alpha var_A1 var_A2 var_A3

Test scale = mean(unstandardized items)

Average interitem covariance:     609.2527
Number of items in the scale:            3
Scale reliability coefficient:      0.9969

. kappaetc var_A1 var_A2 var_A3, icc(oneway)

Interrater reliability                           Number of subjects =      32
One-way random-effects model                    Ratings per subject =       3
------------------------------------------------------------------------------
               |   Coef.     F     df1     df2      P>F   [95% Conf. Interval]
---------------+--------------------------------------------------------------
      ICC(1,1) |  0.9909 326.55    31.00   64.00   0.000    0.9837     0.9952
---------------+--------------------------------------------------------------
       sigma_s | 24.6834
       sigma_e |  2.3695
------------------------------------------------------------------------------

Best regards,

Marcos


daniel klein

Join Date: Mar 2014

Posts: 3860
#5

09 Dec 2019, 06:00

Marcos

Are you sure that Cronbach's alpha is supposed to be equivalent to ICC(1,1)? I do recall that there is (under certain conditions) an equivalence between alpha and one of the ICC formulations but I cannot remember which one that is (and lack the time to dive into this right now). Perhaps someone else can help here?

Note that kappaetc does not estimate the average versions of ICC.
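One way to check this, as a sketch rather than a definitive answer: Cronbach's alpha corresponds to the consistency-of-agreement ICC for the average of the k ratings under a two-way model (ICC(3,k) in Shrout and Fleiss's notation), which Stata's built-in icc reports in its "Average" row. The subject identifier id below is an assumption, and the three ratings are assumed to sit in var_A1 to var_A3 in wide format with no missing values.

```stata
* Assumes one row per subject with var_A1 var_A2 var_A3 (wide format)
* id is a hypothetical subject identifier created for the reshape
generate long id = _n
reshape long var_A, i(id) j(rater)

* Two-way mixed-effects model, consistency of agreement;
* compare the "Average" row with the scale reliability coefficient from alpha
icc var_A id rater, mixed consistency
```

The "Individual" row of the same output is the single-rating version, which is why a single-rating ICC such as ICC(1,1) need not match alpha.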

Best
Daniel

Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#6

09 Dec 2019, 06:47

Thank you for the reply, daniel klein. I thought they were, to some extent, similar, but I must be wrong.

Best regards,

Marcos

Katrina Anderson

Join Date: Aug 2019
Posts: 16
#7

15 Dec 2019, 16:38

Thank you for your help, Clyde Schechter and Marcos Almeida.

I used

Code:

icc var1 Slidenumber observation if observation ~=3

for intra-rater reliability.

Yes daniel klein I have just learnt about

Code:

 kappaetc

but I am not 100% sure on how to use it.
I will attach some of my data

Slide number	1	2	3	4	5	6	Observation
21	1	0	0	2	1	0	1 (1a)
293	0	0	0	0	1	0	1
229	2	2	1	0	1	0	1
308	2	0	2	0	1	1	1
15	1	0	0	1	0	1	1
39	2	2	2	1	2	1	1
216	0	0	0	0	1	0	1
21	1	2	0	0	1	0	2 (1b)
293	0	0	1	1	1	0	2
229	2	1	1	2	0	0	2
308	1	1	2	0	1	1	2
15	0	0	0	1	0	0	2
39	2	2	1	2	2	1	2
216	0	1	0	1	0	0	2
21	1	1	1	1	1	1	3 (2a)
293	2	1	0	0	1	0	3
229	0	0	1	1	1	0	3
15	0	1	1	0	1	0	3
39	2	1	1	2	1	1	3

This is an example of my data.
Slidenumber identifies the individual photo that is scored. There are 380 slides in total (10 per horse).
The 6 variables are the different descriptions scored for each photo. Observation identifies the rater and the occasion: 1 and 2 are the first observer's first and repeated observations, and observation == 3 is the second observer's first observation.
I hope this makes sense. Perhaps there is a better way to organise my data?


Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2403
#8

15 Dec 2019, 19:52

I haven't used daniel klein's useful program, but the typical data set up for reliability testing requires data to be in a "long" format. This is what Clyde Schechter referred to in #2. This looks like:

Code:

subject_id rater replicate score
1          1     1         ...
1          1     2         ...
1          1     3         ...
1          2     1         ...
1          2     2         ...
1          2     3         ...
2          1     1         ...
...

Here subject_id is your horse identifier, rater is the judge, replicate indexes each measurement (in your case, 1 to 10) and score refers to the "outcome" value from each rater.

daniel klein

Join Date: Mar 2014

Posts: 3860
#9

16 Dec 2019, 04:26

Stata's icc will not handle repeated measurements by the same rater (intra-rater reliability). If you want that, you will have to get the respective ICC values from the corresponding mixed model. kappaetc handles repeated measurements but requires data in wide format (one variable per observer/rater).
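A minimal sketch of the mixed-model route for one rater's repeated scores, assuming long-format data with one row per slide and replicate. The variable names score and rater are hypothetical and do not appear in the thread's data; the model treats the scores as continuous.

```stata
* Random intercept for each slide, using only the rater who scored twice
mixed score if rater == 1 || Slidenumber: , reml

* Intraclass correlation implied by the estimated variance components
estat icc
```

This gives a one-way style ICC (between-slide variance over total variance); it is one possible formulation, not necessarily the one kappaetc reports.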

Also, your setup seems to be more complex than the initial post implies. Not only do you have 10 slides nested in 38 horses but you also have 6 descriptions per photo. How are you planning to analyze those? Note that neither icc nor kappaetc will deal with more than two levels (subjects and raters). From what you write in #7 it appears as if you were willing to treat each photo as an independent observation/subject to be rated. That is fine, but it is not what you implied in the initial post.

Another thing that I have noticed from the excerpt of your data (please use dataex for that next time) is that the scores appear to be categorical in nature, ranging from 0 to 2. I am not sure that ICC (which is based on variances) is the most appropriate measure for this kind of data.

Irrespective of the approach you will end up using, I would first recode the observation variable that is supposed to (but does not seem to) identify the observers/raters. If 1 and 2 both identify the same observer/rater then it really should be the same identifier, either 1 or 2 but not both. The same is true for the horse-identifier (which is not even part of the data excerpt so far).
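A sketch of such a recode, based on the coding described in #7. The new variable names rater and replicate are assumptions introduced here for illustration.

```stata
* observation: 1 = observer 1, first pass; 2 = observer 1, second pass;
*              3 = observer 2, first pass (coding described in #7)
generate byte rater     = cond(observation == 3, 2, 1) if !missing(observation)
generate byte replicate = cond(observation == 2, 2, 1) if !missing(observation)

* Check that each original code maps to the intended rater
tabulate observation rater
```

Separating who rated from which pass it was keeps observer 1 under a single identifier.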

Best
Daniel

Katrina Anderson

Join Date: Aug 2019

Posts: 16
#10

16 Dec 2019, 21:31

daniel klein I have converted the data to wide format so it now looks like this.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte HorseID int Slidenumber str19 Name byte(StifflyEarsBackward11 OrbitalTightening11 Tensionabovetheeye11 StifflyEarsBackward21 OrbitalTightening21 Tensionabovetheeye21 StifflyEarsBackward12 OrbitalTightening12 Tensionabovetheeye12)
<redacted>
end

StifflyEarsBackward11 = observer 1 / observation 1
StifflyEarsBackward21 = observer 2 / observation 1
StifflyEarsBackward12 = observer 1 / observation 2

Last edited by sladmin; 20 Jun 2022, 07:26. Reason: Personally identifiable information removed.

daniel klein

Join Date: Mar 2014
Posts: 3860

#11

17 Dec 2019, 03:46

Obviously, I was not clear enough; sorry. If you are willing to treat each slide as an independent observation (i.e., ignore the nesting in horses) then you would need two variables that identify the observers/raters. The repeated measures would be recorded as separate observations (rows). So in a way, the observers/raters are in wide format while the repeated measures are still in long format.

Here, I draw on your example data

Code:

// reshape the data
rename *(#)(#) *(#)_(#)
reshape long ///
    StifflyEarsBackward1_ OrbitalTightening1_ Tensionabovetheeye1_ ///
    StifflyEarsBackward2_ OrbitalTightening2_ Tensionabovetheeye2_ ///
    , i(Slidenumber)

// estimate ICC (mixed, assuming raters are fixed)
kappaetc StifflyEarsBackward* , icc(mixed) i(Slidenumber)

The code yields

Code:


Inter/Intrarater reliability                     Number of subjects =     100
Two-way mixed-effects model                Ratings per subject: min =       2
                                                                avg =    2.92
                                                                max =       3
                                             Replicates per subject =       2
------------------------------------------------------------------------------
               |   Coef.     F     df1     df2      P>F   [95% Conf. Interval]
---------------+--------------------------------------------------------------
      ICC(3,1) |  0.4763   5.41    99.00   99.00   0.000    0.5208     0.7308
    ICC_a(3,1) |  0.7928   4.28   163.43   92.00   0.000    0.6672     0.8398
---------------+--------------------------------------------------------------
       sigma_s |  0.4792
      sigma_sr |  0.2393
       sigma_e |  0.2739
------------------------------------------------------------------------------
Note: F test and confidence intervals are based on methods for complete data.

where ICC_a denotes intra-rater reliability.

It appears as if only one of the two observers scores the slides repeatedly. Obviously, results for intra-rater reliability might not generalize well to other raters.

Best
Daniel


Katrina Anderson

Join Date: Aug 2019
Posts: 16

#12

05 Feb 2020, 20:21

daniel klein, thank you for your help.
I have reshaped the data as you said, but I am not 100% sure it is correctly formatted for

Code:

kappaetc

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte observation int Slidenumber byte(HorseID StifflyEarsBackward1 StifflyEarsBackward2 OrbitalTightening1 OrbitalTightening2 Tensionabovetheeye1 Tensionabovetheeye2)
1   1 19 0 . 0 .   2 .
1   2 27 0 . 1 .   0 .
1   3 22 2 . 0 .   2 .
1   4 14 1 . 0 .   0 .
1   6 34 1 . 2 .   0 .
1   7 35 0 . 0 .   0 .
1   8 14 1 . 0 .   0 .
1   9 25 0 . 0 .   0 .
1  10 36 0 . 0 .   0 .
1  12 28 2 . 1 .   0 .
1  13  8 0 . 2 .   0 .
1  14 27 1 . 1 . 100 .
1  15  3 0 . 0 .   1 .
1  16 16 0 . 0 .   0 .
1  17 28 2 . 0 .   0 .
1  19 33 1 . 0 .   1 .
1  20 20 2 . 0 .   0 .
1  21 31 1 . 1 .   1 .
1  22 17 1 . 0 .   1 .
1  25 36 0 . 0 .   0 .
1  26 15 0 . 0 .   0 .
1  27 19 2 . 0 .   0 .
1  28 10 0 . 0 .   0 .
1  29 32 0 . 0 .   1 .
1  30 36 0 . 0 .   2 .
1  32 23 1 . 0 .   1 .
1  33 25 0 . 0 .   0 .
1  34  6 0 . 1 .   2 .
1  35 37 0 . 0 .   1 .
1  36 19 0 . 1 .   0 .
1  37  8 0 . 1 .   0 .
1  38 37 1 . 0 .   1 .
1  39 32 0 . 0 .   0 .
1  40 36 0 . 0 .   1 .
1  41 11 1 . 0 . 100 .
1  42 28 1 . 0 .   1 .
1  43 28 1 . 0 .   0 .
1  44  5 0 . 0 .   0 .
1  45 17 0 . 0 .   2 .
1  46 11 0 . 0 .   1 .
1  47 30 2 . 0 .   1 .
1  48 11 0 . 1 .   2 .
1  49 37 0 . 0 .   0 .
1  50 33 0 . 0 .   0 .
1  51 15 0 . 0 .   1 .
1  52 20 2 . 0 .   0 .
1  53 29 0 . 0 .   0 .
1  54 34 2 . 0 .   0 .
1  55 26 0 . 0 .   0 .
1  56 24 1 . 0 .   0 .
1  57 13 0 . 1 .   0 .
1  58 19 2 . 0 .   1 .
1  59 30 1 . 1 .   1 .
1  60 36 1 . 2 .   0 .
1  61 30 0 . 0 .   0 .
1  62  1 0 . 1 .   2 .
1  63 28 2 . 1 .   1 .
1  64 18 0 . 1 . 100 .
1  65 18 1 . 1 .   2 .
1  66  5 0 . 0 .   0 .
1  67 33 0 . 1 .   0 .
1  68 13 2 . 1 .   0 .
1  69 10 1 . 0 .   0 .
1  70 13 1 . 0 .   1 .
1  71 14 1 . 1 .   1 .
1  72 22 1 . 0 .   2 .
1  73 23 2 . 0 .   1 .
1  74 15 1 . 0 .   0 .
1  75 16 0 . 0 .   0 .
1  76 21 1 . 0 .   1 .
1  77 27 1 . 1 .   0 .
1  79 30 0 . 1 .   0 .
1  80 16 0 . 0 .   2 .
1  81 34 0 . 0 .   0 .
1  82  5 0 . 0 .   2 .
1  83  3 0 . 0 .   2 .
1  84  4 0 . 1 .   0 .
1  86 12 1 . 2 .   1 .
1  87 12 1 . 0 . 100 .
1  88  1 1 . 0 .   2 .
1  89 22 1 . 0 .   2 .
1  90  4 1 . 0 .   0 .
1  91  6 1 . 0 .   0 .
1  92 27 0 . 2 .   0 .
1  93 35 1 . 0 .   0 .
1  95 24 2 . 0 .   1 .
1  97 23 2 . 2 .   1 .
1  98 21 0 . 0 .   1 .
1  99 13 1 . 1 .   1 .
1 100 17 2 . 1 .   2 .
1 101 20 2 . 0 .   0 .
1 102  3 0 . 0 .   1 .
1 103 36 0 . 0 .   0 .
1 104  3 2 . 0 .   2 .
1 105 18 2 . 0 .   1 .
1 106 33 0 . 2 .   0 .
1 107 15 0 . 0 .   0 .
1 108 26 2 . 1 .   1 .
1 109 33 0 . 0 .   0 .
1 110 36 1 . 0 .   0 .
end

This continues for observer 1 (740 rows of data). For observer 2, the data are stored in the variables ending in 2 (StifflyEarsBackward2 OrbitalTightening2 Tensionabovetheeye2 Prominentchewingmuscles2), so the data from observer 1 and observer 2 are not stacked directly below one another. Does that make sense?
StifflyEarsBackward1 holds observer 1's first observations (n=360) and second observations (n=380); the subject is therefore Slidenumber.

My question is basically: can I use

Code:

kappaetc

correctly with the data in this format.

Side note: I have missing values, which I have coded as 100 in the data. I could change them to "." instead. How does kappaetc deal with missing values?
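For reference, Stata treats 100 as a real score unless it is recoded, which would inflate the variances on a 0 to 2 scale. A sketch of the recode with the built-in mvdecode, using the variable names from the dataex listing above:

```stata
* Recode the placeholder 100 to system missing (.) in all score variables
mvdecode StifflyEarsBackward* OrbitalTightening* Tensionabovetheeye*, mv(100)

* Confirm that only 0, 1, 2 and missing remain
tabulate Tensionabovetheeye1, missing
```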

Thank you so much,
Katrina


Katrina Anderson

Join Date: Aug 2019
Posts: 16

#13

05 Feb 2020, 20:34

Code:

. kappaetc  StifflyEarsBackward*, icc (mixed) i(Slidenumber)

Inter/Intrarater reliability                     Number of subjects =     380
Two-way mixed-effects model                Ratings per subject: min =       3
                                                                avg =  3.9474
                                                                max =       4
                                        Replicates per subject: min =       3
                                                                avg =  3.9474
                                                                max =       4
------------------------------------------------------------------------------
               |   Coef.     F     df1     df2      P>F   [95% Conf. Interval]
---------------+--------------------------------------------------------------
      ICC(3,1) | -0.1492   1.04   379.00  379.00   0.334    0.0000     0.0950
    ICC_a(3,1) |  0.4922   2.28   690.11  740.00   0.000    0.4231     0.5541
---------------+--------------------------------------------------------------
       sigma_s |  4.9903
      sigma_sr |  6.8240
       sigma_e |  8.5864
------------------------------------------------------------------------------
Confidence interval is clipped at the lower limit.
Note: F test and confidence intervals are based on methods for complete data.

Would this be correct, daniel klein?

Thank you!!


Katrina Anderson

Join Date: Aug 2019
Posts: 16

#14

06 Feb 2020, 21:09

No worries, I have actually worked it out. I generated another variable coding observations 1 and 2 as A and B, then generated a further variable by concatenating Slidenumber with the A/B variable. Using this to reshape to wide worked.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int Slidenumber float obs_12 str4 obs_AB byte(StifflyEarsBackward1 OrbitalTightening1 StifflyEarsBackward2 OrbitalTightening2)
100 1 "100A" 2 1 2 1
100 2 "100B" 2 0 2 0
101 1 "101A" 2 0 1 .
101 2 "101B" 2 0 1 0
102 1 "102A" 0 0 1 1
102 2 "102B" 0 0 0 0
103 1 "103A" 0 0 0 1
103 2 "103B" 0 0 0 0
104 1 "104A" 2 0 1 0
104 2 "104B" 2 0 1 0
105 1 "105A" 2 0 2 .
105 2 "105B" 2 1 2 0
106 1 "106A" 0 2 1 2
106 2 "106B" 0 2 1 2
107 1 "107A" 0 0 1 1
107 2 "107B" 0 0 1 0
108 1 "108A" 2 1 2 1
108 2 "108B" 2 1 2 1
109 1 "109A" 0 0 0 0
109 2 "109B" 0 0 0 0
 10 1 "10A"  0 0 1 0
 10 2 "10B"  0 0 1 0
110 1 "110A" 1 0 1 0
110 2 "110B" 1 0 1 0
111 1 "111A" . . 0 0
111 2 "111B" 0 0 0 0
112 1 "112A" 0 1 1 .
112 2 "112B" 0 1 1 1
113 1 "113A" 0 0 1 0
113 2 "113B" 0 0 0 0
114 1 "114A" 1 0 1 1
114 2 "114B" 1 0 1 0
115 1 "115A" 0 0 1 .
115 2 "115B" 0 0 1 0
116 1 "116A" 1 0 1 0
116 2 "116B" 1 0 1 0
117 1 "117A" 2 1 2 1
117 2 "117B" 2 1 2 1
118 1 "118A" 0 0 1 0
118 2 "118B" 0 0 1 0
119 1 "119A" 1 0 1 0
119 2 "119B" 0 0 1 0
 11 1 "11A"  . . 1 0
 11 2 "11B"  0 0 1 0
120 1 "120A" 2 1 2 1
120 2 "120B" 1 1 1 1
121 1 "121A" 0 1 1 1
121 2 "121B" 0 1 1 1
122 1 "122A" 1 0 1 0
122 2 "122B" 1 0 1 0
124 1 "124A" 2 1 1 2
124 2 "124B" 2 2 1 2
125 1 "125A" 0 0 0 0
125 2 "125B" 0 0 0 0
126 1 "126A" 2 0 1 0
126 2 "126B" 2 0 1 0
127 1 "127A" 1 0 1 1
127 2 "127B" 1 1 1 1
128 1 "128A" 1 0 1 0
128 2 "128B" 1 0 1 0
129 1 "129A" 2 0 1 0
129 2 "129B" 2 0 2 0
 12 1 "12A"  2 1 1 0
 12 2 "12B"  1 0 1 0
131 1 "131A" 1 0 1 0
131 2 "131B" 1 0 1 0
132 1 "132A" 0 0 1 0
132 2 "132B" 0 0 1 0
133 1 "133A" 2 1 2 1
133 2 "133B" 2 1 2 1
134 1 "134A" 0 0 1 1
134 2 "134B" 0 0 1 0
135 1 "135A" 1 0 1 0
135 2 "135B" 0 0 1 0
136 1 "136A" 0 0 0 1
136 2 "136B" 0 1 0 1
137 1 "137A" 1 0 1 0
137 2 "137B" 0 0 1 0
138 1 "138A" 1 0 1 0
138 2 "138B" 1 0 1 0
139 1 "139A" 1 0 1 0
139 2 "139B" 0 0 1 0
 13 1 "13A"  0 2 1 1
 13 2 "13B"  1 1 1 1
140 1 "140A" 1 1 1 0
140 2 "140B" 0 0 1 0
141 1 "141A" 2 0 2 0
141 2 "141B" 2 0 1 0
142 1 "142A" 2 1 1 0
142 2 "142B" 2 1 2 .
143 1 "143A" . . 2 1
143 2 "143B" 1 1 1 1
144 1 "144A" 1 0 1 0
144 2 "144B" 0 0 1 0
145 1 "145A" 2 0 2 1
145 2 "145B" 2 1 1 1
146 1 "146A" 0 0 1 .
146 2 "146B" 0 0 1 0
147 1 "147A" 0 0 1 .
147 2 "147B" 0 1 1 .
end
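The two generate steps described in this post might look like the following sketch, run on data that does not yet contain obs_AB. The helper variable ab is a hypothetical name; the reshape call itself is not shown in the thread and is not reconstructed here.

```stata
* "A" for the first observation, "B" for the second
generate str1 ab = cond(obs_12 == 1, "A", "B") if !missing(obs_12)

* Concatenate slide number and letter, e.g. 100 and "A" give "100A"
generate obs_AB = string(Slidenumber) + ab
```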
