Correlation between pairs of responses for one variable (icc and spearman)

Ian Ross

Join Date: May 2019

Posts: 4
#1

12 Dec 2019, 04:15

Hi Stata users!

I am using Stata 15.1 on Mac. I want to investigate the correlation between two people's responses on a variable. Each pair of people lives on the same "compound", and there are 152 pairs. This is an inter-rater reliability question, and each pair of people is different. I have tried to do this using the Stata commands icc and spearman, but I can't work out how to do it.

Please can a member advise on the two specific Qs below?

Data excerpt below this message.
- id is a unique id
- c2a is compound code
- p1 is a response to a question on a scale of always, sometimes, rarely, never (coded as 3,2,1,0)

1. Approach using ICC

When I code "icc p1 c2a if treatment == 1" I get an individual ICC of 0.05 (output below this message).
This makes no sense to me because 90% of respondents in the "treatment" group responded "always" for p1, so the icc should be relatively high.
This leads me to suspect I am doing something wrong with the icc command.
Can anyone advise?

2. Approach using Spearman's rank correlation coefficient

I considered restructuring the data by creating variables p1a and p1b (where p1a is person 1's response and p1b is person 2's), correlating those variables using the spearman command, and then taking the mean of Spearman's rho across the dataset. However, I can't work out the best way of doing the restructuring, and it doesn't strike me as the most robust way of investigating correlations between responses either. Doing the right ICC would be better.
Can anyone advise?

Many thanks!
Ian

     +---------------------------+
     |     id    c2a          p1 |
     |---------------------------|
224. | 300402   3004      always |
225. | 300401   3004       never |
226. | 300902   3009      always |
227. | 300901   3009       never |
229. | 301602   3016      always |
     |---------------------------|
230. | 301601   3016      always |
232. | 301901   3019      always |
233. | 301902   3019      always |
236. | 303001   3030   sometimes |
237. | 303002   3030   sometimes |
     +---------------------------+

Random effects: c2a              Number of targets =        80
                                 Number of raters  =         2

--------------------------------------------------------------
                    p1 |        ICC       [95% Conf. Interval]
-----------------------+--------------------------------------
            Individual |    .052705      -.1669207    .2676372
               Average |   .1001325      -.4007318    .4222615
--------------------------------------------------------------
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#2

12 Dec 2019, 15:11

> When I code "icc p1 c2a if treatment == 1" I get an individual ICC of 0.05 (output below this message).
>
> This makes no sense to me because 90% of respondents in the "treatment" group responded "always" for p1, so the icc should be relatively high.

I think you are working from a misunderstanding. Total response variance is the sum of variance between compounds and variance within compound. The ICC is the between compound variance divided by total variance. If you are saying that 90% of respondents in the treatment group responded always, that implies that the total variance is small. But it can still be partitioned any way between within compound and between compound: it does not constrain the ICC.
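
The partition described above can be inspected directly. The following is only a sketch, assuming the variable names from post #1 (p1, c2a, treatment) and that p1 is stored as a numeric variable coded 0-3. It fits a random-intercept model and reports the share of total variance that lies between compounds. Because it uses REML rather than the ANOVA estimator behind -icc-, the number need not match the -icc- output exactly.

```stata
* Sketch only: variable names are taken from post #1.
* Random intercept for compound; the variance components reported are
* between-compound (var(_cons)) and within-compound (var(Residual)).
mixed p1 if treatment == 1 || c2a:, reml

* ICC = between-compound variance / (between + within variance)
estat icc
```

If nearly everyone answers "always", both components are small, but their ratio can still be anywhere between 0 and 1.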

> I considered restructuring the data by creating variables p1a and p1b (where p1a is person 1's response and p1b is person 2's) and then correlating those variables using the spearman function, then taking the mean of the spearman's rho across the dataset. However, I can't work out the best way of doing the restructuring

In order to restructure the data in this way there needs to be a variable that designates which person is person 1 and which is person 2 within a compound. If you have such a variable, you have not shown it in your example. If you don't have such a variable, then the restructuring simply cannot be done. Assuming you have such a variable, and assuming it is called which_person, the restructuring would be done as follows:

Code:

reshape wide p1 id, i(c2a) j(which_person)
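
If no such variable exists but the last two digits of id number the people within a compound (as the 01/02 endings in the excerpt in post #1 suggest), it might be derived from id. This is a sketch under that assumption, and it also assumes id is numeric; which_person remains a hypothetical name.

```stata
* Assumption: id is numeric and ends in 01 or 02 within each compound.
generate which_person = mod(id, 100)
assert inlist(which_person, 1, 2)   // stops if the assumption is wrong
isid c2a which_person               // exactly one row per person per compound

* Keep only the variables being reshaped; any other variable that varies
* within c2a would make -reshape wide- stop with an error.
keep id c2a p1 which_person
reshape wide p1 id, i(c2a) j(which_person)

* The responses are now in p11 (person 1) and p12 (person 2).
spearman p11 p12
```

Note that -spearman- here gives a single rho across all compounds, so there is nothing to average afterwards.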

Now, bear in mind that a Spearman correlation (or a Pearson correlation) between person 1's and person 2's responses (p11 and p12 after the reshape above) is not quite analogous to an ICC. In the Spearman/Pearson correlation we have a distinction between person 1 and person 2. In the ICC there is no distinction between the two people--they are, in the technical jargon, exchangeable. The ICC is somewhat like doing a Spearman/Pearson correlation after randomly choosing which person is person 1 and which is person 2. That's not exactly right, but you wouldn't go far wrong to think of it that way.
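
The "random choice" intuition can be tried out, starting again from the long data (one row per person). This is an illustrative sketch, not a substitute for -icc-: the seed is arbitrary, u and which_person are hypothetical names, and the resulting correlation will change somewhat from one seed to another.

```stata
* Sketch: label the two people in each compound 1 and 2 at random.
set seed 12345                       // arbitrary; only for reproducibility
generate double u = runiform()
bysort c2a (u): generate which_person = _n

* Assumes exactly two people per compound; otherwise which_person
* would take values other than 1 and 2.
assert inlist(which_person, 1, 2)

keep id c2a p1 which_person
reshape wide p1 id, i(c2a) j(which_person)
spearman p11 p12
```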

In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
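
For the data in this thread, a call might look like the following (variable names as in post #1; the observation range is arbitrary):

```stata
* Paste the output of this command into the forum post.
dataex id c2a p1 treatment in 1/20
```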
Ian Ross

Join Date: May 2019

Posts: 4
#3

17 Dec 2019, 10:57

Dear Clyde,
Thanks - this was really helpful for getting my head around what I needed to do. Having thought about it a bit more, I reckon I should be using ICC(1,k), the "average" one, for the sample as a whole, not sub-samples. It gives a very similar coefficient to Spearman's correlation after reshaping wide in the way you suggest, which makes sense.
Thanks again,
Ian