Assessing agreement in several binary variables in longitudinal dataset

Tyler Wray

Join Date: Apr 2015
Posts: 19

18 Feb 2019, 08:50

Hi all! I've got a longitudinal dataset of binary "ratings" that participants provided each day via two different methods, and I'm trying to figure out the best way to assess day-level agreement between the two methods. The data are in long format over days: for each participant (id) and each of 30 days (studyday), each rating is stored in a pair of variables, one per method (liver1 and liver2; dbs1 and dbs2). The data look like this:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double id float(studyday svysubmitdate liver1 liver2) byte(dbs1 dbs2)
 9  1 20607 0 1 0 0
 9  2 20608 1 . 0 .
 9  3 20609 1 . 0 .
 9  4 20610 0 . 0 .
 9  5 20611 0 . 0 .
 9  6 20612 0 0 0 0
 9  7 20613 0 1 0 0
 9  8 20614 1 1 0 0
 9  9 20615 1 1 0 0
 9 10 20616 0 1 0 0
 9 11 20617 0 0 0 0
 9 12 20618 0 0 0 0
 9 13 20619 0 0 0 0
 9 14 20620 0 1 0 0
 9 15 20621 0 1 0 0
 9 16 20622 1 1 0 0
 9 17 20623 1 1 0 0
 9 18 20624 0 0 0 0
 9 19 20625 0 0 0 0
 9 20 20626 0 0 0 0
 9 21 20627 0 0 0 0
 9 22 20628 0 1 0 0
 9 23 20629 1 1 0 0
 9 24 20630 1 1 0 0
 9 25 20631 1 0 0 0
 9 26 20632 0 0 0 0
 9 27 20633 0 0 0 0
 9 28 20634 1 0 0 0
 9 29 20635 1 1 0 0
17  1 20612 0 1 0 0
17  2 20613 0 0 0 0
17  3 20614 0 0 1 0
17  4 20615 0 1 1 0
17  5 20616 0 1 0 0
17  6 20617 0 1 1 0
17  7 20618 0 1 0 0
17  8 20619 0 0 0 0
17  9 20620 0 0 0 0
17 10 20621 0 0 0 0
17 11 20622 1 1 0 0
17 12 20623 0 1 0 0
17 13 20624 1 0 0 0
17 14 20625 0 0 1 0
17 15 20626 0 0 0 0
17 16 20627 0 0 0 0
17 17 20628 0 0 0 0
17 18 20629 0 1 0 0
17 19 20630 0 1 0 0
17 20 20631 0 0 0 0
17 21 20632 0 0 1 0
17 22 20633 0 0 1 0
17 23 20634 0 0 1 0
17 24 20635 0 0 1 0
17 25 20636 0 0 0 0
17 26 20637 0 1 1 0
17 27 20638 0 1 0 0
17 28 20639 0 1 0 0
17 29 20640 0 0 0 0
41  1 20607 1 1 1 0
41  2 20608 1 . 1 .
41  3 20609 1 . 1 .
41  4 20610 0 . 0 .
41  5 20611 1 . 1 .
41  6 20612 1 1 1 0
41  7 20613 1 1 1 0
41  8 20614 1 1 1 0
41  9 20615 1 1 1 0
41 10 20616 1 0 1 0
41 11 20617 0 0 1 0
41 12 20618 1 1 1 0
41 13 20619 1 1 1 0
41 14 20620 1 1 1 0
41 15 20621 1 1 1 0
41 16 20622 0 1 1 0
41 17 20623 1 1 1 0
41 18 20624 0 1 0 0
41 19 20625 . 1 . 0
41 20 20626 1 1 1 0
41 21 20627 1 1 1 0
41 22 20628 0 1 1 0
41 23 20629 0 0 1 0
41 24 20630 . 1 . 0
41 25 20631 1 1 1 0
41 26 20632 1 1 1 0
41 27 20633 . 1 . 0
41 28 20634 1 1 1 0
41 29 20635 1 1 1 0
end
format %tdNN/DD/YY svysubmitdate

Basically, what I'd like to know is the extent to which the ratings provided via these two methods agree on a given day. But there are many ways of calculating ICCs/kappas, so I'm most curious about which might be best given these repeated-measures, longitudinal data. For example, one route might be to reshape long again to get a variable that reflects assessment method (1, 2) and calculate ICCs after a mixed model like:

Code:

reshape long liver dbs, i(id studyday) j(method)
xtmelogit liver || method: || id: , variance
estat icc

Or, maybe ICC(3) could be a decent fit, since I at least have a "random sample" and fixed raters? So something like:

Code:

 
kappaetc liver1 liver2, icc(mixed)

Of course, these two methods (as well as the more general icc command) produce wildly different estimates. So, any thoughts about what might be the best fit here? Or approaches I should explore a bit more? Thanks!


daniel klein

Join Date: Mar 2014

Posts: 3886
#2

18 Feb 2019, 11:13

I am not sure which approach might be the most appropriate but here are some thoughts on your current approach.

The kappaetc (from SSC or SJ) call completely ignores the nested structure of the data. Try adding the i() option that identifies repeatedly measured subjects.

Code:

kappaetc liver1 liver2 , icc(mixed) i(id)

Likewise, I believe your mixed (logit) model might lack one level: you have 2 methods, 3 individuals, and 29 study days. Your model, however, only includes the former two levels. Also, if you want the equivalent of ICC(3), that is, fixed raters, consider adding method to the fixed part of your model.

Obviously, kappaetc estimates a linear model, while xtmelogit estimates a non-linear model. The latter might be more appropriate, given your binary ratings. However, note that results get a lot closer when you compare

Code:

kappaetc liver1 liver2 , icc(mixed) i(id)

with

Code:

xtmixed liver i.method || id : , reml

In more recent versions of Stata, you want to use melogit and mixed instead of the xt* commands.
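As a rough illustration of that last point, here is a minimal sketch of the two comparison models written with the newer command names. It runs on simulated stand-in data, not the poster's data: the seed, the sample size of 50, and the helper variable u are assumptions made only so the block is self-contained.

```stata
* Sketch only: simulated stand-in data (hypothetical), wide over method
clear
set seed 12345
set obs 50
generate int id = _n
generate double u = rnormal()              // subject-level propensity (assumed)
expand 29
bysort id: generate byte studyday = _n
generate byte liver1 = (u + rnormal()) > 0 // method 1 rating
generate byte liver2 = (u + rnormal()) > 0 // method 2 rating
drop u

* Long form: one row per id x studyday x method
reshape long liver, i(id studyday) j(method)

* Linear model with method as a fixed effect (mixed replaces xtmixed)
mixed liver i.method || id: , reml
estat icc

* Logit analogue (melogit replaces xtmelogit)
melogit liver i.method || id:
estat icc
```

With real data, the reshape step would start from the -dataex- example in #1 instead of the simulated variables. Adding a further level for day within participant (for example, || studyday: after || id:) may be one way to address the missing level mentioned above, but whether that is well identified with only two binary ratings per day is something to check rather than assume.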

Best
Daniel
Tyler Wray

Join Date: Apr 2015

Posts: 19
#3

18 Feb 2019, 18:45

Ah, this is fantastic, Daniel, thank you! Didn't know I could specify i(id) within kappaetc, so I apologize for not reading your help file closely enough. In any case, thank you again for all of the help you offer on here! I've benefited from reading quite a few of your answers by now!

Joseph Coveney

Join Date: Apr 2014
Posts: 4449

18 Feb 2019, 22:48

Originally posted by Tyler Wray

I'm trying to figure out the best way to assess day-level agreement
. . . to what extent do . . . methods agree on a given day.

So, wouldn't you want to put day in, too?

Code:

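version 15.1

clear *

// 250 simulated participants, each with a standard-normal random intercept
set seed `=strreverse("1483418")'
quietly set obs 250
generate int pid = _n
generate double pid_u = rnormal()

// 29 days per participant
quietly expand 29
bysort pid: generate byte day = _n

// 2 methods per participant-day
quietly expand 2
bysort pid day: generate byte mtd = _n

// Latent rating: no method or day effects; unit-variance intercept and residual
generate double lat = 0 * mtd + 0 * day + ///
    pid_u + rnormal() // ICC = 0.5

// Latent
mixed lat i.mtd i.day || pid: , reml nolrtest nolog
estat icc

// Manifest (binary rating: latent score dichotomized at zero)
generate byte mft = lat > 0
meprobit mft i.mtd i.day || pid: , nolrtest nolog
estat icc

exit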
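The `// ICC = 0.5` comment in the simulation can be checked by hand. Both pid_u and the residual are drawn with rnormal(), so each has variance 1. The symbols \sigma_u^2 and \sigma_e^2 below are notation introduced here for those two variances, not names used in the code.

```latex
\mathrm{ICC}_{\text{latent}}
  = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
  = \frac{1}{1 + 1}
  = 0.5
```

For the probit model, the residual variance on the latent scale is fixed at 1, so the same expression applies with \sigma_e^2 = 1. If meprobit recovers a random-intercept variance near 1 from the dichotomized ratings, its estat icc should therefore land near the same 0.5, which appears to be the point of running both models side by side.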
