  • Assessing agreement in several binary variables in longitudinal dataset

    Hi all! I have a longitudinal dataset of binary "ratings" that participants provided each day via two different methods, and I'm trying to figure out the best way to assess day-level agreement between the two methods. The data have one row per participant per day: each rating is recorded in a pair of variables, one per method (liver1 and liver2; dbs1 and dbs2), each day for 30 days (studyday) by each participant (id). The data look like this:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double id float(studyday svysubmitdate liver1 liver2) byte(dbs1 dbs2)
     9  1 20607 0 1 0 0
     9  2 20608 1 . 0 .
     9  3 20609 1 . 0 .
     9  4 20610 0 . 0 .
     9  5 20611 0 . 0 .
     9  6 20612 0 0 0 0
     9  7 20613 0 1 0 0
     9  8 20614 1 1 0 0
     9  9 20615 1 1 0 0
     9 10 20616 0 1 0 0
     9 11 20617 0 0 0 0
     9 12 20618 0 0 0 0
     9 13 20619 0 0 0 0
     9 14 20620 0 1 0 0
     9 15 20621 0 1 0 0
     9 16 20622 1 1 0 0
     9 17 20623 1 1 0 0
     9 18 20624 0 0 0 0
     9 19 20625 0 0 0 0
     9 20 20626 0 0 0 0
     9 21 20627 0 0 0 0
     9 22 20628 0 1 0 0
     9 23 20629 1 1 0 0
     9 24 20630 1 1 0 0
     9 25 20631 1 0 0 0
     9 26 20632 0 0 0 0
     9 27 20633 0 0 0 0
     9 28 20634 1 0 0 0
     9 29 20635 1 1 0 0
    17  1 20612 0 1 0 0
    17  2 20613 0 0 0 0
    17  3 20614 0 0 1 0
    17  4 20615 0 1 1 0
    17  5 20616 0 1 0 0
    17  6 20617 0 1 1 0
    17  7 20618 0 1 0 0
    17  8 20619 0 0 0 0
    17  9 20620 0 0 0 0
    17 10 20621 0 0 0 0
    17 11 20622 1 1 0 0
    17 12 20623 0 1 0 0
    17 13 20624 1 0 0 0
    17 14 20625 0 0 1 0
    17 15 20626 0 0 0 0
    17 16 20627 0 0 0 0
    17 17 20628 0 0 0 0
    17 18 20629 0 1 0 0
    17 19 20630 0 1 0 0
    17 20 20631 0 0 0 0
    17 21 20632 0 0 1 0
    17 22 20633 0 0 1 0
    17 23 20634 0 0 1 0
    17 24 20635 0 0 1 0
    17 25 20636 0 0 0 0
    17 26 20637 0 1 1 0
    17 27 20638 0 1 0 0
    17 28 20639 0 1 0 0
    17 29 20640 0 0 0 0
    41  1 20607 1 1 1 0
    41  2 20608 1 . 1 .
    41  3 20609 1 . 1 .
    41  4 20610 0 . 0 .
    41  5 20611 1 . 1 .
    41  6 20612 1 1 1 0
    41  7 20613 1 1 1 0
    41  8 20614 1 1 1 0
    41  9 20615 1 1 1 0
    41 10 20616 1 0 1 0
    41 11 20617 0 0 1 0
    41 12 20618 1 1 1 0
    41 13 20619 1 1 1 0
    41 14 20620 1 1 1 0
    41 15 20621 1 1 1 0
    41 16 20622 0 1 1 0
    41 17 20623 1 1 1 0
    41 18 20624 0 1 0 0
    41 19 20625 . 1 . 0
    41 20 20626 1 1 1 0
    41 21 20627 1 1 1 0
    41 22 20628 0 1 1 0
    41 23 20629 0 0 1 0
    41 24 20630 . 1 . 0
    41 25 20631 1 1 1 0
    41 26 20632 1 1 1 0
    41 27 20633 . 1 . 0
    41 28 20634 1 1 1 0
    41 29 20635 1 1 1 0
    end
    format %tdNN/DD/YY svysubmitdate
    What I'd like to know is to what extent the ratings provided via these two methods agree on a given day. There are many ways of calculating ICCs and kappas, so I'm most curious about which one fits best given the repeated-measures, longitudinal design. For example, one route might be to reshape long again to get a variable that reflects assessment method (1, 2) and calculate ICCs after a mixed model like:

    Code:
    reshape long liver dbs, i(id studyday) j(method)
    xtmelogit liver || method: || id: , variance
    estat icc
    Or, maybe ICC(3) could be a decent fit, since I at least have a "random sample" and fixed raters? So something like:

    Code:
     
     kappaetc liver1 liver2, icc(mixed)
    Of course, these two methods (as well as the more general icc command) produce wildly different estimates. Any thoughts about which might be the best fit here, or about approaches I should explore further? Thanks!

  • #2
    I am not sure which approach is the most appropriate, but here are some thoughts on your current approach.

    The kappaetc (from SSC or SJ) call completely ignores the nested structure of the data. Try adding the i() option that identifies repeatedly measured subjects.

    Code:
    kappaetc liver1 liver2 , icc(mixed) i(id)
    Likewise, I believe your mixed (logit) model might lack one level: you have 2 methods, 3 individuals, and 29 study days. Your model, however, includes only the first two of these. Also, if you want the equivalent of ICC(3), that is, fixed raters, consider adding method to the fixed part of your model.
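
    A minimal sketch of that suggestion, run on simulated data rather than the data posted above: method enters the fixed part, and study days are nested within persons so that all three levels appear. The variable names u_id and u_day, the seed, the sample size, and the variance values are made up for illustration, and melogit stands in for xtmelogit.

    ```stata
    * Simulated long-format data (hypothetical values, not the posted data)
    clear
    set seed 12345
    set obs 50
    generate int id = _n
    generate double u_id = rnormal()            // person-level random intercept
    expand 29
    bysort id: generate byte studyday = _n
    generate double u_day = rnormal(0, 0.5)     // day-within-person random intercept
    expand 2
    bysort id studyday: generate byte method = _n
    generate byte liver = runiform() < invlogit(u_id + u_day)

    * Fixed raters: method in the fixed part; days nested within persons
    melogit liver i.method || id: || studyday:
    estat icc
    ```

    After a three-level model like this, estat icc reports one residual intraclass correlation per level. The one labelled studyday|id should be the correlation between the two methods' latent responses on the same day for the same person, which is the day-level quantity asked about.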

    Obviously, kappaetc estimates a linear model, while xtmelogit estimates a non-linear model. The latter might be more appropriate, given your binary ratings. However, note that results get a lot closer when you compare

    Code:
    kappaetc liver1 liver2 , icc(mixed) i(id)
    with

    Code:
    xtmixed liver i.method || id : , reml
    In more recent versions of Stata, you want to use melogit and mixed instead of the xt* commands.
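
    For example, the linear model above and its logistic counterpart might look as follows in Stata 13 or later. This is only a sketch on simulated data; the seed, the sample size, and the variable u are made up for illustration.

    ```stata
    * Hypothetical long-format data: 40 persons x 29 days x 2 methods
    clear
    set seed 2718
    set obs 40
    generate int id = _n
    generate double u = rnormal()               // person-level random intercept
    expand 29
    bysort id: generate byte studyday = _n
    expand 2
    bysort id studyday: generate byte method = _n
    generate byte liver = runiform() < invlogit(u)

    * Linear model: mixed is the current name for xtmixed
    mixed liver i.method || id:, reml
    estat icc

    * Logistic model: melogit is the current name for xtmelogit
    melogit liver i.method || id:
    estat icc
    ```

    The two ICCs need not coincide, because the first is computed on the observed 0/1 scale and the second on the latent logistic scale.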

    Best
    Daniel


    • #3
      Ah, this is fantastic, Daniel, thank you! Didn't know I could specify i(id) within kappaetc, so I apologize for not reading your help file closely enough. In any case, thank you again for all of the help you offer on here! I've benefited from reading quite a few of your answers by now!


      • #4
        Originally posted by Tyler Wray
        I'm trying to figure out the best way to assess day-level agreement
        . . . to what extent do . . . methods agree on a given day.
        So, wouldn't you want to put day in, too?
        Code:
        version 15.1
        
        clear *
        
        set seed `=strreverse("1483418")'
        quietly set obs 250
        generate int pid = _n
        generate double pid_u = rnormal()
        
        quietly expand 29
        bysort pid: generate byte day = _n
        
        quietly expand 2
        bysort pid day: generate byte mtd = _n
        
        generate double lat = 0 * mtd + 0 * day + ///
            pid_u + rnormal() // ICC = 0.5
        
        // Latent
        mixed lat i.mtd i.day || pid: , reml nolrtest nolog
        estat icc
        
        // Manifest
        generate byte mft = lat > 0
        meprobit mft i.mtd i.day || pid: , nolrtest nolog
        estat icc
        
        exit
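
        For reference, the "ICC = 0.5" comment in the simulation follows from the two variance components, each of which equals 1 because both pid_u and the residual are drawn with rnormal(). The symbols \sigma_u^2 (between-person variance) and \sigma_\varepsilon^2 (residual variance) are notation introduced here, not part of the code:

        ```latex
        \[
        \rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2}
             = \frac{1}{1 + 1} = 0.5
        \]
        ```

        Because mft is defined as lat > 0 and the residual is standard normal, the probit link matches the data-generating process: meprobit fixes the latent residual variance at 1, so estat icc after meprobit targets the same 0.5 on the latent scale. With a logit link the fixed residual variance would be \pi^2/3 instead.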
