  • test-retest reliability - what effects are fixed?

    Hi there,
    I am checking that I have correctly specified the variables for test-retest reliability.

    I have participants (ID) who took a questionnaire with three survey instruments on two separate occasions. I'm looking to compare score agreement across time between the three instruments.


    Here's my data in long format:

    Code:
    input int ID byte(time can) float(pwi who)
    211 1 7 45 12
    211 2 8 48 9
    212 1 7 52 18
    212 2 8 56 20
    213 1 7 52 17
    213 2 8 57 16
    214 1 7 49 8
    214 2 7 52 5
    215 1 7 56 19
    215 2 . . .
    216 1 9 59 22
    216 2 8 62 20
    217 1 6 46 16
    217 2 7 48 13
    218 1 . . .
    218 2 9 67 21
    219 1 7 50 10
    219 2 6 57 10
    220 1 7 56 20
    220 2 7 56 17
    222 1 7 59 8
    222 2 8 58 12
    223 1 . . .
    223 2 9 61 24
    224 1 7 45 11
    224 2 . . .
    225 1 7 58 16
    225 2 7 55 15
    226 1 8 55 16
    226 2 8 58 12
    227 1 6 54 20
    227 2 7 58 22
    228 1 5 44 13
    228 2 8 47 15
    229 1 7 49 13
    229 2 7 51 15
    230 1 9 60 18
    230 2 9 61 20
    231 1 8 65 20
    231 2 9 58 .
    232 1 8 55 17
    232 2 7 60 19
    233 1 . . .
    233 2 7 52 15
    234 1 6 41 14
    234 2 5 44 13
    235 1 5 46 13
    235 2 6 49 13
    236 1 6 50 11
    236 2 7 48 12
    237 1 7 55 19
    237 2 7 58 18
    238 1 8 57 19
    238 2 9 60 18
    239 1 7 55 17
    239 2 6 53 16
    240 1 9 67 21
    240 2 10 58 25
    241 1 7 48 16
    241 2 8 51 17
    242 1 8 66 23
    242 2 9 64 22
    243 1 . . .
    243 2 . 57 17
    244 1 6 44 14
    244 2 7 52 17
    245 1 7 53 13
    245 2 6 47 14
    end

    To test the reliability of one survey instrument, can, I ran:
    Code:
    icc can ID time, mixed absolute
    (5 targets omitted from computation because not rated by all raters)


    But this model treats timepoint as the fixed effect, rather than reflecting that the same participants are retaking the same survey.


    output:

    Code:
    Intraclass correlations
    Two-way mixed-effects model
    Absolute agreement
    
    Random effects: ID               Number of targets =        28
     Fixed effects: time             Number of raters  =         2
    
    --------------------------------------------------------------
                       can |        ICC       [95% conf. interval]
    -----------------------+--------------------------------------
                Individual |   .5894074       .2834621     .786455
                   Average |   .7416694       .4417148    .8804644
    --------------------------------------------------------------
    F test that
      ICC=0.00: F(27.0, 27.0) = 4.25              Prob > F = 0.000
    Q1: In the output, do Individual and Average refer to the correlation between timepoints within an individual and between individuals within timepoints, respectively?
    Q2: is there a way to specify the fixed effects for ID using icc?
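
    One relationship worth noting: for k raters (here, k = 2 timepoints), the Average ICC reported by -icc- is the Spearman-Brown step-up of the Individual ICC, i.e. the reliability of the mean of k ratings. A minimal Python sketch, using the coefficients from the output above, confirms this:

    ```python
    # Spearman-Brown prophecy formula: step a single-rating ICC up to the
    # reliability of the mean of k ratings.
    def average_icc(individual_icc: float, k: int) -> float:
        """Reliability of the mean of k ratings, given the single-rating ICC."""
        return k * individual_icc / (1 + (k - 1) * individual_icc)

    icc_individual = 0.5894074  # "Individual" ICC from the -icc- output above
    icc_average = average_icc(icc_individual, k=2)
    print(round(icc_average, 4))  # 0.7417, matching the reported "Average" ICC
    ```

    The same formula can be applied to any single-rating ICC to report an average-rating coefficient.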

    Because I've seen it recommended, I've also tried -kappaetc- with its icc option. Here is the same data reshaped to wide format:

    Code:
    input int ID byte(can1 can2) float(pwi1 pwi2 who1 who2)
    237 7 7 55 58 19 18
    230 9 9 60 61 18 20
    224 7 . 45 . 11 .
    216 9 8 59 62 22 20
    245 7 6 53 47 13 14
    211 7 8 45 48 12 9
    239 7 6 55 53 17 16
    234 6 5 41 44 14 13
    238 8 9 57 60 19 18
    241 7 8 48 51 16 17
    236 6 7 50 48 11 12
    232 8 7 55 60 17 19
    242 8 9 66 64 23 22
    219 7 6 50 57 10 10
    240 9 10 67 58 21 25
    212 7 8 52 56 18 20
    215 7 . 56 . 19 .
    231 8 9 65 58 20 .
    227 6 7 54 58 20 22
    213 7 8 52 57 17 16
    225 7 7 58 55 16 15
    233 . 7 . 52 . 15
    244 6 7 44 52 14 17
    214 7 7 49 52 8 5
    218 . 9 . 67 . 21
    223 . 9 . 61 . 24
    222 7 8 59 58 8 12
    229 7 7 49 51 13 15
    243 . . . 57 . 17
    228 5 8 44 47 13 15
    217 6 7 46 48 16 13
    235 5 6 46 49 13 13
    226 8 8 55 58 16 12
    220 7 7 56 56 20 17
    end


    Here I can see that the interrater reliability, ICC(3,1), is very similar to what the icc command found above.


    Code:
    kappaetc can1 can2 , icc(mixed) listwise
    
    Interrater reliability                          Number of subjects  =      28
    Two-way mixed-effects model                     Ratings per subject =       2
    ------------------------------------------------------------------------------
                   |   Coef.     F     df1     df2      P>F   [95% Conf. Interval]
    ---------------+--------------------------------------------------------------
          ICC(3,1) |  0.6193   4.25    27.00   27.00   0.000    0.3262     0.8037
    ---------------+--------------------------------------------------------------
           sigma_s |  0.8622
           sigma_e |  0.6760
    ------------------------------------------------------------------------------


    Q3: Is ID the fixed effect by default in the kappaetc, icc(mixed) estimate here?

    Q4: I would like to report both the individual and average ICC coefficients (assuming I've interpreted these correctly above). Is it possible to obtain the group average ICC using kappaetc (similar to the icc command output)?

  • #2
    Hi Lana,

    I think it might be easier to get what you want by using the xtreg and mixed commands. You have a situation where individuals are rating themselves through their responses to a set of items; we don't see those items, but your example data provide the sum scores. In a case like this, the ID variable captures both the target and the rater, so the best model for test-retest reliability here is the one-way random-effects model with a random intercept for ID. There are no "fixed effects" in this model. From the random-effect variances, we can calculate the ICC(1,1), which is a measure of test-retest reliability.
    Code:
    mixed can || ID: , reml    // use REML estimation because of small sample size
    estat icc    // ICC(1,1)
    
    Intraclass correlation
    
    ------------------------------------------------------------------------------
                           Level |        ICC   Std. err.     [95% conf. interval]
    -----------------------------+------------------------------------------------
                              ID |   .5786739   .1248276      .3348918    .7893159
    ------------------------------------------------------------------------------
    You asked about treating ID as a fixed effect instead. You can do that by using xtreg and specifying a fixed effects model. The rho parameter in the output, below, is the analogue to ICC(1,1) in the fixed effect model:
    Code:
    xtreg can, fe i(ID)
    
    Fixed-effects (within) regression               Number of obs     =         61
    Group variable: ID                              Number of groups  =         33
    
    R-squared:                                      Obs per group:
         Within  =      .                                         min =          1
                                                                  avg =        1.8
                                                                  max =          2
    
                                                    F(0, 28)          =       0.00
    corr(u_i, Xb) =      .                          Prob > F          =          .
    
    ------------------------------------------------------------------------------
             can | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           _cons |   7.311475   .0921383    79.35   0.000     7.122739    7.500212
    -------------+----------------------------------------------------------------
         sigma_u |  1.0037807
         sigma_e |  .71962292
             rho |  .66051791   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0: F(32, 28) = 3.54                      Prob > F = 0.0005
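    As a sanity check, rho in the xtreg output is simply the share of total variance due to the ID-level effect, sigma_u^2 / (sigma_u^2 + sigma_e^2). A quick Python sketch using the values printed above:

    ```python
    # rho (fraction of variance due to u_i) from the xtreg output is the
    # between-ID variance share: rho = sigma_u^2 / (sigma_u^2 + sigma_e^2)
    sigma_u = 1.0037807   # sd of the ID-level effect, from the xtreg output
    sigma_e = 0.71962292  # residual sd, from the xtreg output

    rho = sigma_u**2 / (sigma_u**2 + sigma_e**2)
    print(round(rho, 4))  # 0.6605, matching the reported rho
    ```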
    Note that you do not get a confidence interval for rho whereas you do with estat icc after mixed. In terms of the individual and average ICC coefficients, I don't see the need for anything but the individual ICC(1,1) in your case. The code for obtaining the ICC(1,1) in kappaetc is the following:
    Code:
    * First reshape the data to wide
    reshape wide can pwi who, i(ID) j(time)
    
    kappaetc can1 can2 , icc(oneway)
    
    Interrater reliability                           Number of subjects =      33
    One-way random-effects model               Ratings per subject: min =       1
                                                                    avg =  1.8485
                                                                    max =       2
    ------------------------------------------------------------------------------
                   |   Coef.     F     df1     df2      P>F   [95% Conf. Interval]
    ---------------+--------------------------------------------------------------
          ICC(1,1) |  0.5789   3.54    32.00   28.00   0.001    0.2708     0.7726
    ---------------+--------------------------------------------------------------
           sigma_s |  0.8438
           sigma_e |  0.7196
    ------------------------------------------------------------------------------
    Note: F test and confidence intervals are based on methods for complete data.
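    The ICC(1,1) here is the same kind of variance ratio, now built from the sigma_s and sigma_e reported by kappaetc. Checking it numerically with the printed values (Python sketch):

    ```python
    # ICC(1,1) under the one-way random-effects model is the share of total
    # variance attributable to subjects (IDs):
    #   ICC(1,1) = sigma_s^2 / (sigma_s^2 + sigma_e^2)
    sigma_s = 0.8438  # subject (ID) sd, from the kappaetc output
    sigma_e = 0.7196  # residual sd, from the kappaetc output

    icc_1_1 = sigma_s**2 / (sigma_s**2 + sigma_e**2)
    print(round(icc_1_1, 4))  # 0.5789, matching the reported ICC(1,1)
    ```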
    Last edited by Erik Ruzek; 13 May 2025, 11:30. Reason: Clarified fixed vs random effect.



    • #3
      Thanks, Erik, for your thorough reply! I was reading more about the two-way and one-way models. Koo et al. (2016) has been recommended; they say it's important to use two-way models for test-retest reliability, and fixed effects when a sample was not randomly selected. That said, the mixed- and fixed-effects model equations they present are identical. Really appreciate the clarification and output interpretation you provided!

