
  • Problems with ICC analysis in Stata

    I need to calculate the ICC (absolute agreement) between the step counts from physical activity monitors (trackers) and the actual steps taken. I can find valid arguments for both ICC(1,1) and ICC(2,1), but I am leaning towards the two-way random-effects, absolute-agreement, single-rater model, ICC(2,1).

    The actual steps taken were visually counted (averaged between two counters) and defined as the criterion.
    Each subject walked for six minutes wearing two monitors; n = 103 subjects, giving 206 monitor measurements.

    My data looks like this:

    monitor_nr: monitor 1 and monitor 2
    ID: subject ID
    Criterion_target: actual steps taken
    monitor_measure: monitor measured steps

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte monitor_nr int ID double Criterion_target int monitor_measure
    1   1   486 450
    1   2   489 457
    1   3   533 551
    1   4   706 682
    1   5   677 694
    1   6   635 665
    1   7   482  55
    1   8   635 590
    1   9   582 554
    1  10   517 512
    1  11   624 672
    1  12   555 612
    1  13   532  94
    1  14   643 652
    1  15   520 493
    1  16   612 612
    1  17   656 671
    1  18   591 599
    1  19   589 523
    1  20   619 621
    1  21   586 593
    1  22   639 650
    1  23   427 346
    1  24 560.5 558
    1  25   424   0
    1  26   638 620
    1  27   645 657
    1  28   229  50
    1  29   490 498
    1  30   592 579
    1  31   655 664
    1  32   584 562
    1  33   529 540
    1  34   570 477
    1  35   661 551
    1  36   518 472
    1  37   569 602
    1  38   632 626
    1  39   397 125
    1  40   451 185
    1  41   449 369
    1  42   680 694
    1  43   551 580
    1  44   589 600
    1  45   574 576
    1  46   630 646
    1  47   521 517
    1  48   527 555
    1  49   633 631
    1  50   658 640
    1  51   607 576
    1  52   576 590
    1  53   636 655
    1  54   589 583
    1  55   593 604
    1  56   593 568
    1  57   505 516
    1  58   217   0
    1  59   603 590
    1  60   513 466
    1  61   455 468
    1  62   637 681
    1  63   437 387
    1  64   435 365
    1  65   680 706
    1  66   467 445
    1  67   581 541
    1  68   624 590
    1  69   387  53
    1  70   611 618
    1  71   365   0
    1  72   642 649
    1  73   669 679
    1  74   639 668
    1  75   689 724
    1  76   640 640
    1  77   767 803
    1  78   520 502
    1  79   313  99
    1  80   506 447
    1  81   531 524
    1  82   641 670
    1  83 664.5 659
    1  84   676 713
    1  85   498 511
    1  86   601 562
    1  87   458 349
    1  88   510 462
    1  89   697 706
    1  90   575 598
    1  91   528 126
    1  92   398 331
    1  93   467 416
    1  94   674 709
    1  95 315.5 595
    1  96   427  34
    1  97   550 609
    1  98   648 649
    1  99   802 832
    1 100   294  32
    1 101   537 386
    1 102   478  88
    1 103   465 372
    2   1   486 462
    2   2   489 483
    2   3   533 527
    2   4   706 710
    2   5   677 690
    2   6   635 623
    2   7   482 388
    2   8   635 595
    2   9   582 561
    2  10   517 508
    2  11   624 624
    2  12   555 561
    2  13   532 481
    2  14   643 649
    2  15   520 502
    2  16   612 638
    2  17   656 665
    2  18   591 597
    2  19   589 565
    2  20   619 611
    2  21   586 592
    2  22   639 623
    2  23   427 351
    2  24 560.5 584
    2  25   424 414
    2  26   638 640
    2  27   645 647
    2  28   229   0
    2  29   490 485
    2  30   592 427
    2  31   655 653
    2  32   584 586
    2  33   529 541
    2  34   570 521
    2  35   661 508
    2  36   518 488
    2  37   569 584
    2  38   632 632
    2  39   397 367
    2  40   451 345
    2  41   449 460
    2  42   680 706
    2  43   551 593
    2  44   589 605
    2  45   574 613
    2  46   630 611
    2  47   521 525
    2  48   527 567
    2  49   633 642
    2  50   658 653
    2  51   607 588
    2  52   576 577
    2  53   636 644
    2  54   589 566
    2  55   593 578
    2  56   593 581
    2  57   505 496
    2  58   217 237
    2  59   603 600
    2  60   513 503
    2  61   455 396
    2  62   637 725
    2  63   437 433
    2  64   435 405
    2  65   680 712
    2  66   467 266
    2  67   581 501
    2  68   624 587
    2  69   387 332
    2  70   611 636
    2  71   365 120
    2  72   642 643
    2  73   669 671
    2  74   639 656
    2  75   689 740
    2  76   640 633
    2  77   767 774
    2  78   520 445
    2  79   313  61
    2  80   506 512
    2  81   531 313
    2  82   641 664
    2  83 664.5 651
    2  84   676 704
    2  85   498 441
    2  86   601 620
    2  87   458 259
    2  88   510 368
    2  89   697 674
    2  90   575 682
    2  91   528 506
    2  92   398 435
    2  93   467 443
    2  94   674 642
    2  95 315.5 596
    2  96   427 250
    2  97   550 610
    2  98   648 616
    2  99   802 845
    2 100   294  36
    2 101   537 572
    2 102   478 424
    2 103   465 315
    end
    ------------------ copy up to and including the previous line ------------------


    I have tried this command for the one-way model:

    Code:
    . icc monitor_measure Criterion_target
    (92 targets omitted from computation because of unbalanced data)
    insufficient number of targets
    And this command for the two-way model:

    Code:
    . icc monitor_measure Criterion_target monitor_nr
    multiple observations per target and rater not allowed


    Can anybody help me with this?

    Thank you all for your help

  • #2
    This question seems very similar to the one that we have covered at length here.

    Could you elaborate on the difference and what new problems are caused now?

    When you have repeated measurements, i.e., 103 subjects each measured twice, Stata's icc command cannot help you. You could set up the respective mixed model (see the sketch further below), or you can use kappaetc (SSC). I think you want

    Code:
    sort ID
    kappaetc monitor_measure Criterion_target , icc(random) i(ID)
    or, for the one-way model

    Code:
    sort ID
    kappaetc monitor_measure Criterion_target , icc(oneway) i(ID)
    Caution: Note that the current version of kappaetc requires you to sort the data on the identifier for repeated measures. This is due to a suspected bug in Mata's panelsetup() that I reported here but never got an answer to, and then forgot about. I will send an ad-hoc fixed version to Kit Baum for upload to SSC.
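
    For reference, a minimal sketch of the mixed-model route mentioned above (an illustration only, ignoring the repeated-measures wrinkle for simplicity; the generated names target, rater, and rating are made up):

    Code:
    * stack the criterion and the monitor reading as two "raters" per row
    generate long target = _n
    rename Criterion_target rating1
    rename monitor_measure rating2
    reshape long rating, i(target) j(rater)
    * crossed random effects for target and rater; the factor with few
    * levels (rater) goes in the _all equation, target is the nesting level
    mixed rating || _all: R.rater || target: , reml
    * ICC(2,1) = var(target) / [var(target) + var(rater) + var(residual)]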

    Best
    Daniel
    Last edited by daniel klein; 09 Aug 2018, 02:53.



    • #3
      Hi Daniel,
      I am sorry for creating two posts about the same topic.
      Thank you for the help back then. Now I need ICC(2,1) (for absolute agreement).

      I've tried the following (using the code from the previous post), but I cannot get Stata's icc command to work.

      Code:
      . generate target = _n
      
      . reshape long monitor_measure, i(target) j(Criterion_target)
      variable Criterion_target already exists
          Data are already long.
      r(110);
      
      . icc monitor_measure target Criterion_target
      (206 targets omitted from computation because not rated by all raters)
      no observations
      r(2000);


      I can get the two-way random-effects model using kappaetc, but how can I be sure it uses the absolute-agreement model and not the consistency model?

      Code:
       kappaetc Criterion_target monitor_measure, icc(random)
      
      Interrater reliability                           Number of subjects =     206
      Two-way random-effects model                    Ratings per subject =       2
      ------------------------------------------------------------------------------
                     |   Coef.     F     df1     df2      P>F   [95% Conf. Interval]
      ---------------+--------------------------------------------------------------
            ICC(2,1) |  0.7277   7.00   205.00  205.00   0.000    0.6233     0.8012
      ---------------+--------------------------------------------------------------
             sigma_s |127.7616
             sigma_r | 25.8916
             sigma_e | 73.7347
      ------------------------------------------------------------------------------
      Thank you for your help.



      • #4
        Originally posted by Rasmus Tolstrup:
        I can get the two-way random-effects model using kappaetc, but how can I be sure it uses the absolute-agreement model and not the consistency model?
        Hm, the literature that I have used does not really get into the distinction between absolute agreement and consistency. I believe this is because there are only three estimators for five coefficients. You can read more about this in the Stata manual on icc, and I have discussed it here.

        Perhaps I should add a paragraph on this in the help file, now that I will have to send a revised version anyway.
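
        In the meantime, a rough way to see the difference (a sketch that assumes the stored-result names sigma2_s, sigma2_r, and sigma2_e that kappaetc leaves in r(), as used further down this thread): under absolute agreement the rater variance enters the denominator; under consistency it does not.

        Code:
        * run directly after: kappaetc Criterion_target monitor_measure, icc(random)
        display "ICC (agreement)   = " r(sigma2_s)/(r(sigma2_s)+r(sigma2_r)+r(sigma2_e))
        display "ICC (consistency) = " r(sigma2_s)/(r(sigma2_s)+r(sigma2_e))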

        Note that you are ignoring the repeated measurements in your code above. Taking them into account yields

        Code:
        . sort ID
        
        . kappaetc Criterion_target monitor_measure , icc(random) i(ID)
        
        Inter/Intrarater reliability                     Number of subjects =     103
        Two-way random-effects model                    Ratings per subject =       4
                                                     Replicates per subject =       2
        ------------------------------------------------------------------------------
                       |   Coef.     F     df1     df2      P>F   [95% Conf. Interval]
        ---------------+--------------------------------------------------------------
              ICC(2,1) |  0.7287   9.26   102.00  102.00   0.000    0.6141     0.8115
            ICC_a(2,1) |  0.8688  14.25   115.27  206.00   0.000    0.8243     0.9040
        ---------------+--------------------------------------------------------------
               sigma_s |128.0744
               sigma_r | 25.6554
              sigma_sr | 49.9644
               sigma_e | 54.3373
        ------------------------------------------------------------------------------
        where ICC_a is the intra-rater reliability.
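
        As a plausibility check (my addition; it assumes kappaetc also returns the interaction component as r(sigma2_sr)), both coefficients can be reproduced from the displayed variance components: squaring the sigmas gives 16403/(16403 + 658 + 2496 + 2953) ≈ 0.7287 and (16403 + 658 + 2496)/22510 ≈ 0.8688.

        Code:
        * sigma2_sr is a guess at the returned name of the interaction term
        display "ICC(2,1)   = " r(sigma2_s)/(r(sigma2_s)+r(sigma2_r)+r(sigma2_sr)+r(sigma2_e))
        display "ICC_a(2,1) = " (r(sigma2_s)+r(sigma2_r)+r(sigma2_sr))/(r(sigma2_s)+r(sigma2_r)+r(sigma2_sr)+r(sigma2_e))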

        [Edit]

        Come to think of it, the repeated measures might be misleading because the "true" count is always the same for the two measures. Perhaps treating the 2 × 103 measures as 206 independent trials is indeed the better approach here. I lack the time right now to figure this out conceptually.

        Best
        Daniel
        Last edited by daniel klein; 09 Aug 2018, 03:29.



        • #5
          Hi Daniel,

          Thank you so much for your help.
          So basically, in your opinion, the code below will calculate ICC(2,1) as a measure of the criterion validity of the step counters?

          Code:
           kappaetc Criterion_target monitor_measure , icc(random)
          Best regards,
          Rasmus



          • #6
            I am not sure I totally understand your research question or the design, but in principle: yes.

            If the two devices are of different types, or worn on the right and left arm, or whatever, you might also want to look at the two devices separately. If so, that would be

            Code:
            . bysort monitor_nr : kappaetc Criterion_target monitor_measure , icc(random)
            
            -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
            -> monitor_nr = 1
            
            Interrater reliability                           Number of subjects =     103
            Two-way random-effects model                    Ratings per subject =       2
            ------------------------------------------------------------------------------
                           |   Coef.     F     df1     df2      P>F   [95% Conf. Interval]
            ---------------+--------------------------------------------------------------
                  ICC(2,1) |  0.6712   5.71   102.00  102.00   0.000    0.5110     0.7790
            ---------------+--------------------------------------------------------------
                   sigma_s |134.6914
                   sigma_r | 34.3120
                   sigma_e | 87.8162
            ------------------------------------------------------------------------------
            
            -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
            -> monitor_nr = 2
            
            Interrater reliability                           Number of subjects =     103
            Two-way random-effects model                    Ratings per subject =       2
            ------------------------------------------------------------------------------
                           |   Coef.     F     df1     df2      P>F   [95% Conf. Interval]
            ---------------+--------------------------------------------------------------
                  ICC(2,1) |  0.8152  10.61   102.00  102.00   0.000    0.7255     0.8754
            ---------------+--------------------------------------------------------------
                   sigma_s |121.0963
                   sigma_r | 16.5335
                   sigma_e | 55.2449
            ------------------------------------------------------------------------------
            Or, say you were interested in the agreement between the two devices, then one approach could be

            Code:
            . drop Criterion_target
            
            . reshape wide monitor_measure , i(ID) j(monitor_nr)
            (note: j = 1 2)
            
            Data                               long   ->   wide
            -----------------------------------------------------------------------------
            Number of obs.                      206   ->     103
            Number of variables                   3   ->       3
            j variable (2 values)        monitor_nr   ->   (dropped)
            xij variables:
                                    monitor_measure   ->   monitor_measure1 monitor_measure2
            -----------------------------------------------------------------------------
            
            . kappaetc monitor_measure1 monitor_measure2 , icc(random)
            
            Interrater reliability                           Number of subjects =     103
            Two-way random-effects model                    Ratings per subject =       2
            ------------------------------------------------------------------------------
                           |   Coef.     F     df1     df2      P>F   [95% Conf. Interval]
            ---------------+--------------------------------------------------------------
                  ICC(2,1) |  0.8176  10.39   102.00  102.00   0.000    0.7382     0.8740
            ---------------+--------------------------------------------------------------
                   sigma_s |162.6695
                   sigma_r | 16.3874
                   sigma_e | 75.0769
            ------------------------------------------------------------------------------
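
            If I am not mistaken, Stata's built-in icc can also answer this device-versus-device question directly on the original long data, since each (ID, monitor_nr) cell then holds exactly one observation (my note, untested here):

            Code:
            * ID as target, monitor_nr as rater; the random option requests
            * the two-way random-effects model
            icc monitor_measure ID monitor_nr, random
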
            Best
            Daniel
            Last edited by daniel klein; 10 Aug 2018, 08:23.



            • #7
              Hi Daniel,

              Thank you for all your help!

              Best



              • #8
                Hi Daniel (or anyone else),

                I am still having some trouble with the ICCs.

                I have chosen ICC(2,1), two-way random, single measures, absolute agreement, to represent the agreement between a criterion measure of steps taken (Truestep) and a device-monitored count.

                I am using this code
                Code:
                kappaetc var Truestep, icc(random)

                My data looks like this:

                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input double Truestep int(Vivohip Vivohand) str3 Shinehip str7(Shinehand Uphip Uphand)
                  517 512 647 "634" "236"     "618" "53"     
                  674 709 543 "580" "166"     "499" "0"      
                  387 332  58 "502" "142"     "175" "0"      
                  576 577  15 "774" "Notused" "236" "23"     
                  229  50 643 "616" "578"     "646" "578"    
                  489 457 436 "236" "94"      "328" "0"      
                  217 237  16 "."   "18"      "517" "Notused"
                  527 567 790 "."   "Notused" "750" "740"    
                  482  55  15 "636" "50"      "458" "0"      
                  486 462  41 "78"  "18"      "496" "Notused"
                  532  94 734 "."   "664"     "689" "594"    
                  521 525 807 "18"  "784"     "817" "Notused"
                  656 671   0 "620" "34"      "0"   "0"      
                664.5 651 607 "656" "Notused" "568" "585"    
                  619 621  18 "."   "32"      "323" "0"      
                  435 405   0 "512" "4"       "483" "0"      
                  635 595 283 "564" "Notused" "331" "0"      
                  458 349 537 "."   "Notused" "530" "542"    
                  643 649  11 "300" "92"      "47"  "0"      
                  518 488   7 "."   "Notused" "0"   "0"      
                  517 508   9 "562" "Notused" "46"  "0"      
                  661 508 609 "."   "Notused" "623" "566"    
                  449 460 532 "530" "662"     "641" "591"    
                315.5 595 658 "."   "Notused" "621" "620"    
                  498 441  17 "470" "6"       "407" "0"      
                  706 682 400 "."   "350"     "171" "318"    
                  656 665 677 "560" "572"     "535" "597"    
                  611 618 609 "."   "Notused" "628" "369"    
                  581 541 603 "600" "452"     "576" "615"    
                  640 640 586 "598" "556"     "554" "570"    
                  533 551 362 "638" "464"     "0"   "380"    
                  680 694 566 "."   "496"     "573" "378"    
                  575 682   0 "430" "26"      "424" "0"      
                  451 345 650 "534" "630"     "624" "629"    
                  582 554 714 "180" "702"     "702" "Notused"
                  612 638 491 "."   "Notused" "457" "395"    
                  582 561 622 "56"  "534"     "583" "Notused"
                  435 365 599 "604" "464"     "443" "420"    
                  574 576   9 "610" "6"       "43"  "0"      
                  619 611 601 "."   "Notused" "600" "595"    
                  551 593 241 "."   "Notused" "633" "533"    
                  669 671  34 "668" "Notused" "0"   "0"      
                  398 331 560 "."   "Notused" "521" "550"    
                  648 616  13 "640" "Notused" "0"   "Notused"
                  661 551 607 "."   "508"     "617" "581"    
                  689 740   6 "664" "Notused" "0"   "0"      
                  513 503 636 "520" "604"     "636" "551"    
                  592 579  49 "178" "32"      "453" "Notused"
                  510 462 502 "588" "456"     "493" "509"    
                  639 623 588 "560" "Notused" "523" "461"    
                  593 578 579 "522" "562"     "588" "618"    
                  658 640 672 "."   "664"     "645" "650"    
                  551 580   0 "."   "12"      "588" "35"     
                664.5 659 560 "."   "Notused" "578" "446"    
                  642 643 603 "500" "544"     "574" "598"    
                  455 468   0 "."   "Notused" "372" "0"      
                  674 642  20 "418" "10"      "0"   "0"      
                  607 576   0 "."   "54"      "0"   "0"      
                  601 620   0 "450" "22"      "365" "0"      
                  424 414 703 "."   "Notused" "687" "669"    
                  593 604 653 "610" "534"     "618" "625"    
                  603 600  11 "698" "Notused" "0"   "13"     
                  706 710 662 "."   "Notused" "709" "651"    
                  520 493 608 "622" "630"     "576" "314"    
                  467 443 615 "420" "462"     "617" "570"    
                  655 653 706 "."   "Notused" "698" "674"    
                  458 259 820 "648" "Notused" "958" "819"    
                  767 803 716 "."   "Notused" "676" "687"    
                  518 472  40 "."   "120"     "667" "0"      
                  397 367   0 "."   "Notused" "511" "0"      
                  467 416 542 "580" "492"     "594" "473"    
                  465 315 682 "412" "672"     "670" "639"    
                  520 445 515 "662" "Notused" "0"   "415"    
                  513 466   0 "608" "186"     "656" "32"     
                  570 477 560 "616" "468"     "572" "458"    
                  658 653 208 "."   "Notused" "0"   "0"      
                  427 250 632 "362" "656"     "636" "643"    
                  528 506   0 "424" "10"      "311" "0"      
                  586 592  11 "."   "Notused" "365" "0"      
                  550 609 500 "."   "Notused" "641" "656"    
                  387  53  17 "600" "26"      "638" "0"      
                  636 644  41 "."   "22"      "0"   "Notused"
                  697 674 671 "370" "552"     "653" "0"      
                  531 524   0 "593" "18"      "591" "0"      
                  669 679 335 "."   "Notused" "20"  "20"     
                  680 712  24 "680" "Notused" "0"   "0"      
                  624 672 141 "."   "46"      "29"  "0"      
                  478  88 378 "572" "Notused" "339" "88"     
                  637 681 695 "."   "Notused" "671" "689"    
                  482 388 227 "564" "Notused" "0"   "0"      
                  449 369 642 "612" "542"     "664" "626"    
                  398 435  42 "646" "Notused" "0"   "0"      
                  680 706 644 "."   "Notused" "650" "672"    
                  633 642 527 "."   "Notused" "498" "510"    
                  632 626 478 "."   "312"     "527" "60"     
                  589 583 603 "."   "534"     "440" "298"    
                  676 704 651 "653" "Notused" "507" "601"    
                  574 613 580 "524" "524"     "581" "472"    
                  467 445 615 "602" "592"     "627" "608"    
                  592 427 641 "40"  "586"     "652" "Notused"
                end

                I am excluding "zero counts" with the code below:

                Code:
                * run once per device, substituting the device variable for var
                destring var, replace force   // nonnumeric entries such as "Notused" become missing
                replace var = . if var == 0   // treat zero counts as missing
                drop if var == .
                kappaetc var Truestep, icc(random)
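
                A sketch of how these per-device runs could be automated (my illustration; preserve/restore keeps one device's zero-exclusion from shrinking the sample for the next):

                Code:
                foreach v of varlist Vivohip Vivohand Shinehip Shinehand Uphip Uphand {
                    preserve
                    capture destring `v', replace force   // no-op if already numeric
                    drop if `v' == 0 | missing(`v')       // exclude zero counts
                    display _newline "==> `v'"
                    kappaetc Truestep `v', icc(random)
                    restore
                }
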
                The ICCs come out as follows:

                Shinehip: 0.11
                Vivohip: 0.73
                Uphip: 0.00

                How is it possible to get an ICC of 0.00 for Uphip?
                The mean difference and limits of agreement are a bit worse, admittedly, but I still cannot see how a zero is possible.
                The calculations for Shinehand, Vivohand, and Uphand are even worse.

                Any idea on what I am doing wrong?

                Thanks in advance and I appreciate any form of help.

                Rasmus



                • #9
                  I am not well-versed in how kappa works on continuous outcomes. I can understand correlation, though.

                  Code:
                  twoway scatter Truestep Vivohip
                  twoway scatter Truestep Uphip
                  corr Truestep Vivohip Uphip
                  The true number of steps correlates very strongly with the measure from Vivohip. It has a very, very low (and negative!) correlation with Uphip. An ICC of 0 seems consistent with that pattern of data. So does the first scatterplot (especially in comparison with the second).
                  Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

                  When presenting code or results, please use code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



                  • #10
                    Originally posted by Rasmus Tolstrup:
                    The ICCs come out as follows:

                    Shinehip: 0.11
                    Vivohip: 0.73
                    Uphip: 0.00

                    How is it possible to get an ICC of 0.00 for Uphip?
                    The mean difference and limits of agreement are a bit worse, admittedly, but I still cannot see how a zero is possible.
                    The calculations for Shinehand, Vivohand, and Uphand are even worse.

                    Any idea on what I am doing wrong?
                    Starting with the last question, I do not think that you are doing anything wrong in a statistical sense. Some of these devices just seem to be unreliable (in the sense of the ICC). I also agree with Weiwen's observation that the results are consistent with the data. Weiwen's graphical approach is the preferred one, but assuming that the first five observations are representative of the dataset, I see

                    Code:
                         +----------------------------------------------------------------------+
                         | Truestep   Vivohip   Vivohand   Shinehip   Shineh~d   Uphip   Uphand |
                         |----------------------------------------------------------------------|
                      1. |      517       512        647        634        236     618       53 |
                      2. |      674       709        543        580        166     499        0 |
                      3. |      387       332         58        502        142     175        0 |
                      4. |      576       577         15        774          .     236       23 |
                      5. |      229        50        643        616        578     646      578 |
                         +----------------------------------------------------------------------+
                    The smallest difference between Truestep and Uphip is 100 steps, when there are, on average, only 480 steps! Also, single measurements, such as counting more than 600 steps when there are only just over 200, do not give the impression of a reliable device. I therefore do not really agree with you when you say that "[...] the mean difference [...] are a bit worse"; I would say that a mean difference of more than 100 steps, given an average of about 550 steps, is quite large.
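
                    For concreteness, a sketch of the mean difference and limits of agreement being referred to (my illustration, Bland-Altman style; diff is a made-up variable name):

                    Code:
                    generate diff = Uphip - Truestep
                    quietly summarize diff
                    display "mean difference = " r(mean)
                    display "95% limits of agreement: " r(mean)-1.96*r(sd) " to " r(mean)+1.96*r(sd)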

                    The (negative) ICC estimate is indeed consistent with the negative correlation that Weiwen shows. Technically, the ICC is negative if the between-subject variance is very small compared to the within-subject (between-rater/device) variance and the error variance. kappaetc truncates the lower limit of the ICC at 0 by replacing negative variance estimates with 0. This is indicated in the output

                    Code:
                    . kappaetc Truestep Uphip , icc(random)
                    
                    Interrater reliability                           Number of subjects =     100
                    Two-way random-effects model                    Ratings per subject =       2
                    ------------------------------------------------------------------------------
                                   |   Coef.     F     df1     df2      P>F   [95% Conf. Interval]
                    ---------------+--------------------------------------------------------------
                          ICC(2,1) |  0.0000   0.77    99.00   99.00   0.908    0.0000     0.0567
                    ---------------+--------------------------------------------------------------
                           sigma_s |  0.0000 (replaced)
                           sigma_r | 76.2527
                           sigma_e |204.4104
                    ------------------------------------------------------------------------------
                    Confidence interval is clipped at the lower limit.
                    You can estimate the ICC that Stata will give you

                    Code:
                    . display "ICC = " r(sigma2_s)/(r(sigma2_s)+r(sigma2_r)+r(sigma2_e))
                    ICC = -.11497974
                    To sum up, no aspect of the data at hand supports the hypothesis of a reliable device except for the Vivohip.

                    Concerning Weiwen's implied question, the (Cohen's) kappa coefficient used with quadratic weights approaches the ICC, as can be seen with kappaetc.

                    Code:
                    . kappaetc Truestep Vivohip , wgt(quadratic)
                    
                    Interrater agreement                             Number of subjects =     100
                    (weighted analysis)                             Ratings per subject =       2
                                                            Number of rating categories =     150
                    ------------------------------------------------------------------------------
                                         |   Coef.  Std. Err.    t    P>|t|   [95% Conf. Interval]
                    ---------------------+--------------------------------------------------------
                       Percent Agreement |  0.9817    0.0058 169.09   0.000     0.9702     0.9933
                    Brennan and Prediger |  0.7526    0.0787   9.57   0.000     0.5965     0.9087
                    Cohen/Conger's Kappa |  0.7176    0.0648  11.08   0.000     0.5890     0.8461
                        Scott/Fleiss' Pi |  0.7135    0.0670  10.65   0.000     0.5806     0.8464
                               Gwet's AC |  0.7571    0.0775   9.77   0.000     0.6033     0.9109
                    Krippendorff's Alpha |  0.7149    0.0670  10.67   0.000     0.5820     0.8478
                    ------------------------------------------------------------------------------
                    
                    . kappaetc Truestep Uphip , wgt(quadratic)
                    
                    Interrater agreement                             Number of subjects =     100
                    (weighted analysis)                             Ratings per subject =       2
                                                            Number of rating categories =     137
                    ------------------------------------------------------------------------------
                                         |   Coef.  Std. Err.    t    P>|t|   [95% Conf. Interval]
                    ---------------------+--------------------------------------------------------
                       Percent Agreement |  0.8963    0.0157  57.15   0.000     0.8652     0.9274
                    Brennan and Prediger | -0.7682    0.2673  -2.87   0.005    -1.0000    -0.2377
                    Cohen/Conger's Kappa | -0.1137    0.0536  -2.12   0.036    -0.2200    -0.0074
                        Scott/Fleiss' Pi | -0.2013    0.0632  -3.19   0.002    -0.3266    -0.0760
                               Gwet's AC | -0.6166    0.2014  -3.06   0.003    -1.0000    -0.2169
                    Krippendorff's Alpha | -0.1953    0.0632  -3.09   0.003    -0.3206    -0.0700
                    ------------------------------------------------------------------------------
                    Confidence intervals are clipped at the lower limit.
                    (Note that Brennan and Prediger's coefficient and Gwet's AC are not correctly estimated here, since the number of rating categories, if predetermined, is arguably larger than 137).

                    Best
                    Daniel
                    Last edited by daniel klein; 04 Oct 2018, 23:05.

