  • Error r(134) in roccomp

    Hi everyone.
    I am trying to compare the AUCs of two tests, y1 and y2. The outcome variable (true diagnosis) is d. Here is the data.
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int id byte d float(y1 y2)
      1 1 97.38156  97.77513
      2 0 94.54882  96.23346
      3 1 95.96194  96.88808
      4 1 97.37487   98.6192
      5 1 96.90598  98.96457
      6 1 97.74158  99.81046
      7 0 94.44688   96.3835
      8 1 96.91759  100.8302
      9 0   98.467  96.79236
     10 1  97.2261  99.45723
     11 1 99.53806 100.39708
     12 1 98.74596  98.60378
     13 1 96.04809   99.2196
     14 1 98.51422  99.31129
     15 1 97.31383  99.73836
     16 0 94.75156  96.97417
     17 0 94.19542  96.18056
     18 0 95.30164  96.71478
     19 1 95.78667 100.39079
     20 0 97.41148  95.08833
     21 1 95.67726  100.2699
     22 0 97.12939  97.31715
     23 1  97.3907  99.41492
     24 0 95.52853   97.0461
     25 1 96.75761  100.5035
     26 0 97.53485  94.27595
     27 0 95.81331  96.68634
     28 0 96.65748   97.2637
     29 1 97.63087  95.29101
     30 0 96.06285  96.19601
     31 0 94.70712  98.38924
     32 0 95.18496  97.19393
     33 1 98.05997  99.90926
     34 0 95.84086  95.10242
     35 0 96.52688  94.40955
     36 0 96.66033  96.91363
     37 1 98.81554  93.85376
     38 0 96.29855  98.48356
     39 1 96.61803  96.31312
     40 0 95.86975  97.12144
     41 0 94.13004  95.98446
     42 1 98.06756  100.0735
     43 0 96.71018  96.03593
     44 1  99.6264   98.5117
     45 1 97.72781  95.48792
     46 0  95.1233  96.65819
     47 0 94.98343  96.83074
     48 0 98.22735  97.79552
     49 1 97.98679 100.16672
     50 1 96.93982  97.46316
     51 1 97.67142  99.97642
     52 1 98.35726  99.37061
     53 0 97.21593  98.75429
     54 1 96.12872  96.21509
     55 0 93.90992  96.28863
     56 1 98.66654   95.8676
     57 0 95.51481  96.32138
     58 0 94.91967   97.3297
     59 1 95.38461  97.11926
     60 0 96.89056  98.73277
     61 1 98.89187  97.73084
     62 1 95.07276  96.39565
     63 1 99.12607  95.99722
     64 1 98.36385   98.8508
     65 0 95.37592  97.13871
     66 1 97.55559 102.80803
     67 1  97.7218  97.59555
     68 0 97.19405  95.34322
     69 1 98.51645   99.4153
     70 1 97.09347  96.32613
     71 0 95.92873  97.27116
     72 1 97.90533  96.61362
     73 1  96.1864  98.67364
     74 0 94.40533  96.75537
     75 0 96.17095   96.4313
     76 0 97.84145 100.12045
     77 0  95.6085  97.15792
     78 1  98.7226  98.37483
     79 1 98.30137 100.99195
     80 0 96.33913  96.68325
     81 0  97.2898  96.74976
     82 0 96.95675  98.62692
     83 1 95.87126  97.53024
     84 0 94.96371  98.83896
     85 0 96.48267  95.86141
     86 1 97.00181  98.60984
     87 0 94.53855  97.66732
     88 0 96.80022  98.43723
     89 0 96.18893  97.01451
     90 1 98.26919  101.9634
     91 1 97.15147 100.51557
     92 0 94.70248  96.46716
     93 1 97.59052  99.70178
     94 1 98.61035   96.0057
     95 1 97.50181  98.77481
     96 1  97.8288  100.3194
     97 0 95.44022  95.43479
     98 0 95.16041  97.44152
     99 0  96.6447  95.25821
    100 1 97.05354  99.16766
    end
    label values d pn
    label def pn 0 "No", modify
    label def pn 1 "Yes", modify


    When I run
    Code:
    roccomp d y1 y2, summary graph binormal
    it displays the following error message
    __00000E takes on more than 800 unique values
    unable to fit model; use roctab command

    r(134);

    What is happening here?
    What is the possible solution to this error?


  • #2
    I cannot replicate this problem in the example data. I suspect that your real data set is much larger and that somewhere inside -roccomp- the amount of data is more than it can handle for some reason. One possibility would be to take a random sample of your data, with fewer than 800 observations, and run -roccomp- with that.
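
    A minimal sketch of the random-sample idea, assuming the variable names from the example data (d, y1, y2). The seed value and the sample size of 799 are arbitrary illustrative choices, and whether this actually avoids the error in the full data is untested here. A sample of 799 observations can contain at most 799 unique values of any variable, which is below the 800 named in the error message.

```stata
* Sketch only: run -roccomp- on a random subsample, then restore the full data.
preserve
set seed 12345                  // arbitrary seed, only for reproducibility
sample 799, count               // keep 799 randomly chosen observations
roccomp d y1 y2, summary graph binormal
restore                         // bring back the full dataset
```

    Because of -preserve- and -restore-, the full data set is left unchanged afterwards. The results will vary with the seed, since only part of the data is used.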

    Alternatively, the error message itself suggests using -roctab-. -roctab- will not directly compare the two AUCs for you, but it will give you each AUC and its standard error separately; then, assuming conditional independence of y1 and y2, you can compare them yourself.
    Code:
    roctab d y1
    local auc1 = r(area)
    local se1 = r(se)
    roctab d y2
    local auc2 = r(area)
    local se2 = r(se)
    
    local diff = `auc1' - `auc2'
    local se_diff = sqrt(`se1'^2 + `se2'^2)
    
    display "Difference in AUC's is `diff', standard error `se_diff'"
    From the difference and the standard error you can then calculate a confidence interval for the difference, and, if you wish, the z-statistic and p-value, in the usual way.
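
    As a sketch of what "the usual way" could look like, the following repeats the -roctab- calls so that it runs on its own, then computes a z-statistic, a two-sided p-value, and a 95% confidence interval from a normal approximation. The local macro names z, p, lb, and ub are illustrative choices, and the calculation is only appropriate if the conditional independence assumption holds.

```stata
* Sketch only: normal-approximation inference for the difference in AUCs,
* assuming y1 and y2 are conditionally independent given d.
roctab d y1
local auc1 = r(area)
local se1  = r(se)
roctab d y2
local auc2 = r(area)
local se2  = r(se)

local diff    = `auc1' - `auc2'
local se_diff = sqrt(`se1'^2 + `se2'^2)

* z-statistic and two-sided p-value from the standard normal distribution
local z = `diff' / `se_diff'
local p = 2 * normal(-abs(`z'))

* 95% confidence interval for the difference
local lb = `diff' - invnormal(0.975) * `se_diff'
local ub = `diff' + invnormal(0.975) * `se_diff'

display "z = " %6.3f `z' ", two-sided p = " %6.4f `p'
display "95% CI for the difference: " %6.4f `lb' " to " %6.4f `ub'
```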

    The only issue with this approach is the assumption of conditional independence. In the example data, the correlation between y1 and y2 is 0.0228 when d = Yes and 0.0812 when d = No. I would be comfortable treating the 0.0228 as zero, a bit less so with the 0.0812. But see how those correlations work out in your full data. With luck, both of them will be very close to zero and you can proceed comfortably with this.
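
    One way to check those conditional correlations in the full data, assuming the same variable names as in the example, is to run -correlate- separately within each level of d:

```stata
* Correlation between y1 and y2, computed separately for d = No and d = Yes
bysort d: correlate y1 y2
```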



    • #3
      Thank you, Clyde Schechter.
      Indeed, I had included only the first 100 observations from my dataset.
      Thank you for the suggestion.
      However, roctab doesn't have a binormal option. So instead of using roctab separately for y1 and y2, I can simply run roccomp without the binormal option:
      Code:
      roccomp d y1 y2
      My intention was to compare the AUCs assuming a binormal distribution.
