
  • Using Fisher's exact test with 7 race categories?

    When I use Fisher's exact test to tabulate the association between Race and Screening, I get an error message saying "exceeded memory limits" when it reaches the final enumeration of 'sample-space combinations'. The help says to try using exact(2), which I did, and I got the same error message. Any ideas?

    tab RaceEthnicityCoding ScreeningDone_Dummy, chi column exact(2)

    +-------------------+
    | Key               |
    |-------------------|
    |     frequency     |
    | column percentage |
    +-------------------+

    Enumerating sample-space combinations:
    stage 7: enumerations = 1
    stage 6: enumerations = 11
    stage 5: enumerations = 321
    stage 4: enumerations = 9730
    stage 3: enumerations = 257651
    stage 2: exceeding 1x10^6 enumerations
    exceeded memory limits using exact(2); try again with larger #; see help tabulate for details

        Race + |
     Ethnicity | Screening Done_Dummy
        Coding |         0          1 |     Total
    -----------+----------------------+----------
             1 |        36         44 |        80
               |     11.01       8.10 |      9.20
    -----------+----------------------+----------
             2 |        19         38 |        57
               |      5.81       7.00 |      6.55
    -----------+----------------------+----------
             3 |        17         18 |        35
               |      5.20       3.31 |      4.02
    -----------+----------------------+----------
             4 |       131        300 |       431
               |     40.06      55.25 |     49.54
    -----------+----------------------+----------
             5 |         5          5 |        10
               |      1.53       0.92 |      1.15
    -----------+----------------------+----------
             6 |        76        116 |       192
               |     23.24      21.36 |     22.07
    -----------+----------------------+----------
             7 |        43         22 |        65
               |     13.15       4.05 |      7.47
    -----------+----------------------+----------
         Total |       327        543 |       870
               |    100.00     100.00 |    100.00

    Pearson chi2(6) = 37.2129 Pr = 0.000
    r(910);


  • #2
    I doubt you're missing much -- except a different very small P-value.

    The chi-square test is best taken forward to look at the pattern of discrepancies. Here's tabchii from tab_chi on SSC with extra Pearson residuals, namely (observed MINUS expected) / sqrt(expected). First off, the P-values are of the order of 1 in a million and so clear-cut. I would bet that the Fisher test wouldn't contradict that. Second, rows 4 and 7 are those most out of line with a null.

    Code:
    . tabchii 36 44 \ 19 38 \ 17 18 \ 131 300 \ 5 5 \ 76 116 \ 43 22, pearson
    
              observed frequency
              expected frequency
              Pearson residual
    
    ----------------------------
              |       col      
          row |       1        2
    ----------+-----------------
            1 |      36       44
              |  30.069   49.931
              |   1.082   -0.839
              |
            2 |      19       38
              |  21.424   35.576
              |  -0.524    0.406
              |
            3 |      17       18
              |  13.155   21.845
              |   1.060   -0.823
              |
            4 |     131      300
              | 161.997  269.003
              |  -2.435    1.890
              |
            5 |       5        5
              |   3.759    6.241
              |   0.640   -0.497
              |
            6 |      76      116
              |  72.166  119.834
              |   0.451   -0.350
              |
            7 |      43       22
              |  24.431   40.569
              |   3.757   -2.915
    ----------------------------
    
    1 cell with expected frequency < 5
    
             Pearson chi2(6) =  37.2129   Pr = 0.000
    likelihood-ratio chi2(6) =  36.4749   Pr = 0.000
    
    . ret li
    
    scalars:
                      r(N) =  870
                      r(r) =  7
                      r(c) =  2
                   r(chi2) =  37.21292707219945
                      r(p) =  1.60034046607e-06
                r(chi2_lr) =  36.4749305559922
                   r(p_lr) =  2.22847158301e-06
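
The arithmetic behind the listing above can be cross-checked outside Stata. The following is only a sketch under stated assumptions: Python is not used anywhere in the thread, the function names `pearson_table` and `chi2_upper_tail_even_df` are made up for illustration, and the closed-form tail probability applies only because the degrees of freedom here (6) are even.

```python
import math

# Observed counts copied from the thread: rows are race/ethnicity codes 1-7,
# columns are screening not done (0) and screening done (1).
observed = [
    [36, 44], [19, 38], [17, 18], [131, 300], [5, 5], [76, 116], [43, 22],
]


def pearson_table(obs):
    """Return expected counts, Pearson residuals, chi-square and df."""
    row_tot = [sum(row) for row in obs]
    col_tot = [sum(col) for col in zip(*obs)]
    n = sum(row_tot)
    # Expected count under independence: row total * column total / N.
    expected = [[rt * ct / n for ct in col_tot] for rt in row_tot]
    # Pearson residual: (observed - expected) / sqrt(expected).
    residuals = [
        [(o - e) / math.sqrt(e) for o, e in zip(orow, erow)]
        for orow, erow in zip(obs, expected)
    ]
    chi2 = sum(r * r for row in residuals for r in row)
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return expected, residuals, chi2, df


def chi2_upper_tail_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


expected, residuals, chi2, df = pearson_table(observed)
p_value = chi2_upper_tail_even_df(chi2, df)

for i, (erow, rrow) in enumerate(zip(expected, residuals), start=1):
    print(i, [round(e, 3) for e in erow], [round(r, 3) for r in rrow])
print(f"Pearson chi2({df}) = {chi2:.4f}, P = {p_value:.3e}")
```

The largest residuals should fall in rows 4 and 7, which is the pattern pointed out in the post.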



    • #3
      Thanks. I'm doing this for a med student who has to present the data at a conference this weekend. I can tell her not to worry about it, but I want to be able to explain to her what is happening and what to say if anyone asks her about it. Should she just report the P-value of .000 for Fisher's exact test and leave it at that? I don't want anyone to get called out for an error.



      • #4
        You don't have a P-value for Fisher's exact test. At most you have my guess that it would be very small.

        I'd never report P-values of 0.000 because they are all too likely to be misunderstood. I would report the attained P-value for the chi-square test, but to 3 or 4 significant figures. Alternatively, you could say P < 0.0005.
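
The two reporting options in this post can be illustrated with a few lines of formatting code. This is a sketch only: it is in Python rather than Stata, the helper names `format_p` and `bound_from_rounded_zero` are hypothetical, and the P-value is the `r(p)` scalar shown earlier in the thread.

```python
def format_p(p, sig=3):
    """Format a P-value in scientific notation with `sig` significant figures."""
    return f"P = {p:.{sig - 1}e}"


def bound_from_rounded_zero(decimals=3):
    """Upper bound implied when a P-value displays as zero at `decimals` places."""
    return 0.5 * 10 ** (-decimals)


p = 1.60034046607e-06  # r(p) as returned by tabchii in the thread

print(format_p(p))           # attained value, 3 significant figures
print(format_p(p, sig=4))    # attained value, 4 significant figures
print(f"P < {bound_from_rounded_zero(3)}")  # what a display of 0.000 implies
```

A display of 0.000 to three decimal places says only that the value rounds to zero at that precision, which is why the bound is 0.0005 and not 0.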

        It should be much more interesting to comment on why there is an apparent departure from null expectation and how far it is clinically interesting.

        PS Not a medic. Not a medical statistician.



        • #5
          Ellen Kiley posted this in another thread:

          "Or I can just use Pearson's (which runs without error) because, while 2 cells have only 5 observations, only 1 of the expected frequencies is < 5, and that is way < 20%?"

          That's a rather ancient rule of thumb (Cochran 1952???). I tend to go with a simpler rule, which can be found in the work of Harold Jeffreys and more recently of Stephen Fienberg, which is to worry only if expected frequencies fall below 1.
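
Both rules of thumb mentioned here can be checked directly against the expected frequencies for this table. The sketch below is illustrative only: it uses Python rather than Stata, the variable names are invented, and the thresholds (under 20% of cells with expected frequency below 5; no expected frequency below 1) are taken as stated in the thread, not from the original sources.

```python
# Observed counts copied from the thread (7 rows by 2 columns).
observed = [
    [36, 44], [19, 38], [17, 18], [131, 300], [5, 5], [76, 116], [43, 22],
]

row_tot = [sum(row) for row in observed]
col_tot = [sum(col) for col in zip(*observed)]
n = sum(row_tot)

# Expected frequency under independence, flattened to one list of cells.
expected_cells = [rt * ct / n for rt in row_tot for ct in col_tot]

n_below_5 = sum(1 for e in expected_cells if e < 5)
share_below_5 = n_below_5 / len(expected_cells)
min_expected = min(expected_cells)

# Older rule as quoted in the thread: fewer than 20% of expected counts below 5.
older_rule_ok = share_below_5 < 0.20
# Simpler rule from the reply: worry only if an expected count falls below 1.
simpler_rule_ok = min_expected >= 1

print(f"cells with expected < 5: {n_below_5} of {len(expected_cells)}")
print(f"share below 5: {share_below_5:.1%}, smallest expected: {min_expected:.3f}")
print(f"older rule satisfied: {older_rule_ok}, simpler rule satisfied: {simpler_rule_ok}")
```

For this table both checks pass, which agrees with the "1 cell with expected frequency < 5" note in the tabchii output.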

