
  • Exceeded Memory for Fisher's Exact Test

    I got the following error notice when trying to run a Fisher's exact test: "exceeded memory limits using exact(1); try again with larger #"
    Sample size is only 120, but some cell counts are 0. I thought this was precisely what Fisher's exact was for?

    This neophyte is grateful for the help.

  • #2
    From -help tabulate twoway-:
    exact[(#)] ...The optional positive integer # is a multiplier on the amount of memory that the command is permitted to consume. The default is 1. This option should not be necessary for reasonable r x c tables. If the command terminates with error 910, try exact(2). The maximum row or column dimension allowed when computing Fisher's exact test is the maximum row or column dimension for tabulate (see limits).
    When tables have large numbers of rows or columns, the memory requirements for the Fisher exact test grow very rapidly. The Fisher exact test is, indeed, intended for situations where the expected values in cells are small. But I wonder if you are pushing it beyond its limits. How many rows and columns does your cross-tabulation have? Apart from the burden on memory, if you are scattering 120 observations across a large number of cells, you may be trying to test data that are simply too sparse for a meaningful analysis. If that is the case, you might improve the situation by combining some of the adjacent categories in each of your variables.
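    An aside for readers following along outside Stata: zero cells by themselves are no obstacle to the exact test; it is the general r x c network algorithm whose memory use grows rapidly with table size. A minimal sketch in Python (scipy's fisher_exact, which handles only 2 x 2 tables) shows a zero cell causing no problem at N = 120:

```python
# Fisher's exact test on a 2 x 2 table with a zero cell.
# scipy.stats.fisher_exact handles only 2 x 2 tables; Stata's
# tabulate ..., exact uses a network algorithm for general r x c
# tables, and that is where memory demands explode.
from scipy.stats import fisher_exact

table = [[10, 0],    # a zero cell is fine for the exact test
         [55, 55]]
odds_ratio, p_value = fisher_exact(table)
print(p_value)
```

    In Stata itself, per the help quoted above, the corresponding call would be tabulate rowvar colvar, exact(2), where exact(2) doubles the memory the command is permitted to consume.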



    • #3
      I just tried an example with N = 120 and r = c = 5, and it required exact(2) to get a result. But I do agree with Clyde that even if you do get a result, interpreting it as meaningful would be hard with such sparse data.
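      To see how sparse that is, even the best case (perfectly uniform margins, an illustrative assumption) puts every expected cell count below the usual rule-of-thumb threshold of 5. A quick back-of-the-envelope check in Python:

```python
# Expected cell counts under independence for a 5 x 5 table
# built from N = 120 observations with uniform margins:
# expected[i][j] = row_total[i] * col_total[j] / N
n, rows, cols = 120, 5, 5
row_totals = [n // rows] * rows      # 24 per row
col_totals = [n // cols] * cols      # 24 per column
expected = [[r * c / n for c in col_totals] for r in row_totals]
print(expected[0][0])                # 24 * 24 / 120 = 4.8 per cell
```

      Real data will have uneven margins, pushing many cells well below that average.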



      • #4
        Thanks for the helpful feedback. I will try combining adjacent categories.



        • #5
          I am facing the same problem, but with a larger dataset (N = 1500) arranged in a table of 2 rows and 19 columns. It's accuracy data with accuracy around 95%, so despite the large sample size many cells are < 5. I wanted to test the difference across all 19 groups, but Stata wouldn't give me Fisher's exact because of the memory limit. I even tried exact(1000), the maximum value allowed, but I suspect that having to go this far means I'm trying a method that is not suited to the data. I tried splitting the data into two sets of 10 groups, with one group overlapping between the two, and got p = 0.232 and p = 0.203. Is this a valid way to show an insignificant difference across all groups?



          • #6
            It's not clear what you mean by a valid way to show insignificant differences across all groups. Is this like one of those goodness-of-fit tests where the goal is to fail to reject some kind of null hypothesis of good fit? If you're testing for some kind of equivalence, for example, that all 19 groups have similar accuracy within some acceptable range (tolerance), then why not set up the null hypothesis for that instead?

            You could test this hypothesis using conventional techniques for equivalence studies, such as whether the confidence intervals of group differences fall within your specified acceptance interval. For computing the confidence intervals, perhaps you could use a regression command—for example, penlogit or firthlogit (use the search command at Stata's command line to find these user-written commands)—that might be able to accommodate the small counts in the second row of cells.
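            A sketch of that idea in Python rather than Stata (penlogit and firthlogit are Stata user-written commands; the Wald interval, the one-sided alpha, and the tolerance delta below are illustrative assumptions, not part of the post): conclude equivalence only when the whole confidence interval for the difference in accuracy lies inside the acceptance interval.

```python
# Equivalence check for two proportions via confidence intervals:
# declare "similar accuracy" only if the 90% CI for p1 - p2 lies
# entirely inside the acceptance interval (-delta, +delta).
from math import sqrt
from scipy.stats import norm

def equivalent(k1, n1, k2, n2, delta, alpha=0.05):
    p1, p2 = k1 / n1, k2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = norm.ppf(1 - alpha)          # one-sided z, i.e. a 90% two-sided CI
    lo = (p1 - p2) - z * se
    hi = (p1 - p2) + z * se
    return lo > -delta and hi < delta

# Two groups with ~95% accuracy, tolerance of 5 percentage points:
print(equivalent(76, 80, 74, 78, delta=0.05))
```

            With very few failures per group the Wald interval is shaky, which is exactly why a penalized-likelihood interval (as from firthlogit) may be preferable in practice.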
            Last edited by Joseph Coveney; 04 May 2018, 22:16. Reason: poor is good
