  • Fisher's exact test does not reproduce the results in the literature

    Hello,

    I am trying to reproduce results from the literature, using Fisher's exact test to compare the distributions of two independent samples.

    Here is a description of the data:

    Let's call the data from one paper "sample one" and the data from another paper "sample two".
    Both papers measure the same thing.
    In both samples, six types of subjects are identified: level 0, level 1, level 2, level 3, level 4, and unidentified.
    Sample one has 116 subjects; the proportions of the six types are 5.17%, 23.28%, 26.72%, 21.55%, 22.41%, and 0.86%, respectively.
    Sample two has 179 subjects; the proportions of the six types are 3.91%, 14.53%, 27.93%, 21.23%, 17.32%, and 15.08%, respectively.

    The paper itself says "If the unidentified subjects are excluded, the Fisher's exact test comparing these two categorical distributions yields a p-value of 0.926, suggesting that they are statistically not different."

    I therefore assume that Fisher's exact test rejects the null when the unidentified subjects are included, which I can reproduce. However, I cannot reproduce the p-value of 0.926 (failing to reject the null) when the unidentified subjects are excluded, so I suspect the command I am using is not right.

    Here is the code I am using:
    Code:
    set obs 179
    gen jin = 0 in 1/7
    replace jin = 1 in 8/33
    replace jin = 2 in 34/83
    replace jin = 3 in 84/121
    replace jin = 4 in 122/152
    replace jin = -1 in 153/179 //unidentified
    proportion jin
    
    gen k=-1 in 1 //unidentified
    replace k=0 in 2/7
    replace k =1 in 8/34
    replace k=2 in 35/65
    replace k =3 in 66/90
    replace k = 4 in 91/116
    proportion k
    
    tabulate jin k , all exact //reject the null
    tabulate jin k if jin!=-1 & k != -1, all exact// reject

    My question is what the right way to reproduce the results is. I am also wondering whether sample size matters: if I do not use the -missing- option, the table looks as if the larger sample is truncated, and if I shuffle the data, for example, the larger sample is truncated in a different way. So should we account for missing values when the two samples are not balanced?

    I also tried other tests to compare the two samples, which give different results:

    Code:
     set obs 295
     gen group = 1 in 1/179
    replace group =0 in 180/295
    gen jin_k=jin in 1/179
    forvalues i = 1(1)116{
    replace jin_k = k[`i'] if _n == `i'+179
     }
     ranksum jin_k, by(group)//not reject at 5%
    median jin_k, by(group) exact//not reject
    ksmirnov jin_k, by(group) exact //not reject

    Further, I just realised from this topic that level 0, level 1, level 2, level 3, and level 4 are likely to be ordered categories. (I am not sure, actually; the categories in the paper are like education taking the values high school, undergraduate, postgraduate.) So I am wondering: if they are indeed ordered categories, so that Fisher's exact test is not appropriate, what about the other tests I have used?

    Finally, I have my own data measuring the same thing, with 157 subjects. When comparing my sample with either sample one or sample two, I cannot reject the null using Fisher's exact test, but I can reject it using all the other tests: -ranksum-, -median-, -ksmirnov-, and -ttest-. It seems that these all give results different from Fisher's exact test or the chi-square test, whether comparing samples one and two or comparing my sample with sample one or two. I am really confused by these different results.

    Thanks for any help!!

  • #2
    I do not see how Fisher's Exact Test can be used to compare two different populations. Certainly as used in the tabulate command it compares two different measures within the same population.


    • #3
      Notwithstanding William's comment, when I tried entering your data I got a p-value of 0.626...

      Code:
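      clear
      * reported percentages for levels 0-4 and unidentified, in that order
      local study1 "5.17 23.28 26.72 21.55 22.41 0.86"
      local study2 "3.91 14.53 27.93 21.23 17.32 15.08"
      * convert the percentages back to counts (sample one: n = 116)
      local study1_nv ""
      foreach n of local study1 {
          local study1_n = round(116*`n'/100)
          local study1_nv = "`study1_nv'" + "`study1_n' "
      }
      * same conversion for sample two (n = 179)
      local study2_nv ""
      foreach n of local study2 {
          local study2_n = round(179*`n'/100)
          local study2_nv = "`study2_nv'" + "`study2_n' "
      }
      display "`study1_nv'"
      display "`study2_nv'"
      * the recovered counts for levels 0-4, typed in by hand; the last
      * (unidentified) count of each study is left out
      tabi 6 27 31 25 26 \ 7 26 50 38 31, exact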


      • #4
        I can confirm the p-value of 0.626 that Dave Airey reports. I used a similar approach to enter the data, but then reshaped long and used -tabulate- with frequency weights. I get the exact same result using R's fisher.test.

        Regarding William Lisowski's comment, it turns out that the important question is what you are conditioning on.
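
        A minimal sketch of the frequency-weight approach described above, using the counts recovered in #3 with the unidentified subjects excluded. The variable names sample, level, and freq are illustrative and do not come from the original posts.

        ```stata
        clear
        * one row per cell: sample (1 = sample one, 2 = sample two), level, count
        input byte sample byte level int freq
        1 0  6
        1 1 27
        1 2 31
        1 3 25
        1 4 26
        2 0  7
        2 1 26
        2 2 50
        2 3 38
        2 4 31
        end
        * freq carries the cell counts, so the data need not be expanded
        tabulate level sample [fweight = freq], exact
        display "Fisher's exact p = " %5.3f r(p_exact)
        ```

        This should give the same 5 x 2 table as the -tabi- call in #3, with sample as the grouping variable rather than a second measurement.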

        A two-way table may represent a cross-tabulation of two variables, in which case only the total is fixed, a multinomial distribution is appropriate, and one would usually test for independence.

        It may also represent the distribution of one variable in two groups, in which case the appropriate model is a product multinomial distribution (product binomial for a 2 x 2 table) and one would test for homogeneity.

        But the chi-square tests of independence and homogeneity are exactly equivalent. Same test, different language.

        Fisher's exact test, on the other hand, considers both margins fixed. The appropriate distribution in this case is hypergeometric; the test is conditional on both sets of marginal totals.
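
        To make the conditioning concrete, here is the standard hypergeometric probability of an r x c table given both margins. The notation is introduced here for illustration and is not taken from the paper: n_ij are the cell counts, R_i the row totals, C_j the column totals, and N the grand total.

        ```latex
        P\bigl(\{n_{ij}\} \mid \{R_i\}, \{C_j\}\bigr)
          = \frac{\prod_{i=1}^{r} R_i! \; \prod_{j=1}^{c} C_j!}
                 {N! \; \prod_{i=1}^{r} \prod_{j=1}^{c} n_{ij}!}
        ```

        The exact p-value is usually taken as the sum of these probabilities over all tables with the same margins that are no more probable than the observed table.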


        • #5
          German Rodriguez

          Regarding my post #2, it was based on the post #1 presentation, which used tabulate twoway on a dataset of observations of two variables representing the same measurement in two populations, with unequal numbers of observations for the variables. I stated my concern too narrowly; reorganizing the data (as was done in post #1 for ksmirnov) would have solved the problem for tabulate twoway as well, as you correctly point out, as would using tabi to input the cell counts directly, as Dave Airey demonstrated.
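
          One possible way to do that reorganization is sketched below, assuming the two-column layout of post #1 (jin for the 179 subjects of sample two, k for the 116 subjects of sample one, -1 for unidentified). The use of -stack- and the names level and sample are choices made for this illustration, not something the original posts used.

          ```stata
          clear
          set obs 179
          * sample two (179 subjects), coded as in #1; -1 = unidentified
          generate byte jin = cond(_n <= 7, 0, cond(_n <= 33, 1, cond(_n <= 83, 2, ///
              cond(_n <= 121, 3, cond(_n <= 152, 4, -1)))))
          * sample one (116 subjects); rows 117-179 stay missing
          generate byte k = cond(_n == 1, -1, cond(_n <= 7, 0, cond(_n <= 34, 1, ///
              cond(_n <= 65, 2, cond(_n <= 90, 3, 4))))) if _n <= 116
          * stack the two columns into one outcome; _stack records the source column
          stack jin k, into(level) clear
          rename _stack sample
          label define sample 1 "sample two (jin)" 2 "sample one (k)"
          label values sample sample
          * missing levels are dropped by default; -1 is excluded explicitly
          tabulate level sample if level != -1, exact
          ```

          With the data in this long layout, each subject contributes one row, so the table no longer pairs observation i of one sample with observation i of the other.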


          • #6
            Thanks for all the replies!

            Just to be clear: what I need to do when using Fisher's exact test is reorganise the data to have fixed margins, as posts #3 and #4 did?

            There is another comparison in the same literature:
            Treatment 1 has 80 subjects, with frequencies for levels 0 to 4 of 6, 16, 23, 19, 16.
            Treatment 2 has 36 subjects, with frequencies for levels 0 to 4 of 1, 11, 8, 6, 10.

            I used the method in #3 and got a p-value of 0.486, which differs from the p-value of 0.58 in the paper.

            Any ideas about why the p-values differ?


            • #7
              No, "fixed margins" is not something that has to do with how you organize your data in Stata. "Fixed margins" refers to part of the theoretical model on which Fisher's Exact Test is based. I also get p = 0.486 for the data you show (see below). Could be a mistake in the original paper, or could be something different about the particular program (not Stata?) used in that paper. Fisher's Exact Test gets quite complicated beyond a 2 X 2 table.

              Code:
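              clear
              * one row per cell: treatment group, level, and frequency
              input group score freq
              1 0 6
              1 1 16
              1 2 23
              1 3 19
              1 4 16
              2 0 1
              2 1 11
              2 2 8
              2 3 6
              2 4 10
              end
              * replicate each row freq times to get one observation per subject
              expand freq
              tab2 score group, exact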
