  • Q: Testing the equality of two proportions from independent samples

    I have a question related to testing the equality of two proportions from independent samples. I have typically used the Pearson Chi-squared or Exact test in Stata for this type of thing. I am wondering if degrees of freedom accounts for very small proportions in either or both samples.

    I’ll provide an example to better illustrate what I am asking…

    Imagine that I have collected samples from two neighborhoods. I asked all those sampled about their feelings towards a new city ordinance. Using the Chi-squared test, I can infer whether these two neighborhoods are likely to be homogeneous or different in their feelings about the ordinance.

    As far as I understand, the degrees of freedom used to conduct a Chi-squared test accounts for the size of the sample collected from both neighborhoods. I am wondering how I account for a small overall proportion. Going back to my example, imagine I sampled 1000 people from each neighborhood but only 20 from neighborhood X supported the ordinance while 30 from Y supported the ordinance.

    I am wondering if the calculation for degrees of freedom with Chi-squared accounts for a small numerator in the proportion calculation or if there is an alternative test that accounts for this… hopefully something available in Stata.

    Thanks in advance.

  • #2
    No, the degrees of freedom do not account for sample size or sparse frequencies in the Chi-Squared test. (You can have a 2X2 table with N = 10,000 and a 2X2 table with N = 10, and in either case, df = 1.) What matters to the validity of the p-value you obtain from the Chi-Squared test is not the sample size itself, but rather whether the expected frequencies, under the null hypothesis of no population difference, are "too small." (See any standard textbook for a more detailed discussion of this, and to get an opinion about what constitutes "too small.")
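To make the point about expected frequencies concrete, here is a minimal sketch in plain Python (standard library only) using the counts from the question: 20 of 1000 supporters in neighborhood X and 30 of 1000 in Y. The function and variable names are illustrative assumptions, not part of Stata or of any package. The sketch computes the expected counts under the null hypothesis, the Pearson statistic, and its p-value on 1 degree of freedom.

```python
import math

# Observed 2x2 table from the example in the question.
# Rows: neighborhoods X and Y. Columns: (support, do not support).
observed = [[20, 980], [30, 970]]


def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi2(table):
    """Pearson chi-squared statistic: sum over cells of (O - E)^2 / E."""
    exp = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, exp)
        for o, e in zip(obs_row, exp_row)
    )


def chi2_pvalue_df1(stat):
    """Upper-tail p-value for a chi-squared variable with 1 degree of freedom.

    With 1 df the statistic is the square of a standard normal, so
    P(chi2 > stat) = erfc(sqrt(stat / 2)).
    """
    return math.erfc(math.sqrt(stat / 2))


expected = expected_counts(observed)
# Degrees of freedom depend only on the table's shape, not on N.
df = (len(observed) - 1) * (len(observed[0]) - 1)
stat = pearson_chi2(observed)
p_value = chi2_pvalue_df1(stat)

print("expected counts:", expected)
print("df:", df)
print("chi2:", round(stat, 4))
print("p-value:", round(p_value, 4))
```

In this example the smallest expected count is 25, which is well above the thresholds textbooks usually quote (5 is a common rule of thumb), even though the observed proportions are only 2% and 3%.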

    With Stata, you never need to worry about this. If you simply use the "exact" option on -tabulate2-, you will obtain a p-value that uses a different method (Fisher's Exact Test) that does not depend on the assumptions of sufficiently large expected frequencies, and which tests essentially the same hypothesis you want to examine.
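For readers curious about what the exact test computes, the following is a rough sketch in plain Python of a two-sided Fisher's Exact Test for a 2X2 table. It conditions on the table margins and sums the hypergeometric probabilities of every table that is no more likely than the observed one. The function name `fisher_exact_two_sided` and the small relative tolerance are assumptions made for this illustration; this is not Stata's implementation, and Stata's output should be treated as the reference.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Conditional on the margins, the top-left cell follows a
    hypergeometric distribution. The p-value sums the probabilities of
    all tables whose probability does not exceed that of the observed one.
    """
    row1 = a + b          # size of the first sample
    col1 = a + c          # total count in the first column
    n = a + b + c + d     # grand total
    denom = comb(n, row1)

    def prob(k):
        # Probability that the top-left cell equals k, given the margins.
        return comb(col1, k) * comb(n - col1, row1 - k) / denom

    lo = max(0, row1 - (n - col1))  # smallest feasible top-left count
    hi = min(row1, col1)            # largest feasible top-left count
    p_obs = prob(a)
    # A small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(k) for k in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-7)
    )


# Counts from the question: 20 of 1000 support in X, 30 of 1000 in Y.
p_exact = fisher_exact_two_sided(20, 980, 30, 970)
print("two-sided exact p-value:", round(p_exact, 4))
```

Because the calculation uses only the observed margins, it does not rely on the expected frequencies being large.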

    Regards, Mike


    • #3
      The proper test to use depends on how you selected your participants. With any kind of realistic sampling design, you would need to svyset your data and use the tests in svy: tabulate. So exactly what design will you be using?
      Steve Samuels
      Statistical Consulting

      Stata 14.2
