Hi
I have complex survey data and wish to test if there is a difference in proportions between two sub-populations.
In the survey, people were asked whether or not they agreed with an opinion, and very few people agreed. The problem is that for sub-population B the proportion disagreeing is 1, while for sub-population A it is 0.9973. The two sub-population sizes are n = 11,443 (A) and n = 417 (B).
I looked in the Stata survey manual (https://www.stata.com/manuals13/svy.pdf), which seems to suggest that Rao and Scott's (1984) second-order corrected Pearson statistic is the best test for sparse tables and recommends using this statistic in all situations. The Pearson test found no significant difference between the groups. I also requested the Wald test, which found a significant difference (the Stata help page mentions that the Wald test can give erratic results for sparse tables). However, as there is a cell with a zero proportion/count, the assumptions for a chi-squared test are not met, and I am not confident in the conclusion from the Pearson test. Is the Pearson test the right way to go for these data?
Code:
svy : tab row column,col obs pearson wald
/*
Number of strata   =        16        Number of obs     =     11,860
Number of PSUs     =    11,860        Population size   =  3,189,540
N. of poststrata   =        40        Design df         =     11,844

-------------------------------------
          |         Column
      Row |       A       B   Total
----------+--------------------------
        0 |   .9934       1   .9936
          | 1.1e+04     417 1.2e+04
          |
        1 |   .0066       0   .0064
          |      69       0      69
          |
    Total |       1       1       1
          | 1.1e+04     417 1.2e+04
-------------------------------------
  Key:  column proportion
        number of observations

  Pearson:
    Uncorrected   chi2(1)         =    2.6635
    Design-based  F(1, 11844)     =    2.7560     P = 0.0969

  Wald (Pearson):
    Unadjusted    chi2(1)         =   66.6589
    Adjusted      F(1, 11844)     =   66.6589     P = 0.0000
*/
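To illustrate why I doubt the chi-squared approximation, here is a rough unweighted check of the expected cell counts. This is only a sketch: it ignores the strata, weights and poststratification entirely, and the count of 11,374 disagreeing in A is my own derivation (11,443 minus the 69 who agreed). Under independence, the expected count in the B/agree cell is 417 × 69 / 11,860 ≈ 2.4, which is below the usual rule of thumb of 5.

```stata
* Unweighted 2x2 table entered by hand (survey design ignored; illustration only).
* Rows: 0 = disagree, 1 = agree. Columns: A, B.
* 11374 = 11443 - 69 is derived from the sub-population size, not taken from the output.
tabi 11374 417 \ 69 0, expected chi2 exact
```

The expected option prints the expected frequency in each cell, and exact adds Fisher's exact test. Neither accounts for the survey design, so I would treat the result only as a check on sparseness, not as an answer to my question.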
I also googled this problem generally but didn't find any relevant websites.
If this is not an appropriate approach, other suggestions would be appreciated.
Thank you very much for your help.