This is more of a statistics problem than a Stata-specific problem, but I’m trying to use Stata to solve it. I have Likert-style survey data from a total population and a subpopulation. Unfortunately I do not have actual observations; instead I have frequency counts of responses.
My goal is to find variables with statistically significant differences between the subpopulation and the overall population, i.e. where I can reject the null hypothesis that the subpopulation is a random draw from the total population. The data are ordinal and clearly not normally distributed. Does anyone have any advice on how to work with these? I can calculate means and standard deviations from the counts, but I’m trying to figure out if there’s a way to break the frequency counts into individual observations so I could run a Wilcoxon rank-sum test or similar on x1(population) = x1(subpopulation). Thanks in advance!
Total population
frequency of | 5 | 4 | 3 | 2 | 1 |
x1 | 432 | 107 | 10 | 0 | 4 |
x2 | 408 | 120 | 10 | 9 | 7 |
x3 | 444 | 99 | 8 | 0 | 3 |
x4 | 425 | 107 | 11 | 4 | 4 |
x5 | 276 | 194 | 43 | 34 | 7 |
and subpopulation
frequency of | 5 | 4 | 3 | 2 | 1 |
x1 | 28 | 9 | 1 | 0 | 1 |
x2 | 26 | 10 | 0 | 0 | 3 |
x3 | 28 | 10 | 0 | 0 | 1 |
x4 | 26 | 9 | 2 | 1 | 1 |
x5 | 18 | 15 | 2 | 2 | 2 |
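A minimal sketch of the "break the counts into individual observations" idea, written in Python rather than Stata so the arithmetic is explicit. It rests on two assumptions that are not stated above: that the subpopulation respondents are included in the total counts (so the sketch compares the subpopulation with the remainder, total minus subpopulation, to keep the two samples independent), and that a normal approximation with a tie correction is acceptable for the rank-sum test. The names `expand` and `rank_sum_test` are made up for this illustration. In Stata the analogous steps would presumably be `expand` on a count variable followed by `ranksum`.

```python
from math import erfc, sqrt

SCORES = [5, 4, 3, 2, 1]  # column order used in the tables above

TOTAL = {
    "x1": [432, 107, 10, 0, 4],
    "x2": [408, 120, 10, 9, 7],
    "x3": [444, 99, 8, 0, 3],
    "x4": [425, 107, 11, 4, 4],
    "x5": [276, 194, 43, 34, 7],
}
SUB = {
    "x1": [28, 9, 1, 0, 1],
    "x2": [26, 10, 0, 0, 3],
    "x3": [28, 10, 0, 0, 1],
    "x4": [26, 9, 2, 1, 1],
    "x5": [18, 15, 2, 2, 2],
}


def expand(counts):
    """Turn one row of frequency counts into one score per respondent."""
    return [score for score, n in zip(SCORES, counts) for _ in range(n)]


def rank_sum_test(a, b):
    """Wilcoxon rank-sum z statistic and two-sided p-value.

    Uses midranks for ties and the tie-corrected normal approximation,
    without a continuity correction.
    """
    pooled = sorted(a + b)
    n1, n2, n = len(a), len(b), len(a) + len(b)
    midrank = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                              # size of this tie group
        midrank[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    w = sum(midrank[v] for v in a)             # rank sum of sample a
    mean_w = n1 * (n + 1) / 2
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (w - mean_w) / sqrt(var_w)
    p = erfc(abs(z) / sqrt(2))                 # two-sided normal tail area
    return z, p


for item in TOTAL:
    # Assumption: subpopulation is a subset of the total, so remove it first.
    rest = [t - s for t, s in zip(TOTAL[item], SUB[item])]
    z, p = rank_sum_test(expand(SUB[item]), expand(rest))
    print(f"{item}: z = {z:.3f}, p = {p:.3f}")
```

With only five response categories the data are heavily tied, so the tie correction matters here. Testing five items also raises a multiple-comparisons question that this sketch does not address.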