Working with frequency tables?

Welch Suggs

Join Date: Apr 2015

Posts: 17
#1


13 Jun 2017, 13:05

This is more of a statistics problem than a Stata-specific problem, but I’m trying to use Stata to solve it. I have Likert-style survey data from a total population and a subpopulation. Unfortunately I do not have the individual observations; instead I have frequency counts of responses. For the total population:
frequency of     5     4     3     2     1
x1             432   107    10     0     4
x2             408   120    10     9     7
x3             444    99     8     0     3
x4             425   107    11     4     4
x5             276   194    43    34     7

and for the subpopulation:

frequency of     5     4     3     2     1
x1              28     9     1     0     1
x2              26    10     0     0     3
x3              28    10     0     0     1
x4              26     9     2     1     1
x5              18    15     2     2     2

My goal is to find variables with statistically significant differences between the subpopulation and the overall population, i.e. where I can reject the null hypothesis that the subpopulation is a random draw from the total population. The data are not normally distributed, but they are obviously ordinal. Does anyone have any advice on how to work with these? I can calculate means and standard deviations, but I’m trying to figure out whether there’s a way to break the frequency counts into individual observations so that I could run a Wilcoxon rank-sum test or similar on x1(population) = x1(subpopulation) or some such. Thanks in advance!

Last edited by Welch Suggs; 13 Jun 2017, 13:08.
William Lisowski

Join Date: Dec 2014

Posts: 10150
#2

13 Jun 2017, 13:19

Others will respond, I hope, with advice on methodology. I just want to make the point that, in theory, you should not need to expand each "value and count" observation into "count" separate observations, each with "value" observed. Frequency weights are designed to take care of this, and if you find a test that does what you need, the chances are good that it will support frequency weights. Start with help weights for more information on frequency weights, among others.
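To illustrate the point about frequency weights outside Stata, here is a minimal Python sketch (not Stata code, standard library only). It takes the x5 counts for the total population from post #1 and computes the mean and standard deviation twice: once directly from the value and count pairs, and once after expanding the counts into one entry per respondent. The function names weighted_mean_sd and expand are hypothetical helpers introduced for this example.

```python
from math import sqrt
from statistics import mean, stdev

# Response counts for x5 in the total population, copied from post #1.
counts = {5: 276, 4: 194, 3: 43, 2: 34, 1: 7}

def weighted_mean_sd(freq):
    """Mean and sample SD computed directly from value -> count pairs."""
    n = sum(freq.values())
    m = sum(value * count for value, count in freq.items()) / n
    ss = sum(count * (value - m) ** 2 for value, count in freq.items())
    return m, sqrt(ss / (n - 1))

def expand(freq):
    """One list entry per respondent, i.e. the expansion asked about."""
    return [value for value, count in freq.items() for _ in range(count)]

obs = expand(counts)
m_w, sd_w = weighted_mean_sd(counts)

# Both routes should agree up to floating-point rounding.
print(len(obs), round(m_w, 4), round(sd_w, 4))
print(len(obs), round(mean(obs), 4), round(stdev(obs), 4))
```

The two printed lines should match, which is the sense in which expanding the data is unnecessary whenever a procedure can use the counts directly.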
Welch Suggs

Join Date: Apr 2015

Posts: 17
#3

14 Jun 2017, 04:35

William Lisowski Thanks for the suggestion! Others?
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#4

14 Jun 2017, 05:24

William suggested an excellent approach on how to tackle aggregated data in Stata.

With regard to comparing a sample with one of its own subsamples, I fear that core assumptions of the standard tests (such as independence of the two groups) would be violated.

Maybe a simulation technique, such as random sampling, bootstrapping or a permutation test, could provide a first indication of whether there is a difference between the overall sample and this given subsample.

Best regards,

Marcos
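One way to read the random-sampling suggestion is sketched below in Python (not Stata, standard library only). It rests on assumptions that are not stated in the thread: that the subpopulation's respondents are included in the total-population counts, that the mean score is an acceptable test statistic for ordinal data, and that 10,000 draws with an arbitrary seed are enough. For each item it repeatedly draws a subsample of the observed size without replacement from the total population and reports how often the drawn mean is at least as far from the population mean as the observed subpopulation mean. The names expand and random_draw_pvalue are hypothetical.

```python
import random

# Counts of responses 5, 4, 3, 2, 1, copied from the tables in post #1.
SCORES = (5, 4, 3, 2, 1)
total = {
    "x1": (432, 107, 10, 0, 4),
    "x2": (408, 120, 10, 9, 7),
    "x3": (444, 99, 8, 0, 3),
    "x4": (425, 107, 11, 4, 4),
    "x5": (276, 194, 43, 34, 7),
}
sub = {
    "x1": (28, 9, 1, 0, 1),
    "x2": (26, 10, 0, 0, 3),
    "x3": (28, 10, 0, 0, 1),
    "x4": (26, 9, 2, 1, 1),
    "x5": (18, 15, 2, 2, 2),
}

def expand(freq):
    """Turn a tuple of counts into one score per respondent."""
    return [s for s, count in zip(SCORES, freq) for _ in range(count)]

def random_draw_pvalue(total_freq, sub_freq, reps=10_000, seed=12345):
    """Two-sided Monte Carlo p-value for 'the subsample is a random draw'.

    Assumes the subsample is part of the total population, so draws are
    made without replacement from the expanded total population.
    """
    rng = random.Random(seed)  # arbitrary seed, only for reproducibility
    population = expand(total_freq)
    n_sub = sum(sub_freq)
    pop_mean = sum(population) / len(population)
    observed = abs(sum(expand(sub_freq)) / n_sub - pop_mean)
    extreme = 0
    for _ in range(reps):
        draw = rng.sample(population, n_sub)
        # Small tolerance so ties with the observed value count as extreme.
        if abs(sum(draw) / n_sub - pop_mean) >= observed - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (reps + 1)

pvalues = {item: random_draw_pvalue(total[item], sub[item]) for item in total}
for item, p in pvalues.items():
    print(item, round(p, 4))
```

With five items tested, any small p-value would still need to be read with multiple comparisons in mind, and a different statistic (for example the share of responses below 4) could be substituted for the mean without changing the structure of the simulation.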