
  • Working with frequency tables?

    This is more of a statistics problem than a Stata-specific problem, but I’m trying to use Stata to solve it. I have Likert-style survey data from a total population and a subpopulation. Unfortunately, I do not have the actual observations; instead I have frequency counts of the responses.
    frequency of     5     4     3     2     1
    x1             432   107    10     0     4
    x2             408   120    10     9     7
    x3             444    99     8     0     3
    x4             425   107    11     4     4
    x5             276   194    43    34     7

    and subpopulation
    frequency of     5     4     3     2     1
    x1              28     9     1     0     1
    x2              26    10     0     0     3
    x3              28    10     0     0     1
    x4              26     9     2     1     1
    x5              18    15     2     2     2
    My goal is to find variables with statistically significant differences between the subpopulation and the overall population, i.e. where I can reject the null hypothesis that the subpopulation is a random draw from the total population. The data are not normally distributed, but they are clearly ordinal. Does anyone have any advice on how to work with these? I can calculate means and standard deviations, but I’m trying to figure out whether there’s a way to break the frequency counts into individual observations so I could run a Wilcoxon rank-sum test or similar on x1(population) = x1(subpopulation) or some such. Thanks in advance!
    Last edited by Welch Suggs; 13 Jun 2017, 13:08.

  • #2
    Others will respond, I hope, with advice on methodology. I just want to make the point that, in theory, you should not need to expand your "value & count" observations into "count" observations, each with "value" observed. Frequency weights are designed to take care of this, and if you find a test that does what you need, the chances are good that it will support frequency weights. Start with help weights for more information on frequency weights, among others.
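To make the equivalence described in #2 concrete, here is a minimal sketch in Python (not Stata) using the x5 counts for the total population from the question. It computes the mean and sample standard deviation twice: once directly from the counts treated as frequency weights, and once from the expanded list with one entry per respondent. The function names are made up for illustration, and treating the Likert codes as numbers is an assumption carried over from the question's mention of means and standard deviations.

```python
import math

# x5 counts for the total population, copied from the question.
# Keys are the Likert response values, values are the number of respondents.
x5_counts = {5: 276, 4: 194, 3: 43, 2: 34, 1: 7}


def expand(counts):
    """Turn value -> count pairs into one list entry per respondent."""
    return [value for value, n in counts.items() for _ in range(n)]


def weighted_mean_sd(counts):
    """Mean and sample SD computed from counts used as frequency weights."""
    n = sum(counts.values())
    mean = sum(value * c for value, c in counts.items()) / n
    ss = sum(c * (value - mean) ** 2 for value, c in counts.items())
    return mean, math.sqrt(ss / (n - 1))


def plain_mean_sd(observations):
    """Mean and sample SD computed from individual observations."""
    n = len(observations)
    mean = sum(observations) / n
    ss = sum((x - mean) ** 2 for x in observations)
    return mean, math.sqrt(ss / (n - 1))


expanded = expand(x5_counts)
w_mean, w_sd = weighted_mean_sd(x5_counts)
e_mean, e_sd = plain_mean_sd(expanded)

print(len(expanded), w_mean, e_mean, w_sd, e_sd)
```

Both routes give the same numbers (up to floating-point rounding), which is why expanding the data is only needed when the chosen procedure does not accept weights.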



    • #3
      William Lisowski, thanks for the suggestion! Others?



      • #4
        William suggested an excellent approach on how to tackle aggregated data in Stata.

        With regard to comparing a sample with a subsample of itself, I fear that core assumptions of the standard tests would be violated, since the two groups overlap and so are not independent.

        Maybe a simulation technique, such as random sampling, bootstrapping, or permutation, could provide an initial indication of whether there is a difference between the overall sample and this given subsample.
        Best regards,

        Marcos
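One way the random-sampling idea in #4 could look is sketched below in Python (not Stata), using the x5 counts from the question. It assumes the subpopulation is contained in the total population, so the null hypothesis is simulated by drawing samples of the subpopulation's size from the population without replacement. The test statistic (distance between the sample mean and the population mean), the number of draws, the seed, and the function names are all illustrative choices rather than anything proposed in the thread, and using the mean treats the Likert codes as numeric.

```python
import random

# x5 counts copied from the question: response value -> number of respondents.
population_x5 = {5: 276, 4: 194, 3: 43, 2: 34, 1: 7}
subpopulation_x5 = {5: 18, 4: 15, 3: 2, 2: 2, 1: 2}


def expand(counts):
    """Turn value -> count pairs into one list entry per respondent."""
    return [value for value, n in counts.items() for _ in range(n)]


def resampling_p_value(pop_counts, sub_counts, draws=10_000, seed=12345):
    """Monte Carlo p-value for 'the subpopulation is a random draw'.

    Draws samples without replacement from the population and counts how
    often the sample mean is at least as far from the population mean as
    the observed subpopulation mean is.
    """
    rng = random.Random(seed)
    pop = expand(pop_counts)
    sub = expand(sub_counts)
    big_n, k = len(pop), len(sub)
    pop_sum = sum(pop)

    # |s/k - S/N| is compared via |s*N - S*k| so everything stays in integers.
    def distance(sample_sum):
        return abs(sample_sum * big_n - pop_sum * k)

    observed = distance(sum(sub))
    hits = 0
    for _ in range(draws):
        if distance(sum(rng.sample(pop, k))) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (draws + 1)


p_x5 = resampling_p_value(population_x5, subpopulation_x5)
print(f"Monte Carlo p-value for x5: {p_x5:.4f}")
```

The same function can be called once per item (x1 to x5); with five items tested, some adjustment for multiple comparisons would be worth considering.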
