  • Assessing differences between ordinal ranks based on frequencies

    Hi Stata users!

    I am using Stata 14.1. I am trying to assess whether the ordinal ranks of the most frequently diagnosed diseases differ between men and women, and across seasons. In the dataex example below, there is an id variable (srno), a binary sex variable (sex2: 0 = men; 1 = women), a categorical season variable (seas: 1 = winter; 2 = pre-monsoon; 3 = southwest monsoon; 4 = post-monsoon), and a categorical diagnosis variable (diag), which records each individual's diagnosed condition. I hand-picked the dataex rows to include entries from all four seasons.

    [CODE]
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int srno byte sex2 float seas str40 diag
    4734 0 1 "Gastritis"
    4735 1 1 "Upper Respiratory Tract Infection (URTI)"
    4736 0 1 "Cough"
    4737 1 1 "Cold"
    4738 1 1 "Diarrhea"
    6177 0 2 "SOB"
    6178 0 2 "Upper Respiratory Tract Infection (URTI)"
    6179 0 2 "Upper Respiratory Tract Infection (URTI)"
    6180 0 2 "Cold"
    6181 0 2 "Gastritis"
    8089 1 3 "Gastritis"
    8090 1 3 "Tinea"
    8091 0 3 "Gastritis"
    8092 0 3 "Fever"
    8093 0 3 "Low Back Ache"
    10669 0 4 "Tinea"
    10670 0 4 "Icterus"
    10671 0 4 "Cough"
    10672 0 4 "Tinea"
    end
    [/CODE]

    I can quickly see which diagnoses are made most frequently, by sex and by season, using the tabsort command (ssc install tab_chi). For example, here are the five most frequent diagnoses for each sex:

    tabsort diag if sex2==0

    RevisedDiagnosis          Freq.   Percent      Cum.
    Cough                     1,477     12.05     12.05
    Musculoskeletal Pain      1,327     10.83     22.88
    Road Traffic Accident     1,023      8.35     31.23
    Tinea                       981      8.00     39.23
    Cold                        795      6.49     45.72

    tabsort diag if sex2==1

    RevisedDiagnosis          Freq.   Percent      Cum.
    Musculoskeletal Pain        530     12.10     12.10
    Cough                       490     11.19     23.29
    Cold                        349      7.97     31.26
    Fever                       322      7.35     38.62
    Gastritis                   279      6.37     44.99

    Now, this is both a statistics question and a Stata-istics question (I apologise for the former). I believe I should use the Wilcoxon-Mann-Whitney or Kruskal-Wallis test to see whether the two rankings differ, but I don't understand what form the data should take to make these tests possible. I have successfully used other Statalist posts to create two new variables that hold the rank of each diagnosis for males and for females, but I don't see how a single variable could contain the information these tests need. I don't discount that I am making some error with regard to the choice of test, the dependent variable, or otherwise. I appreciate any help possible!

    Kind regards,

    Harry

  • #2
    The fact that you can rank diagnoses by frequency doesn't make diagnosis an ordinal variable.

    So, what you want here is more like a chi-square test.
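
    A minimal sketch of that suggestion, assuming the variable names from the dataex in #1 (diag, sex2, seas) and the full dataset in memory. The expected and cchi2 options are an addition here, not something proposed in the thread: they display expected counts and each cell's contribution to the Pearson statistic, which gives a rough idea of which diagnoses drive any difference. With many rare diagnoses, some cells will have small expected counts, so the p-value should be read with caution.

    [CODE]
    * Sketch only; assumes diag, sex2 and seas as in the dataex in #1
    * Pearson chi-square test: does the distribution of diagnoses differ by sex?
    tabulate diag sex2, chi2 column

    * Expected frequencies and each cell's contribution to the chi-square
    tabulate diag sex2, chi2 expected cchi2

    * The same test across the four seasons
    tabulate diag seas, chi2 column
    [/CODE]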


    • #3
      Does statistical satiation exist? Is there a German compound phrase for statistical shame?!

      I realise the large oversight in my question. I had thought it possible, statistically, to assess differences between ranks, and even more, which position changes are the most 'influential'. I hadn't really considered whether that is something you should do when you have a categorical dependent variable.

      My sincere thanks anyway. I need a break.
