  • Kruskal Wallis test

    Dear Statalist users,

    I'm using the Kruskal-Wallis test as a non-parametric test to compare three groups (P, A and B). My dependent variable is a measure of efficiency, with values ranging from 0 to 1.
    However, I don't quite understand the "rank sum" column of the output shown below.
    Also, is it correct to say that the results indicate a statistically significant difference among the three types of firms?

    Code:
    . kwallis theta if Y==2014, by(type)
    
    Kruskal-Wallis equality-of-populations rank test
    
    +------------------------+
    | type | Obs |  Rank Sum |
    |------+-----+-----------|
    |    P | 411 | 109374.50 |
    |    A | 223 | 111253.50 |
    |    B | 190 | 119272.00 |
    +------------------------+
    
    chi-squared =   340.233 with 2 d.f.
    probability =     0.0001
    
    chi-squared with ties =   340.234 with    2 d.f.
    probability =     0.0001
    Thanks,
    Maria

  • #2
    That's significant. To get a better idea of how Kruskal-Wallis sees your data, calculate the ranks and plot their distributions, e.g. with -dotplot- or -stripplot- (SSC).
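
    As a rough sketch of where the "Rank Sum" column comes from: -kwallis- ranks all observations in the sample together (the lowest value gets rank 1, ties get the average rank) and then adds up the ranks within each group. Assuming the variable names from #1, and with rank2014 as a made-up name for the new variable, something like the following should reproduce the column:

    Code:
    * rank theta across all three types, 2014 only (ties receive average ranks by default)
    egen rank2014 = rank(theta) if Y==2014 & !missing(type)
    
    * count, rank sum and mean rank by type; the sums should match the kwallis table
    tabstat rank2014 if Y==2014, by(type) statistics(n sum mean)

    Ranking -theta instead of theta, or using the field, track or unique options of rank(), gives ranks that differ from those used by the test.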



    • #3
      I'm not sure whether the following code is what you meant, or whether it is the right way to understand the relationship.

      Code:
      egen rank_theta=rank( -theta ), by(Y) track
      Using stripplot, I get the following graph:
      Code:
      stripplot rank_theta if Y==2014, by(type) xlabel(, format(%9.2f))
      [Image: stripplot_rank.png]


      or dotplot:
      Code:
      dotplot rank_theta , over(type) center median bar
      [Image: dotplot_rank.png]



      Thanks,
      Maria



      • #4
        Hello Maria,

        As Nick remarked, your output shows a statistically significant result. In short, there is "some sort of difference" between the groups. If the groups have a similar distributional shape for the dependent variable, I gather we can say that the difference relates to the medians. Otherwise, the difference is fundamentally one of rank sums. To spot which group differs from the others, you may want to perform post hoc comparisons, taking into account the familywise error rate.
        Best,
        Marcos
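
        To make the link between the rank sums and the test statistic concrete, here is a small sketch that uses only the numbers in the output in #1. Dividing each rank sum by its group size gives the mean rank, and the usual formula for the statistic without tie correction is H = 12/(N(N+1)) * sum(R_j^2/n_j) - 3(N+1), where R_j is the rank sum of group j, n_j its size and N the total sample size.

        Code:
        * mean rank per group = rank sum / observations
        display 109374.5/411
        display 111253.5/223
        display 119272/190
        
        * total sample size and the overall mean rank, (N+1)/2
        display 411 + 223 + 190
        display (824 + 1)/2
        
        * H without tie correction; this should agree with the reported 340.233 up to rounding
        display 12/(824*825) * (109374.5^2/411 + 111253.5^2/223 + 119272^2/190) - 3*825

        The mean ranks come out at about 266 for P, 499 for A and 628 for B, against an overall mean rank of 412.5, so P tends to have the lowest values of theta and B the highest.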


        • #5
          Presumably you should apply the same condition to all graphs:

          Code:
          stripplot rank_theta if Y==2014, over(type) vertical box center cumul 
          
          dotplot rank_theta if Y==2014, over(type) center median bar



          • #6
            I appreciate both insights.

            Following Marcos' suggestion, I ran Dunn's test after the Kruskal-Wallis test.

            As Nick suggested, the previous -stripplot- code gives a clearer picture of the outcomes.
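
            For readers who want to try this, one possible way to run Dunn's test is the user-written -dunntest- command from SSC. The thread does not say which implementation was used, so the command and the choice of Bonferroni adjustment below are assumptions.

            Code:
            * install once from SSC
            ssc install dunntest
            
            * pairwise comparisons of type for 2014, with Bonferroni adjustment for multiple comparisons
            dunntest theta if Y==2014, by(type) ma(bonferroni)

            The adjustment matters because three pairwise comparisons (P vs A, P vs B, A vs B) are being made at once.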



            • #7
              I am confused about the Kruskal-Wallis test. I have an ordinal outcome variable for food security with three levels (food secure, at risk of hunger, food insecure). My independent variable is sex, coded male (1) and female (2). I have already found the prevalence of food security in each sex. Now, how do I compare the prevalence among males only, and then among females only? And is a Kruskal-Wallis test appropriate in this case?
              Here is the code and output:
              kwallis Sex if AGE_VQ_P>=20 & Sex==1, by(Hunger_cat)

              Kruskal-Wallis equality-of-populations rank test

              +----------------------------------------+
              |        Hunger_cat |   Obs |   Rank Sum |
              |-------------------+-------+------------|
              |       Food secure | 1,840 |   3.61e+06 |
              | At risk of hunger |   960 |   1.88e+06 |
              | Experience hunger | 1,119 |   2.19e+06 |
              +----------------------------------------+

              chi-squared = -0.000 with 2 d.f.
              probability = 1.0000

              chi-squared with ties = . with 2 d.f.
              probability = 0.0001
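
              A note on why this output is degenerate: the command uses Sex as the outcome and restricts the sample to Sex==1, so every observation has the same value and all ranks are tied. That is why chi-squared is zero and the tie-corrected statistic is missing. If the aim is to compare the distribution of food security between the sexes, one possible specification, assuming Hunger_cat is a numeric variable coded in increasing order of severity, puts Hunger_cat as the outcome and Sex as the grouping variable:

              Code:
              * ordinal outcome first, grouping variable in by(); both sexes must be in the sample
              kwallis Hunger_cat if AGE_VQ_P>=20, by(Sex)
              
              * column percentages show the prevalence of each category within each sex
              tabulate Hunger_cat Sex if AGE_VQ_P>=20, column chi2

              With only two groups, -ranksum- tests the same hypothesis as -kwallis-.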



              • #8
                If interested in #7 please follow thread at https://www.statalist.org/forums/for...ing-prevalence
