  • How to compare mean values between groups?

    I have used -egen- to compute the mean value of each group. Next, I need to compare these means to pinpoint the 5 groups with the largest mean values. How can I do that?

    Many thanks in advance!

  • #2
    The purpose of -egen- is not to summarize variables, but rather to create new variables. (There are ways to use -egen- to do what you want, but no regular Stata user would want to use -egen- that way.)

    There are many standard ways to do what you want. Here are some:
    Code:
    tabstat YourVariable, by(YourGroup) statistic(mean)
    // or
    mean YourVariable, over(YourGroup)
    // or
    bysort YourGroup: summarize YourVariable

    You would benefit from looking at the commands available under the Summaries, Tables, and Tests item under the Statistics menu.


    • #3
      Great, thank you so much! But how do I pinpoint the 5 groups with the highest means?


      • #4
        Here is one technique:

        Code:
        * sandbox of group means 
        sysuse auto, clear
        gen MAKE = word(make, 1)
        egen mpg2 = mean(mpg), by(MAKE)
        tabdisp MAKE, c(mpg2)
        
        * get top 5 
        egen tag = tag(MAKE)
        gsort -tag -mpg2
        gen rank = _n in 1/5 
        l MAKE rank mpg2 in 1/5, noobs
        
          +----------------------+
          |   MAKE   rank   mpg2 |
          |----------------------|
          | Subaru      1     35 |
          |  Mazda      2     30 |
          |     VW      3   28.5 |
          |  Honda      4   26.5 |
          |  Plym.      5   26.2 |
          +----------------------+
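
        If the goal is to keep working with every observation in the top 5 groups, not just to list them, the rank can be spread from the tagged row to the rest of each group. This is only a sketch on the same auto data; the variable name top5 is made up for illustration, and ties between groups are broken arbitrarily by -gsort-.

        ```stata
        * rebuild the group means and the top-5 ranks as above
        sysuse auto, clear
        gen MAKE = word(make, 1)
        egen mpg2 = mean(mpg), by(MAKE)
        egen tag = tag(MAKE)
        gsort -tag -mpg2
        gen rank = _n in 1/5

        * missing values sort last, so rank[1] is the group's rank if it has one
        bysort MAKE (rank): replace rank = rank[1]

        * top5 (hypothetical name) flags every observation in a top-5 group
        gen byte top5 = !missing(rank)
        tabulate MAKE if top5
        ```

        After this, conditions such as "if top5" restrict later commands to the five groups with the largest means.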


        • #5
          My mistake: I neglected to consider that you might have so many groups as to make simply looking at the results of -tabstat- etc. impractical. In that case, yes, a regular user *might* well find -egen- of use, per Nick's posting. -collapse- is another possibility, if you don't mind having your original data set replaced:
          Code:
          sysuse auto
          collapse (mean) price, by(rep78)
          gsort -price
          list in 1/5
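
          If losing the original data is a concern, one option is to wrap the same steps in -preserve- and -restore-. A minimal sketch, assuming the same auto data as above:

          ```stata
          sysuse auto, clear

          * keep a temporary copy of the data in memory
          preserve

          collapse (mean) price, by(rep78)
          gsort -price
          list in 1/5

          * bring back the original, uncollapsed data
          restore
          ```

          The collapsed means exist only between -preserve- and -restore-, so anything needed later should be listed or saved before the -restore- line.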


          • #6
            Thank you very much! Both methods are fantastic! I love this forum.
