
  • Stata command to summarize data by "mode"

    Hi all,

    I need to create a table to summarize my data. This table should have a column for "mode" in addition to columns for other statistics (e.g., mean, median).

    For example, I used the command "tabstat hmembers18 yearschoolI, statistics( mean median )" to get the means and medians. However, I could not find how to get the modes.

    Please advise.

    Thanks

    Quynh

  • #2
    -tabstat- does not include a mode statistic. You can calculate the modes with the -egen- function mode(), but to use it you will have to decide what to do if a variable has more than one mode. (My guess is that the possible multiplicity of modes is an important reason that -tabstat- doesn't support it.)
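
    As a minimal sketch of that decision (the auto dataset and the new variable names are assumptions for illustration, not part of the original question), the documented options minmode, maxmode, and nummode() tell -egen, mode()- which value to return when several modes exist. With none of them specified, tied modes yield a warning and a missing result.

    ```stata
    sysuse auto, clear

    * A discrete variable: the most frequent repair record
    egen mode_rep = mode(rep78)

    * A near-continuous variable: ties among modes are likely,
    * so the tie-breaking rule is chosen explicitly
    * minmode: lowest of the tied modes
    egen mode_lo = mode(price), minmode
    * maxmode: highest of the tied modes
    egen mode_hi = mode(price), maxmode
    * nummode(2): second of the tied modes, counting upward
    egen mode_2 = mode(price), nummode(2)

    list mode_rep mode_lo mode_hi mode_2 in 1
    ```

    If the three price results differ, that in itself signals that the mode is not well defined for that variable.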


    • #3
      Modes can indeed be problematic. But various definitions and commands exist including modes (SJ) and hsmode (SSC).


      • #4
        Hi Mr Cox, I have a similar question. I am not sure if I am allowed to post on past questions. I am trying to create a descriptive table for my data. For that, I need to show the mode of individuals' Occupation by another variable in my table. Or should I just calculate what I need manually and create the table in Word?


        • #5
          You do not provide enough detail to get a useful answer. But generally, what the original poster asked for, and probably what you are asking for, can be done with something that I call a "List Table."

          Let's use the auto data as an example. For the groups defined by rep78, I want to show the mean price in the first column, the median price in the second, and the mode of price in the third:

          Code:
          . sysuse auto, clear
          (1978 Automobile Data)
          
          . egen meanprice = mean(price), by(rep)
          
          . egen medprice = median(price), by(rep)
          
          . egen modprice = mode(price), by(rep) missing maxmode
          
          . egen tag = tag(rep), missing
          
          . sort rep
          
          . list rep meanprice medprice modprice if tag, sepby(rep) noobs
          
            +----------------------------------------+
            | rep78   meanpr~e   medprice   modprice |
            |----------------------------------------|
            |     1     4564.5     4564.5       4934 |
            |----------------------------------------|
            |     2   5967.625       4638      14500 |
            |----------------------------------------|
            |     3   6429.233       4741      15906 |
            |----------------------------------------|
            |     4     6071.5     5751.5       9735 |
            |----------------------------------------|
            |     5       5913       5397      11995 |
            |----------------------------------------|
            |     .     6430.4       4453      12990 |
            +----------------------------------------+
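
          The same pattern may carry over to a categorical variable such as the Occupation variable asked about in #4. The following is only a sketch under assumptions: rep78 stands in for occupation and foreign for the grouping variable, and minmode is an arbitrary tie-breaking choice that should be checked against a cross-tabulation.

          ```stata
          sysuse auto, clear

          * Inspect the frequencies first, so that any tied modes are visible
          tabulate rep78 foreign

          * Most frequent category within each group; minmode picks the lowest on ties
          egen modrep = mode(rep78), by(foreign) minmode

          * Flag one observation per group and list one row per group
          egen tag = tag(foreign)
          list foreign modrep if tag, noobs
          ```

          For a string variable, one option is to convert it to a labelled numeric variable with -encode- before calling -egen, mode()-.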





          • #6
            Though off topic, I could not resist suggesting asdoc here for exporting the table to MS Word. Joro has a neat solution, which can easily be exported to MS Word: we just need to add asdoc to the beginning of the last line of Joro's code. See the following code.
            Code:
            ssc install asdoc
            asdoc list rep meanprice medprice modprice if tag, sepby(rep) noobs replace


            Regards
            --------------------------------------------------
            Attaullah Shah, PhD.
            Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
            FinTechProfessor.com
            https://asdocx.com
            Check out my asdoc program, which sends outputs to MS Word.
            For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.


            • #7
              Naturally, using the auto dataset here is just giving a reproducible example. But equally people can see the results and work out what is likely to make more sense for their real problem.

              The mode in this case for repair record 3 leapt out at me. What is going on? There are 30 distinct values, each of which occurs precisely once. So, on a naive reading there are 30 equally plausible candidates for mode.

              Back in #5 Joro Kolev used the maxmode option, which explains the result of 15906. In most cases where I am curious about a mode, the distribution is right skewed and the maxmode option gives a poor answer, not that other options help much. (Back in the day, meaning 1999 (net stb 50 dm70), when I first wrote mode(), I had in mind mostly discrete or categorical variables. It was StataCorp who added minmode, maxmode, and nummode() as options.)

              If we take the problem seriously, there are various defensible attitudes in my view.

              0. The case is hopeless.

              1. Fire up kernel density estimation, or some other smoothing method. Look for a (main) mode, always remembering that different kernel widths may imply different answers.

              2. Fit a named distribution which implies a mode, either as a parameter or as implied by other parameters.

              3. Use some other method such as given by hsmode (SSC).

              Code:
              . sysuse auto, clear
              (1978 Automobile Data)
              
              . tab price if rep78 == 3
              
                    Price |      Freq.     Percent        Cum.
              ------------+-----------------------------------
                    3,291 |          1        3.33        3.33
                    3,299 |          1        3.33        6.67
                    3,895 |          1        3.33       10.00
                    3,955 |          1        3.33       13.33
                    4,082 |          1        3.33       16.67
                    4,099 |          1        3.33       20.00
                    4,181 |          1        3.33       23.33
                    4,187 |          1        3.33       26.67
                    4,296 |          1        3.33       30.00
                    4,482 |          1        3.33       33.33
                    4,504 |          1        3.33       36.67
                    4,516 |          1        3.33       40.00
                    4,647 |          1        3.33       43.33
                    4,723 |          1        3.33       46.67
                    4,733 |          1        3.33       50.00
                    4,749 |          1        3.33       53.33
                    4,816 |          1        3.33       56.67
                    5,172 |          1        3.33       60.00
                    5,189 |          1        3.33       63.33
                    5,222 |          1        3.33       66.67
                    5,788 |          1        3.33       70.00
                    6,165 |          1        3.33       73.33
                    6,295 |          1        3.33       76.67
                   10,371 |          1        3.33       80.00
                   10,372 |          1        3.33       83.33
                   11,385 |          1        3.33       86.67
                   11,497 |          1        3.33       90.00
                   13,466 |          1        3.33       93.33
                   13,594 |          1        3.33       96.67
                   15,906 |          1        3.33      100.00
              ------------+-----------------------------------
                    Total |         30      100.00
              
              . hsmode price if rep78 == 3
              
              (n = 30)
                         mode
              ---------------
              price      4728
              
              . kdensity price if rep78 == 3, xline(4728)
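
              Attitude 1 can also be sketched numerically rather than read off a graph. This is an illustration only: gridx and dens are hypothetical variable names, the kernel and bandwidth are left at the -kdensity- defaults, and a different bandwidth may move the answer.

              ```stata
              sysuse auto, clear

              * Save the grid points and the estimated density instead of drawing the graph
              kdensity price if rep78 == 3, generate(gridx dens) nograph

              * The grid point with the highest estimated density approximates the main mode
              summarize dens, meanonly
              list gridx dens if dens == r(max), noobs
              ```

              Rerunning with several values of the bwidth() option is a quick check on how sensitive the location of the peak is to the amount of smoothing.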
