  • tab by order of frequency across multiple variables

    Hi there
    If I want to tab values by order of frequency for one variable, I simply do this:
    tab var1, sort

    But what if I want to tab across multiple variables var1 to var7?
    One way might be to restructure my data from wide to long, so that I have var1 to var7 listed as observations.
    Can anyone suggest a nice alternative that doesn't involve having to restructure my data?

    Any thoughts much appreciated!

    With thanks
    Tim

  • #2

    To clarify, I simply want a list of values by order of frequency, across var1-var7.



    • #3
      Well, I don't understand what "a list of values by order of frequency, across var1-var7" means. Can you show an example of what the results you want might look like?



      • #4

        Apologies.

        OK, so my dataset looks like this:

        var1 var2 var3
        1 5 3
        4 4 3
        3 7 4
        5 8 3
        6 2 3

        I want to know which values come up most frequently across all these variables, so I get a list of results like this:

        3 comes up 4 times
        4 comes up 3 times
        5 comes up 2 times
        etc

        I hope that clarifies!

        Thanks
        Tim



        • #5
          Look at tabm, which is part of the tab_chi package on SSC:

          Code:
          ssc inst tab_chi



          • #6

            ...except that, in my example, the value "3" actually comes up 5 times (evidently I can't count!)



            • #7
              A couple of ways to get this:


              1. Series of count commands
              Code:
              clear
              input float(var1 var2 var3)
              1 5 3
              4 4 3
              3 7 4
              5 8 3
              6 2 3
              end
              
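              * each count is the number of observations in which the value appears
              * at least once; a value repeated within one row is counted only once
              count if var1==1 | var2==1 | var3==1
              count if var1==2 | var2==2 | var3==2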
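
              To tally every occurrence of a value, rather than the rows in which it appears, one option is to loop over values and variables. This is only a minimal sketch, assuming the values are integers from 1 to 8 as in the example data; the local macro names v, x and n are illustrative.

              Code:
              * total occurrences of each value across var1-var3
              * (assumes integer values 1 to 8, as in the example data)
              forvalues v = 1/8 {
                  local n = 0
                  foreach x of varlist var1-var3 {
                      quietly count if `x' == `v'
                      local n = `n' + r(N)
                  }
                  display "`v' comes up `n' times"
              }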

              2. Temporarily stack your variables into one long variable and tabulate

              Code:
              clear
              input float(var1 var2 var3)
              1 5 3
              4 4 3
              3 7 4
              5 8 3
              6 2 3
              end
              
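              preserve
              * stack the three variables into one long variable;
              * clear confirms that the data in memory may be replaced
              stack var1 var2 var3, into(_onevar) clear
              list
              * tabulate the stacked values
              tab _onevar
              restore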

              Stata/MP 14.1 (64-bit x86-64)
              Revision 19 May 2016
              Win 8.1



              • #8
                Nick Cox -- tabm is a neat solution!



                • #9
                  Thank you all very much.

                  I went for a slightly modified version of Carole's second suggestion, as follows:

                  Code:
                  preserve
                  stack var1-var7, into(tempvar)
                  tab tempvar, sort
                  restore


                  I'd not used the "stack" command before, but it was a neat alternative to "reshape" on this occasion.

                  I experimented with tabm, but this command didn't seem to like the large number of different values in my data, nor was I able to get it to sort the values by frequency.

                  Thanks again
                  Tim
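
                  For reference, the stacked values can also be reduced to a dataset of frequencies, which is then easy to sort or export. This is only a minimal sketch, assuming the official contract and gsort commands; the variable names value and n are illustrative and not from the thread.

                  Code:
                  preserve
                  * stack var1-var7 into one long variable (names are illustrative)
                  stack var1-var7, into(value) clear
                  * one observation per distinct value, with its frequency in n
                  contract value, freq(n)
                  * most frequent values first, ties broken by value
                  gsort -n value
                  list value n, noobs
                  restore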



                  • #10
                    As tabm is an implementation of precisely the strategy you adopted, there should be no real problem.

                    tabm by default will put the distinct values as columns; in your case you really need that to be rows. That's why there's a transpose option.

                    Code:
                    tabm var1-var7, transpose

