  • Count combinations of variable values

    I have a set of 5 dummy variables. Call them v1,v2,v3,v4,v5. I would like to know how many times each combination of 5 values occurs. The desired output would look like this.

    v1 v2 v3 v4 v5 Frequency
    1 1 1 1 1 10000
    1 0 1 1 1 3000
    1 0 1 0 1 3000
    <snip>


    This says that there are 10,000 rows where all 5 variables are 1, 3,000 rows where v2=0 and the other four variables are 1, 3,000 rows where v2=v4=0 and v1=v3=v5=1, and so on.

    I can figure out how to write a little program that produces this output, but I'm hoping there's an existing command that does it already. There are simple commands to do this in SAS, SQL, and Unix, so I'm hoping there's also a simple way to do it in Stata.

    Thanks!
    Paul

  • #2
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(v1 v2 v3 v4 v5) int frequency
    1 1 1 1 1 10000
    1 0 1 1 1  3000
    1 0 1 0 1  3000
    end
    
    egen combo = group(v1 v2 v3 v4 v5), label
    tab combo [fweight = frequency]

    In the future, when showing data examples, please use the -dataex- command to do so, as I have in this reply. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it.

    -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.
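
    If the data are still at the row level, as the question describes, rather than already collapsed to one row per combination with a frequency variable, the same two lines should work without the weight. A minimal sketch, assuming simulated dummies (the seed, the 1,000 observations, and the 0.5 probability are made up purely for illustration):

    Code:
    * Hypothetical row-level data: one row per observation, no frequency variable
    clear
    set seed 12345
    set obs 1000
    forvalues j = 1/5 {
        generate byte v`j' = runiform() < 0.5
    }

    * Same idea as above, minus the weight: each row counts once
    egen combo = group(v1 v2 v3 v4 v5), label
    tabulate combo

    With five dummies there are at most 32 combinations, so the one-way table stays readable.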


    • #3
      See also groups from the Stata Journal.

      https://www.stata-journal.com/articl...article=st0496

      https://www.statalist.org/forums/for...updated-on-ssc

      Clyde's code gives the essence of an equivalent idea in two lines. groups adds bells and whistles.

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input byte(v1 v2 v3 v4 v5) int frequency
      1 1 1 1 1 10000
      1 0 1 1 1  3000
      1 0 1 0 1  3000
      end
      
      groups v* [fw=freq] 
      
        +------------------------------------------+
        | v1   v2   v3   v4   v5   Freq.   Percent |
        |------------------------------------------|
        |  1    0    1    0    1    3000     18.75 |
        |  1    0    1    1    1    3000     18.75 |
        |  1    1    1    1    1   10000     62.50 |
        +------------------------------------------+
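
      Not mentioned above, but official Stata's contract command may also serve here without installing anything: it replaces the data in memory with one row per combination plus a frequency variable. A minimal sketch on the same example data; the new variable names n and pct are arbitrary choices for illustration, and the weight should be dropped if the data are row-level:

      Code:
      clear
      input byte(v1 v2 v3 v4 v5) int frequency
      1 1 1 1 1 10000
      1 0 1 1 1  3000
      1 0 1 0 1  3000
      end

      * contract overwrites the data in memory, so wrap it in preserve/restore
      preserve
      contract v1 v2 v3 v4 v5 [fweight = frequency], freq(n) percent(pct)
      gsort -n
      list, noobs sep(0)
      restore

      Unlike tabulate or groups, this leaves the counts in a dataset, which can be convenient if they are needed for later steps; run it without preserve and restore to keep them.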
