  • tabulation with too many values

    I have 60 categories, and I want to tabulate a discrete variable taking 8 values across them. Both tab and tab2 respond with "too many values". It's important for me to identify categories in which some value of my variable has zero frequency. With the tabulation commands apparently failing because of the number of categories, any ideas about how to proceed?

  • #2
    Although tabulate has a limit on the number of values it will accept, -table- does not:

    Code:
    table category variable_with_8_values
    will run without complaint.


    • #3
      Clyde's advice is good: table is a much more versatile command. But 60 categories is within the reach of tabulate as well:

      Code:
      sysuse auto, clear
      generate x=_n
      expand 8
      
      * (_n-1) keeps y in 0..7, i.e. exactly 8 values; floor(_n/74) would add a 9th
      generate y = floor((_n-1)/74)
      tabulate x
      tabulate x y
      So, Charles, maybe check your data to see whether your expectation about the number of categories is correct. inspect and levelsof could be the tools for this task.

      Best, Sergiy Radyakin
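
A minimal sketch of the check suggested above, using inspect and levelsof. The auto dataset and rep78 are stand-ins chosen for illustration only; substitute your own category variable.

```stata
* Illustration only: rep78 in auto.dta stands in for the category variable
sysuse auto, clear

* Summary of the variable, including its number of unique values
inspect rep78

* List the distinct nonmissing values and store them in a local macro
levelsof rep78, local(levels)

* Count the distinct values collected by levelsof
local nlevels : word count `levels'
display "distinct nonmissing values: `nlevels'"
```

If the count is well above the 60 you expect, the category variable probably holds more distinct values than assumed, which could explain the "too many values" message.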


      • #4
        groups (Stata Journal) has a handle specifically for showing cross-combinations with zero frequency. See https://www.statalist.org/forums/for...updated-on-ssc for some details.


        • #5
          Thanks to all of you. I'll try these when I'm back on site with the data.


          • #6
            You might also try bigtab by Paul H. Bern, available from SSC.

            Code:
            . ssc describe bigtab
            
            ---------------------------------------------------------------------------------------------------------
            package bigtab from http://fmwww.bc.edu/repec/bocode/b
            ---------------------------------------------------------------------------------------------------------
            
            TITLE
                  'BIGTAB': module to produce frequency tables for "too many values"
            
            DESCRIPTION/AUTHOR(S)
                  
                  The bigtab command helps when you get a "too many values" error
                  from tabulate or want to save the results in a data file.  Users
                  may specify up to   three variables and have the option of
                  displaying and saving cumulative, row, and column   counts and
                  percentages.
                  
                  Distribution-Date: 20030609
                  
                  Author: Paul H. Bern, Princeton University
                  Support: email [email protected]
                  
            
            INSTALLATION FILES                               (type net install bigtab)
                  bigtab.ado
                  bigtab.hlp
            ---------------------------------------------------------------------------------------------------------
            (type ssc install bigtab to install)
            David Radwin
            Senior Researcher, California Competes
            californiacompetes.org
            Pronouns: He/Him


            • #7
              Another approach is to collapse the dataset basically to the dimensions of the table:

              Code:
              sysuse nlsw88, clear
              
              collapse (count)freq = idcode, by(wage)
              gsort -freq wage
              
              list
              Results in a dataset of frequencies:

              Code:
              . list
              
                   +-----------------+
                   |     wage   freq |
                   |-----------------|
                1. | 4.025765     48 |
                2. | 5.032206     36 |
                3. |  8.05153     27 |
                4. | 4.830918     25 |
                5. | 5.636071     24 |
                   |-----------------|
                6. | 3.220612     23 |
                7. | 3.344482     22 |
                8. | 4.180602     19 |
                9. | 10.06441     19 |
               10. | 4.227053     18 |
                   |-----------------|
               11. | 7.745568     18 |
               12. | 3.623188     17 |
              ....cut here..................
              959. | 29.72623      1 |
              960. | 30.19324      1 |
                   |-----------------|
              961. | 30.33817      1 |
              962. | 30.92161      1 |
              963. | 30.96618      1 |
              964. | 30.96741      1 |
              965. |  33.4984      1 |
                   |-----------------|
              966. | 39.23074      1 |
              967. | 40.74659      1 |
                   +-----------------+
              Having 48 persons out of 2,246 with the same wage, identical down to millionths of a dollar (and not a round number), in a real dataset is, imho, phenomenal. If anyone knows how this happened, please share.
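
For the original question of spotting empty cells, a related built-in route is contract with its zero option, which keeps combinations that never occur in the data. This is only a sketch on the auto dataset: rep78 and foreign stand in for the poster's two variables, and freq is a variable name chosen here.

```stata
* Illustration only: rep78 and foreign stand in for the category and
* the 8-valued variable in the original question
sysuse auto, clear

* Reduce the data to one observation per rep78 x foreign combination;
* zero keeps combinations that have no observations
contract rep78 foreign, zero freq(freq)

* Show only the empty cells
list if freq == 0, noobs
```

Because the result is an ordinary dataset, the number of categories is not limited in the way a displayed table is, and the zero-frequency rows can be listed, counted, or saved directly.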
