
  • cross tabulation with combinations of binary variables

    Dear Stata Users,

    I have 20 variables, all binary coded, and I would like to tabulate the frequencies of every possible pairwise combination of those variables.

    For illustration purposes, let's assume I only have 3 variables.

    I would like an output table that looks like this:
             Var1  Var2  Var3
    Var1       52    85     6
    Var2       85    63     5
    Var3        6     5    34

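    Each cell on the main diagonal holds the number of observations (rows in the data) in which that variable is equal to 1; for example, the Var1 diagonal cell counts the rows in which var1 is equal to 1. Each off-diagonal cell counts the rows in which both variables are equal to 1, so 85 means that there are 85 rows in the data in which var2 is equal to 1 and var1 is equal to 1.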

    Is there a mechanical way to accomplish this in Stata?
    Thanks!
    Ana

  • #2
    Ana, I have been facing the same issue as you. With my limited knowledge, it seems there is no direct command that produces this table. I am sharing my rather long loop, which at least works for my needs. I hope it is useful for you.

    Besides, I also hope for better advice from the more experienced members of the Forum.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(var1 var2 var3)
    0 1 1
    1 1 1
    1 0 1
    0 0 1
    1 1 0
    1 1 0
    1 0 0
    1 0 0
    0 0 0
    end
    
    * collect the stubs var1_ var2_ var3_ for -reshape- below
    local varlist

    forval i = 1/3 {
        local varlist `varlist' var`i'_
        forval j = 1/3 {
            * indicator that both variables are nonzero, then its running total
            gen var`i'_`j' = var`i' & var`j'
            replace var`i'_`j' = sum(var`i'_`j')
        }
    }

    * the last observation now holds the totals
    keep if _n == _N
    drop var1 var2 var3
    gen check = 1
    reshape long `varlist', i(check) j(varcode)
    gen varname = "var_" + string(varcode), before(var1_)
    drop varcode check
    Last edited by Romalpa Akzo; 26 Oct 2017, 02:36.


    • #3
      No loop is needed, as StataCorp supplies the answer; type
      Code:
      help tab2


      • #4
        Romalpa: This was very helpful and produced exactly what I was looking for.

        Rich: I believe tab2 cannot produce a single output table like the one outlined above.

        Thanks to all for the help!
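
        For readers comparing the approaches, here is a minimal sketch of what tab2 gives on the example data from #2: it lists one two-way table per pair of variables rather than a single matrix of counts. The count line at the end is an added illustration of a single joint count and is not part of tab2.

        ```stata
        * example data from #2
        clear
        input float(var1 var2 var3)
        0 1 1
        1 1 1
        1 0 1
        0 0 1
        1 1 0
        1 1 0
        1 0 0
        1 0 0
        0 0 0
        end

        * one separate two-way table per pair:
        * var1 by var2, var1 by var3, var2 by var3
        tab2 var1 var2 var3

        * the joint count for one pair, which is what one off-diagonal
        * cell of the requested table would hold
        count if var1 == 1 & var2 == 1
        ```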


        • #5
          Building on Romalpa's helpful example, here is how to do it by compiling counts in a matrix. I've also shown how to generalize the code somewhat.


          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input float(var1 var2 var3)
          0 1 1
          1 1 1
          1 0 1
          0 0 1
          1 1 0
          1 1 0
          1 0 0
          1 0 0
          0 0 0
          end
          
          matrix M = J(3,3,.)
          
          quietly forval i = 1/3 {
              forval j = 1/3 {
                  count if var`i' == 1 & var`j' == 1
                  matrix M[`i', `j'] = r(N)
              }
          }
          
          matrix li M
          
          * to generalize further start with a varlist in local varlist
          
          rename (var*) (frog toad newt)
          
          local varlist frog toad newt
          tokenize "`varlist'"
          local I : word count `varlist'
          
          matrix N = J(`I', `I', .)
          
          quietly forval i = 1/`I' {
              forval j = 1/`I' {
                  count if ``i'' == 1 & ``j'' == 1
                  matrix N[`i', `j'] = r(N)
              }
          }
          
          matrix rownames N = `varlist'
          matrix colnames N = `varlist'
          
          mat li N
          
          symmetric N[3,3]
                frog  toad  newt
          frog     6
          toad     3     4
          newt     2     2     4
          If you want the matrix to be new data, reach for svmat
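
          A minimal sketch of that step, assuming the counts matrix N from above (typed in by hand here so the example stands alone). svmat does not carry the row names over, so the variable rowname is a name made up for this illustration and is filled in by a loop.

          ```stata
          clear
          matrix N = (6, 3, 2 \ 3, 4, 2 \ 2, 2, 4)
          matrix rownames N = frog toad newt
          matrix colnames N = frog toad newt

          * names(col) uses the matrix column names as the new variable names
          svmat N, names(col)

          * copy the row names into a string variable (hypothetical name: rowname)
          gen str8 rowname = ""
          local rows : rownames N
          forval i = 1/3 {
              local r : word `i' of `rows'
              replace rowname = "`r'" in `i'
          }

          order rowname
          list, noobs
          ```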


          • #6
            Further generalisations to a program that counts equal pairs across a list of numeric variables. The value concerned is fixed and defaults to 1. As the matrix is symmetric we can reduce the number of counts.

            Code:
            program countequalpairs, rclass 
                version 8.2 
                syntax varlist(numeric) [if] [in] [, value(int 1) noHEADER *] 
                
                marksample touse 
                quietly count if `touse' 
                if r(N) == 0 error 2000 
                
                tokenize "`varlist'"
                local I : word count `varlist'
                tempname N 
                matrix `N' = J(`I', `I', .)
            
                quietly forval i = 1/`I' {
                    forval j = 1/`i' {
                        count if `touse' & ``i'' == `value' & ``j'' == `value' 
                        matrix `N'[`i', `j'] = r(N)
                        matrix `N'[`j', `i'] = r(N) 
                    }  
                }
            
                matrix rownames `N' = `varlist'
                matrix colnames `N' = `varlist'
                mat li `N', noheader `options' 
                return matrix counts = `N'
            end
            Example with the data from #2 read in:

            Code:
            . countequalpairs var1 var2 var3
            
                  var1  var2  var3
            var1     6
            var2     3     4
            var3     2     2     4
            
            . ret li
            
            matrices:
                         r(counts) :  3 x 3
            
            . mat counts = r(counts)
            
            . mat li counts
            
            symmetric counts[3,3]
                  var1  var2  var3
            var1     6
            var2     3     4
            var3     2     2     4
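
            A different route, not used in the posts above, is sketched here as an alternative: when every variable is coded 0/1, the count of rows in which two variables are both 1 equals the sum of their products, so the whole table is the cross-product matrix X'X. The sketch assumes the value of interest is 1 and that the variables hold only 0 and 1; matrix accum omits any observation with a missing value in the listed variables. The matrix name C is arbitrary.

            ```stata
            * example data from #2
            clear
            input float(var1 var2 var3)
            0 1 1
            1 1 1
            1 0 1
            0 0 1
            1 1 0
            1 1 0
            1 0 0
            1 0 0
            0 0 0
            end

            * X'X without the constant column: cell (i,j) is the sum of var_i * var_j
            matrix accum C = var1 var2 var3, noconstant

            matrix list C
            ```

            On these data the diagonal of C is 6, 4, 4 and the off-diagonal cells are 3, 2, 2, matching the counts shown above.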


            • #7
              This is great! Thanks Nick!
