Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Crosstabs automation

    I am in the process of fully automating a code process. i have 10 different variables and first i have to find the 4 that represents the highest number of cases. that has already been done, now i need to make a 4 way cross tab. the normal code would be something like table (var 1 var 2) (var3 var4) ()
    but i need it to work no matter what four variables remain of the 10.
    hope that makes sense otherwise feel free to ask any question.

  • #2
    table (var 1 var 2) (var3 var4) ()
    I do not know if this syntax is legal, but to answer your question you can use ds and tokenize.

    Code:
    qui ds
    tokenize `r(varlist)'
    table (`1'  `2') (`3' `4') ()

    Comment


    • #3
      i don't know where the error comes, but it just gives me the frequency.
      Code:
      qui ds
      tokeniz `r(var f4 - f14)'
      table (`1' `2') (`3' `4') ()

      Comment


      • #4
        Andrew Musau's suggestion was to be taken literally. After ds, the returned result r(varlist) is accessible. I think Andrew is imagining that you have your variable names in the desired order and want to pick the first four.

        Code:
        . sysuse auto, clear
        (1978 automobile data)
        
        . ds
        make          mpg           headroom      weight        turn          gear_ratio
        price         rep78         trunk         length        displacement  foreign
        
        . return list
        
        macros:
                    r(varlist) : "make price mpg rep78 headroom trunk weight length turn displacement gea.."
        
        . tokenize "`r(varlist)'"
        
        . di "`1' `2' `3' `4'"
        make price mpg rep78
        There is no meaning to r(var f4 - f14) even if you have variables named f4 - f14. For one reason, Stata names never include spaces. For another, easier to see, r(varlist) is the only stored result after ds. See help ds and scroll to the end.

        Comment


        • #5
          I am a bit unclear on what you want here, but interpret "number of cases" as number with non-missing values, as on the usual interpretation of "number of observations" all variables will score the same.

          This code is a bit ad hoc but presupposes that

          1. You have at least 10 observations, which seems likely.

          2. Your variable names are f1 to f10, which can be coded around, which I do later.

          Code:
          clear 
          
          * sandbox dataset
          
          set seed 314159265 
          set obs 10 
          
          quietly forval j = 1/10 {
              gen f`j' = cond(runiform() < (15 - `j')/15, ., runiformint(1, 5))
          }
          
          list 
          
          * count missings
          gen nmissing = .
          quietly forval j = 1/10 {
              count if missing(f`j')
              replace nmissing = r(N) in `j'
          }
          
          gen which = "f" + strofreal(_n) in 1/10 
          
          sort nmissing 
          
          forval j = 1/4 {
              local `j' = which[`j']
          }
          
          list 
          
          table (`1' `2') (`3' `4'), missing 
          
          list `1' `2' `3' `4'
          Main results:

          Code:
          . table (`1' `2') (`3' `4'), missing 
          
          ----------------------------------------------------------------------------------------
                      |                                      f7                                   
                      |      2           4           5               .                 Total      
                      |      f4          f4          f4              f4                  f4       
                      |  .   Total   .   Total   5   Total   4   5   .   Total   4   5   .   Total
          ------------+---------------------------------------------------------------------------
          f10         |                                                                           
            1         |                                                                           
              f8      |                                                                           
                .     |  1       1                                                       1       1
                Total |  1       1                                                       1       1
            2         |                                                                           
              f8      |                                                                           
                5     |              1       1                                           1       1
                Total |              1       1                                           1       1
            3         |                                                                           
              f8      |                                                                           
                4     |                                          1           1       1           1
                Total |                                          1           1       1           1
            4         |                                                                           
              f8      |                                                                           
                4     |                                              1       1           1       1
                .     |                          1       1   1               1   1   1           2
                Total |                          1       1   1       1       2   1   1   1       3
            5         |                                                                           
              f8      |                                                                           
                5     |  1       1                                                       1       1
                Total |  1       1                                                       1       1
            .         |                                                                           
              f8      |                                                                           
                2     |                                              1       1           1       1
                5     |                                      1               1   1               1
                .     |                                              1       1           1       1
                Total |                                      1       2       3   1       2       3
            Total     |                                                                           
              f8      |                                                                           
                2     |                                              1       1           1       1
                4     |                                          1   1       2       1   1       2
                5     |  1       1   1       1               1               1   1       2       3
                .     |  1       1               1       1   1       1       2   1   1   2       4
                Total |  2       2   1       1   1       1   2   1   3       6   2   2   6      10
          ----------------------------------------------------------------------------------------
          
          . 
          . list `1' `2' `3' `4'
          
               +--------------------+
               | f10   f8   f7   f4 |
               |--------------------|
            1. |   4    4    .    . |
            2. |   4    .    .    4 |
            3. |   .    2    .    . |
            4. |   1    .    2    . |
            5. |   .    5    .    4 |
               |--------------------|
            6. |   4    .    5    5 |
            7. |   2    5    4    . |
            8. |   .    .    .    . |
            9. |   5    5    2    . |
           10. |   3    4    .    5 |
               +--------------------+
          The main idea is to put the number of missings in a new variable -- not aligned with the rest of the data --- and sort on that and then read off the variable names required. As the whole point as I understand it is to cope with the presence of missing values, you may need the missing option to get a good result out of table. The output looks complicated to me, but you may be dealing with something simpler, such as a bundle of indicator variables. In either case I would probably recommend a different tabulation, but speculation is idle without more detail.

          There are more elegant ways to do this.

          But I go no further than a more general version that is not prescriptive of variable names.

          Code:
          clear 
          
          * sandbox dataset
          
          set seed 314159265 
          set obs 10 
          
          quietly forval j = 1/10 {
              gen f`j' = cond(runiform() < (15 - `j')/15, ., runiformint(1, 5))
          }
          
          * you start here with your own list of variable names 
          local myvars f1 f2 f3 f4 f5 f6 f7 f8 f9 f10
          
          tokenize "`myvars'"
          
          * count missings
          gen nmissing = .
          gen which = "" 
          
          quietly forval j = 1/10 {
              count if missing(``j'')
              replace which = "``j''" in `j'
              replace nmissing = r(N) in `j'
          }
          
          sort nmissing 
          
          forval j = 1/4 {
              local `j' = which[`j']
          }
          
          list 
          
          table (`1' `2') (`3' `4'), missing 
          
          list `1' `2' `3' `4'

          Comment


          • #6
            the purpose of the code is to make a matrix i guess, which shows the 16 different combinations of my variables (being present or not) it is running and working with Andrew Musaus code.
            can i drop the total values on the end of my table, it's adding some visual cluster?

            Comment


            • #7
              Surely you can and should drop the extra variables -- once they are no longer useful -- to avoid clutter.

              The hint that your variables are really indicators -- else why would there be 16 possible combinations of 4 variables? -- suggests an alternative tabulation using groups from the Stata Journal.

              Here is a different example:

              Code:
              clear 
              
              * sandbox dataset
              
              set seed 314159265 
              set obs 10 
              
              quietly forval j = 1/10 {
                  gen f`j' = cond(runiform() < (15 - `j')/15, ., runiform() > `j'/20)
              }
              
              * you start here with your own list of variable names 
              local myvars f1 f2 f3 f4 f5 f6 f7 f8 f9 f10
              
              tokenize "`myvars'"
              
              * count missings
              gen nmissing = .
              gen which = "" 
              
              quietly forval j = 1/10 {
                  count if missing(``j'')
                  replace which = "``j''" in `j'
                  replace nmissing = r(N) in `j'
              }
              
              sort nmissing 
              
              forval j = 1/4 {
                  local `j' = which[`j']
              }
              
              list 
              
              drop nmissing which 
              
              table (`1' `2') (`3' `4'), missing 
              
              list `1' `2' `3' `4'
              
              groups `1' `2' `3' `4', missing order(high)

              Code:
               
              
              . groups `1' `2' `3' `4', missing order(high)
              
                +--------------------------------------+
                | f10   f9   f6   f8   Freq.   Percent |
                |--------------------------------------|
                |   1    .    1    1       2     20.00 |
                |   0    0    1    0       1     10.00 |
                |   0    0    1    1       1     10.00 |
                |   0    0    1    .       1     10.00 |
                |   0    1    0    .       1     10.00 |
                |--------------------------------------|
                |   1    0    .    0       1     10.00 |
                |   .    1    .    0       1     10.00 |
                |   .    1    .    .       1     10.00 |
                |   .    .    0    .       1     10.00 |
                +--------------------------------------+

              Comment

              Working...
              X