Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Using collect to combine row percentages for tables assembled with foreach

    I've recently started using collect and have found multiple uses. I can't get this problem solved.

    I have a set of likert scale variables with the same 1 to 5 scales that I want to use as rows. I want to stack them one on top of the other and show row percentages for the entire data and for subsets of the data (cars that are domestic, heavier or longer) as in example table below. Column headers would thus be the five levels of the scales. Both variable names (rows) and variable values (columns) are labeled. It would be great to be able to add set headers (i.e. "Heavy cars") to variables with an if statement. See code example below.

    I've been able to setup a variety of tables. Thee is something I don't quite grasp with the -across- command.

    Somehow my -append- does not work; I know I need to remove the variable title for my scales from column headers, and bring it down as a row name (my code currently shows -percent- as row header or -domestic/foreign- depending on the tables.
    Thanks.
    row percentages 1 2 3 4 5
    rep78 2.9 11.6 43.5 26.1 15.9
    rep78b 2.9 11.6 43.5 26.1 15.9
    rep78c 2.9 11.6 43.5 26.1 15.9
    Domestic cars
    rep78 2.9 11.6 43.5 26.1 15.9
    rep78b 2.9 11.6 43.5 26.1 15.9
    rep78c 2.9 11.6 43.5 26.1 15.9
    Heavy cars
    rep78 2.9 11.6 43.5 26.1 15.9
    rep78b 2.9 11.6 43.5 26.1 15.9
    rep78c 2.9 11.6 43.5 26.1 15.9
    Long cars
    rep78 2.9 11.6 43.5 26.1 15.9
    rep78b 2.9 11.6 43.5 26.1 15.9
    rep78c 2.9 11.6 43.5 26.1 15.9
    Code:
    sysuse auto
    
    gen rep78b= rep78 
    gen rep78c= rep78
     
    collect clear
    
    local varlist rep78 rep78b rep78c
    
    foreach var of local varlist {
    table () (`var') , statistic(percent, across(`var')) ///
    totals(`var') nformat(%5.1f percent) name(oldcar)  append
         }
    
    table (foreign) (rep78), statistic(percent, across(rep78)) name(oldcar)  append
    table () (rep78) if price >6500, statistic(percent, across(rep78)) name(oldcar)  append
    table () (rep78) if length >200, statistic(percent, across(rep78)) name(oldcar)  append
    
    collect style header rep78 rep78c rep78c, title(hide)

  • #2
    This might be of help:

    Code:
    collect clear
    sysuse auto, clear
    clonevar rep78b = rep78
    clonevar rep78c = rep78
    
    dtable, factor(rep78 rep78b rep78c, stat(fvperc)) sformat("%s") nosample name(full)
    collect addtags group[full]
    dtable if foreign == 0, factor(rep78 rep78b rep78c, stat(fvperc)) sformat("%s") nosample name(domestic)
    collect addtags group[domestic]
    dtable if price > 4000, factor(rep78 rep78b rep78c, stat(fvperc)) sformat("%s") nosample name(expensive)
    collect addtags group[expensive]
    dtable if length > 100, factor(rep78 rep78b rep78c, stat(fvperc)) sformat("%s") nosample name(large)
    collect addtags group[large]
    
    collect combine all = full domestic expensive large
    
    collect addtags variable[rep78], fortags(var[1.rep78 2.rep78 3.rep78 4.rep78 5.rep78])
    collect addtags variable[rep78b], fortags(var[1.rep78b 2.rep78b 3.rep78b 4.rep78b 5.rep78b])
    collect addtags variable[rep78c], fortags(var[1.rep78c 2.rep78c 3.rep78c 4.rep78c 5.rep78c])
    collect remap var[1.rep78 1.rep78b 1.rep78c] = replevels[1 1 1]
    collect remap var[2.rep78 2.rep78b 2.rep78c] = replevels[2 2 2]
    collect remap var[3.rep78 3.rep78b 3.rep78c] = replevels[3 3 3]
    collect remap var[4.rep78 4.rep78b 4.rep78c] = replevels[4 4 4]
    collect remap var[5.rep78 5.rep78b 5.rep78c] = replevels[5 5 5]
    collect label dim replevels "Repair Levels"
    
    collect label levels group ///
        full "All cars" ///
        domestic "Domestic cars" ///
        expensive "Expensive cars" ///
        large "Large cars"
    
    collect style header replevels, title(label)
    
    collect layout (group[full domestic expensive large]#variable) (replevels)
    which produces:

    Code:
    . collect preview
    
    ------------------------------------------------
                             Repair Levels          
                     1      2      3      4      5  
    ------------------------------------------------
    All cars                                        
      rep78        2.899 11.594 43.478 26.087 15.942
      rep78b       2.899 11.594 43.478 26.087 15.942
      rep78c       2.899 11.594 43.478 26.087 15.942
    Domestic cars                                  
      rep78        4.167 16.667 56.250 18.750  4.167
      rep78b       4.167 16.667 56.250 18.750  4.167
      rep78c       4.167 16.667 56.250 18.750  4.167
    Expensive cars                                  
      rep78        3.390 11.864 44.068 27.119 13.559
      rep78b       3.390 11.864 44.068 27.119 13.559
      rep78c       3.390 11.864 44.068 27.119 13.559
    Large cars                                      
      rep78        2.899 11.594 43.478 26.087 15.942
      rep78b       2.899 11.594 43.478 26.087 15.942
      rep78c       2.899 11.594 43.478 26.087 15.942
    ------------------------------------------------
    Last edited by Hemanshu Kumar; 07 Feb 2025, 11:16.

    Comment


    • #3
      Hemanshu shows how to use dtable (new in Stata 18) to build the requested table.

      If you are using Stata 17, you can use the table command to accomplish the task.

      Note that, instead of using append, Hemanshu and I are using collect combine, building similar but separate collections that we then combine, style, and arrange to suite our needs. In fact, when you use collect combine, the system adds a new collection dimension that you can label and use in the layout to identify your "groups".

      Code:
      set seed 17
      sysuse auto
      
      forvalues i = 1/3 {
          generate byte x`i' = runiformint(1,5)
      }
      
      unab varlist : rep x?
      
      * full sample of cars
      table, statistic(fvpercent `varlist') ///
          nformat(%5.1f fvpercent) ///
          name(full)
      
      * look at the dimensions defined by our call to -table-, they will be
      * common for all subsequent calls that simply add an -if- condition
      collect dims
      
      table if foreign==1, statistic(fvpercent `varlist') ///
          nformat(%5.1f fvpercent) ///
          name(foreign)
      
      table if foreign==0, statistic(fvpercent `varlist') ///
          nformat(%5.1f fvpercent) ///
          name(domestic)
      
      table if price > 6500, statistic(fvpercent `varlist') ///
          nformat(%5.1f fvpercent) ///
          name(expensive)
      
      table if length > 200, statistic(fvpercent `varlist') ///
          nformat(%5.1f fvpercent) ///
          name(long)
      
      collect combine all = full foreign domestic expensive long
      
      * new dimension named -collection- identifies which collection an item
      * came from
      collect dims
      
      * add nice labels for the -collection- levels
      collect label levels collection ///
          full "All cars" ///
          foreign "Foreign cars" ///
          domestic "Domestic cars" ///
          expensive "Expensive cars" ///
          long "Long cars"
      
      * the requested layout is to show the factor variable names in the row
      * header and their levels in the column header
      
      * row headers -- note that dimensions -var- and -colname- contain the
      * same information, we will use the -colname- tag information to strip
      * the factor variable levels and leave the factor variable names in
      * -var-; -var- will be used in the row specification of the layout
      
      * column headers -- note that each factor variable is also present as a
      * dimension in each collection, so we will remap (rename) these
      * dimensions to a common name (say -category-) that will be used in the
      * column specification of the layout
      
      foreach var of local varlist {
          * strip the factor variable level from the -var- tag elements
          * for the row specification of the layout
          collect addtags var[`var'], fortags(colname[i.`var']) replace
      
          * remap the factor variables to a command name for the column
          * specification of the layout
          collect remap `var' = categories
      }
      
      * hide all dimension titles
      collect style header, title(hide)
      
      * arrange the items
      collect layout (collection#var) (categories) (result)
      
      * if you want to see variable names instead of any variable labels in the
      * row headers, use
      collect style header var, level(value)
      collect preview
      
      * if you want to remove the vertical bar
      collect style cell border_block[corner row-header], border(right, pattern(none))
      collect preview
      Here is the final table.
      Code:
      ------------------------------------------------
                         1      2      3      4      5
      ------------------------------------------------
      All cars                                        
        rep78          2.9   11.6   43.5   26.1   15.9
        x1            16.2   10.8   27.0   29.7   16.2
        x2            18.9   16.2   28.4   20.3   16.2
        x3            20.3   21.6   14.9   17.6   25.7
      Foreign cars                                    
        rep78                       14.3   42.9   42.9
        x1            18.2   22.7   40.9   13.6    4.5
        x2            18.2    9.1   31.8   18.2   22.7
        x3            18.2   18.2   13.6   22.7   27.3
      Domestic cars                                  
        rep78          4.2   16.7   56.2   18.8    4.2
        x1            15.4    5.8   21.2   36.5   21.2
        x2            19.2   19.2   26.9   21.2   13.5
        x3            21.2   23.1   15.4   15.4   25.0
      Expensive cars                                  
        rep78                 6.2   43.8   37.5   12.5
        x1            17.6   11.8   35.3   23.5   11.8
        x2            11.8   17.6   47.1   17.6    5.9
        x3            17.6   23.5    5.9   35.3   17.6
      Long cars                                      
        rep78                22.7   45.5   31.8      
        x1             8.7    4.3   17.4   39.1   30.4
        x2            17.4   17.4   30.4    8.7   26.1
        x3            21.7   21.7   17.4   21.7   17.4
      ------------------------------------------------
      Last edited by Jeff Pitblado (StataCorp); 07 Feb 2025, 13:47.

      Comment


      • #4
        This is fantastic! Thank you Jeff and Hemanshu.
        And yes, I'm using Stata17.

        And now I can just add labels to the -remap-ed -categories- and show them as headers.
        I need to read more on -remap- and -addtags-

        Code:
        collect label levels categories ///
            1    "Crappy" ///
            2    "" ///
            3    "" ///
            4    "" ///
            5    "Tip top"
        collect style header categories, level(label)

        Comment

        Working...
        X