Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Issue with collect and table

    Hi everyone,

    I am trying to create a summary statistics table by gender for 2 different variables (in rows) where I would like to have the following columns - mean, standard deviation, number of non-missing values, number of missing values and percentage of missing values. The first three are easy using the statistic but for the last two I have created a new variable using the missing command.

    This is what I run:

    Code:
    webuse nhanes2, clear
    
    gen lead_missing = missing(lead)
    gen zinc_missing = missing(zinc)
    
    
    collect clear 
    table (var) (sex), ///
            statistic(mean lead ) ///
            statistic(sd lead) ///
            statistic(count lead) ///
            statistic(fvfrequency lead_missing) ///
            statistic(fvpercent lead_missing) ///
            nototals ///
            nformat(%6.2f mean) ///
            nformat(%6.2f sd) ///
            sformat("%s%%" fvpercent)
    
    table (var) (sex), ///
            statistic(mean zinc ) ///
            statistic(sd zinc) ///
            statistic(count zinc) ///
            statistic(fvfrequency zinc_missing) ///
            statistic(fvpercent zinc_missing) ///
            nototals ///
            nformat(%6.2f mean) ///
            nformat(%6.2f sd) ///
            sformat("%s%%" fvpercent) append
    
    collect preview
    collect layout (cmdset) (sex[1]#result sex[1]#result#var[ 1.lead_missing 1.zinc_missing] sex[0]#result sex[0]#result#var[ 1.lead_missing 1.zinc_missing])
    This is what it looks:

    Click image for larger version

Name:	Screenshot 2025-04-20 220211.png
Views:	1
Size:	13.5 KB
ID:	1776258



    The problem is that I would like Factor-variable frequency to be in one column (540 below 2538) and Factor-variable percent to be in one column. How can I do that?

    Thank you in advance!!!
    Attached Files

  • #2
    Try, after the table commands:
    Code:
    collect recode var 1.lead_missing = 1.missing 1.zinc_missing = 1.missing        
    collect label levels result sd "Std. Dev." fvfrequency "Missing (N)" fvpercent "Missing (%)", modify
    collect label levels cmdset 1 "Lead" 2 "Zinc", modify
    collect style header cmdset, title(hide)
    collect style header var, level(hide) 
    collect layout (cmdset) (sex[1]#result[mean sd] sex[1]#result#var[1.missing] sex[2]#result[mean sd] sex[2]#result#var[1.missing])
    which produces:
    Code:
    . collect preview
    
    -----------------------------------------------------------------------------------------------------
         |                                               Sex                                             
         |                       Male                                           Female                   
         |   Mean   Std. Dev.   Missing (N)   Missing (%)    Mean   Std. Dev.   Missing (N)   Missing (%)
    -----+-----------------------------------------------------------------------------------------------
    Lead |  16.87        6.58         2,538        51.64%   11.96        4.65         2,865        52.70%
    Zinc |  89.53       15.44           540        10.99%   83.77       12.95           609        11.20%
    -----------------------------------------------------------------------------------------------------

    Comment


    • #3
      Note that with Hemanshu Kumar's collect recode you can simplify the layout to
      Code:
      collect layout (cmdset) (sex#result)
      and get all the results of interest.

      The following adds a slightly shorter label for result[count]
      Code:
      collect label levels result count "Not missing (N)", modify
      to get the following table.
      Code:
      . collect layout (cmdset) (sex#result)
      
      Collection: Table
            Rows: cmdset
         Columns: sex#result
         Table 1: 2 x 10
      
      -----------------------------------------------------------------------------------------------------------------------------------------
           |                                                                 Sex                                                               
           |                                Male                                                             Female                            
           |   Mean   Std. Dev.   Not missing (N)   Missing (N)   Missing (%)    Mean   Std. Dev.   Not missing (N)   Missing (N)   Missing (%)
      -----+-----------------------------------------------------------------------------------------------------------------------------------
      Lead |  16.87        6.58             2,377         2,377        48.36%   11.96        4.65             2,571         2,571        47.30%
      Zinc |  89.53       15.44             4,375         4,375        89.01%   83.77       12.95             4,827         4,827        88.80%
      -----------------------------------------------------------------------------------------------------------------------------------------

      Comment


      • #4
        Thank you Hemanshu Kumar and Jeff Pitblado (StataCorp)! It worked perfectly!

        Comment

        Working...
        X