  • using collect to make a table after xtsum

    Dear Stata Forum,

    I would like to use the "collect" command to create a table of results generated by xtsum. Unfortunately, after xtsum, the collection does not have the dimension called "colname" that usually contains the list of variable names. Furthermore, only the statistics associated with the last variable are saved in the collection.

    Is there some way to make the collect command save the variable names and all the results that show up in the xtsum output? Perhaps I need to run the xtsum command in a loop and then assemble results from multiple collections? Any tips would be appreciated.

    My failed attempt to create the table is below.

    Thanks,

    Jeremy


    Code:
    sysuse bplong.dta
    
    *identify "patient" as the ID variable and "when" as the time variable
    xtset patient when
    
    *get results from xtsum and place them in a collection
    collect: xtsum bp sex
    
    *list the dimensions captured by the collect prefix
    *Problem: there is no dimension called "colname" which typically contains the list of variables
    collect dims 
    
    *the collection only appears to hold the results from the last variable: sex
    collect layout (colname) (result[mean sd sd_b sd_w])

  • #2
    xtsum only stores the results from the last variable, so you will
    have to loop over the variables and add a custom colname tag for
    each call.

    Here is an example based on your original code:
    Code:
    sysuse bplong.dta
    
    *identify "patient" as the ID variable and "when" as the time variable
    xtset patient when
    
    *get results from xtsum and place them in a collection;
    *by looping over the variables
    unab list : bp sex
    foreach var of local list {
            collect, tags(colname[`var']) : xtsum `var'
    }
    
    *list the dimensions captured by the collect prefix
    *notice that dimension called "colname" now exists
    collect dims
    
    *the collection now holds the results for every variable in the loop
    collect layout (colname) (result[mean sd sd_b sd_w])
    Here is the resulting table:
    Code:
    -----------------------------------------------------------------------------
                   |     Mean Overall std. dev Between std. dev. Within std. dev.
    ---------------+-------------------------------------------------------------
    Blood pressure | 153.9042          13.0837          9.773978         8.720797
    Sex            |       .5         .5010449          .5020964                0
    -----------------------------------------------------------------------------
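
A quick way to confirm that xtsum keeps only the last variable's results is to inspect r() after a call with several variables. The sketch below uses only the dataset and variables from this thread and assumes a Stata version that has the collect commands. Based on the Sex row of the table above, r(mean) should come back as the mean of sex, not of bp.

```stata
* load the example data and declare the panel structure
sysuse bplong.dta, clear
xtset patient when

* summarize two variables in a single call
xtsum bp sex

* only one set of r() results is left behind: the one for sex
return list
display r(mean)
```

Because collect picks up whatever is in r(), a single xtsum call with several variables can only ever contribute the last variable to the collection, which is why the loop calls xtsum once per variable.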



    • #3
      Fantastic! Thank you Jeff.

      Can you also tell me how to collect separate estimates for different groups, perhaps men and women?
      Ideally, I would like a table like the one you produced, but columns 1-4 would contain the estimates for men, and columns 5-8 would contain the numbers for women. The rows would still contain estimates from different variables.

      Jeremy



      • #4
        That would yield a wide table, but here is one way to do it building on the above code:
        Code:
        sysuse bplong.dta
        
        *identify "patient" as the ID variable and "when" as the time variable
        xtset patient when
        
        *get results from xtsum and place them in a collection;
        *first loop over the levels of a group variable,
        *then loop over the variables
        levelsof sex, local(levels)
        unab list : bp agegrp
        foreach lev of local levels {
            foreach var of local list {
                    collect, tags(sex[`lev'] colname[`var']) ///
                        : xtsum `var' if sex == `lev'
            }
        }
        
        *list the dimensions captured by the collect prefix
        *notice that the dimensions "sex" and "colname" now exist
        collect dims
        
        *shorten some of the labels
        collect label levels result sd "SD" sd_w "Within SD" sd_b "Between SD", modify
        
        *lay out variables in rows and the statistics for each sex in columns
        collect layout (colname) (sex#result[mean sd sd_b sd_w])
        Here is the resulting table:
        Code:
        ----------------------------------------------------------------------------------------------
                       |     Male     Male       Male      Male   Female   Female     Female    Female
                       |     Mean       SD Between SD Within SD     Mean       SD Between SD Within SD
        ---------------+------------------------------------------------------------------------------
        Blood pressure | 157.3917 13.54004    9.62962  9.559412 150.4167 11.65944   8.672559  7.833348
        Age group      |        2 .8199201    .823387         0        2 .8199201    .823387         0
        ----------------------------------------------------------------------------------------------
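
To check what the custom tags actually added to the collection, the dimension levels can be listed after the loop. This is a minimal sketch that repeats the loop above and then inspects the two tagged dimensions. The table shows "Male" and "Female" rather than 0 and 1, which suggests the sex dimension picks up the variable's value labels, but treat that as an observation from this output rather than a documented guarantee.

```stata
* rebuild the collection from scratch
sysuse bplong.dta, clear
xtset patient when
collect clear

levelsof sex, local(levels)
unab list : bp agegrp
foreach lev of local levels {
    foreach var of local list {
        collect, tags(sex[`lev'] colname[`var']) ///
            : xtsum `var' if sex == `lev'
    }
}

* list the levels stored in each tagged dimension
collect levelsof sex
collect levelsof colname

* show the labels attached to the levels of the sex dimension
collect label list sex, all
```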



        • #5
          Thank you Jeff!

          For others who might be experimenting with these things, running the two commands below before making the table can eliminate unnecessary repetition of the headings and alter the number of digits displayed.

          collect style column, dups(center)
          collect style cell result[mean sd sd_b sd_w], nformat(%8.2f)
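
For context, here is a sketch of where these two style commands could sit relative to the rest of the code from #4. Everything is taken from the posts above, except for the added "clear" option and "collect clear", which are there only so the example can be rerun from a fresh state.

```stata
sysuse bplong.dta, clear
xtset patient when
collect clear

* collect xtsum results by sex and by variable
levelsof sex, local(levels)
unab list : bp agegrp
foreach lev of local levels {
    foreach var of local list {
        collect, tags(sex[`lev'] colname[`var']) ///
            : xtsum `var' if sex == `lev'
    }
}

* shorten some of the labels
collect label levels result sd "SD" sd_w "Within SD" sd_b "Between SD", modify

* center repeated column headings and show two decimal places
collect style column, dups(center)
collect style cell result[mean sd sd_b sd_w], nformat(%8.2f)

* wide table: variables in rows, sex by statistic in columns
collect layout (colname) (sex#result[mean sd sd_b sd_w])
```

The style commands change how the collection is displayed, not what it holds, so the layout can be switched afterwards without rerunning the loop.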

          Also, making the table long rather than wide is easy:
          collect layout (sex#colname) (result[mean sd sd_b sd_w])

          Jeremy



          • #6
            For asdocx users, the results of the xtsum command can be exported by adding 'asdocx' as a prefix to the xtsum command.
            Code:
              
             sysuse bplong.dta
             *xtsum needs the panel structure declared first, as in the posts above
             xtset patient when
             asdocx xtsum bp agegrp, replace
                                                     Table: Results
            
            ---+------------------------------------------------------------------------------------------------------
               | Variable                      Mean     Std. Dev.        Min        Max  Observations          
            ---+------------------------------------------------------------------------------------------------------
               |bp              overall    153.904        13.084        125        185          N =        240
               |                between                    9.774      136.5        178          n =        120
               |                 within                    8.721    131.404    176.404      T-bar =          2
               |agegrp          overall          2         0.818          1          3          N =        240
               |                between                     0.82          1          3          n =        120
               |                 within                        0          2          2      T-bar =          2
            -----------------------------------------------------------------------------------------------------------
            Last edited by Attaullah Shah; 09 Feb 2023, 07:49.
            Regards
            --------------------------------------------------
            Attaullah Shah, PhD.
            Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
            FinTechProfessor.com
            https://asdocx.com
            Check out my asdoc program, which sends outputs to MS Word.
            For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

