optimizing the dtable

Sa Fe

Join Date: Jun 2020
Posts: 93

optimizing the dtable

05 Sep 2024, 13:26

I got the following code for generating a table of descriptive statistics. It has some downfalls that I want to address.

Code:

....
collect composite define blank = fvpercent, trim // this is just for adding an empty column between main columns.
...

foreach x of local fvars {
    collect: svy: proportion `x', percent
}
collect composite define CI_tot = _r_lb _r_ub, trim delimiter("-")
collect style cell result[CI_tot], sformat("(%s)") nformat(%2.1fc)
collect style cell result[fvpercent], nformat(%2.1fc) sformat("%s")

collect composite define col5 = fvpercent CI_tot
collect composite define col6 = fvrawfrequency
collect composite define col7 = fvfrequency

collect style autolevels result col1 col2 col3 col5 col6 col7, clear
collect label levels result ///
    col1 "Weighted % (95% CI)" ///
    col2 "Unweighted N" ///
    col3 "Weighted N" ///
    col5 "Weighted % (95% CI)" ///
    col6 "Unweighted N" ///
    col7 "Weighted N" ///
    , modify
collect style header result [col1 col2 col3 blank col5 col6 col7 ], title(hide) level(label)
collect layout (colname) (result[col5 col6 col7 blank] school_level#result[col1 col2 col3 blank] )

collect style cell border_block, border(right, pattern(nil)) // remove vertical line
collect style cell cell_type[column-header], font(, bold) // bold all the column headers
collect style row stack, spacer nobinder
collect preview

This produces the following table.

Click image for larger version

Name: Screenshot 2024-09-05 at 3.20.47 PM.png
Views: 1
Size: 43.8 KB
ID: 1763141

What I actually looking to generate is the following table:

Click image for larger version

Name: Screenshot 2024-09-05 at 3.24.19 PM.png
Views: 1
Size: 31.6 KB
ID: 1763142

My question is how should I revise my code to be able to generate the latter table?

Tags: collect, descriptive stats, dtable, putdocx, summary of stats

Sa Fe

Join Date: Jun 2020

Posts: 93
#2

06 Sep 2024, 07:42

Hi statalist,

Any idea on this? I added some annotation for a better clarity on the differences between the current code's output and what I'm actually looking to produce. Your help would be greatly appreciated.

Last edited by Sa Fe; 06 Sep 2024, 08:07.
Comment
Sa Fe

Join Date: Jun 2020

Posts: 93
#3

08 Sep 2024, 21:48

Hi Jeff Pitblado (StataCorp)! Do you have any clue or potential solution for how I can make the changes happen in my dtable? Any insights would be appreciated. Thanks.
Comment

Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014
Posts: 700

09 Sep 2024, 16:24

Without a working example with data, it is very difficult to ensure any advice we give works.

Here is what I came up with, starting with the auto data and adding some indicator variables for prod1 and prod2.

Code:

clear all

set seed 18
sysuse auto
gen prod1 = runiformint(0,1)
label var prod1 "Prod 1"
gen prod2 = runiformint(0,1)
label var prod2 "Prod 2"
label define noyes 0 "no" 1 "yes"
label values prod1 prod2 noyes
svyset _n [pw=weight]

dtable 1.prod1 1.prod2, ///
    by(foreign) ///
    svy ///
    nosample ///
    factor(, statistics(fvfrequency fvrawfrequency fvpercent))

collect style header foreign, title(hide) level(label)
collect style header prod1 prod2, title(label) level(label)

unab fvars : prod?

// see the levels of dimension foreign
collect levels foreign
// .m is for the Overall/Total sample
local bylevels .m 0 1
// change position and label of .m
collect style autolevels foreign .m 0 1, clear
collect label levels foreign .m "Overall", modify

// collect CI for percents, overall and subpops
foreach by of local bylevels {
    if "`by'" != ".m" {
        local subpop subpop(if foreign==`by')
    }
    foreach x of local fvars {
        collect, tag(foreign[`by']): ///
            svy, `subpop': ///
                proportion 1.`x', percent
    }
}

// use unicode U+2800 braille pattern blank to create a blank space result
// where -collect- will respect the width and number of spaces -- here I'm
// using the character 3 times
collect get blank = "⠀⠀⠀", tags(colname[1.prod1])
collect style header result[blank], level(hide) 

collect label levels prod1 1 "(yes)", modify
collect label levels prod2 1 "(yes)", modify

collect composite define CI_tot = _r_lb _r_ub, trim delimiter("-")
collect style cell result[CI_tot], sformat("(%s)") nformat(%5.1f)
collect style cell result[fvpercent], nformat(%5.1f) sformat("%s")

collect composite define col1 = fvpercent CI_tot
collect composite define col2 = fvrawfrequency
collect composite define col3 = fvfrequency

collect label levels result ///
    col1 "Weighted % (95% CI)" ///
    col2 "Unweighted N" ///
    col3 "Weighted N" ///
    , modify
collect style header result[col1 col2 col3], title(hide) level(label)

collect style row stack, binder(" ") // regular ASCII space

collect style autolevels result col1 col2 col3, clear

collect style cell cell_type[column-header]#foreign, border(bottom)

collect layout (colname) ///
    (foreign[.m]#result result[blank] ///
     foreign[0]#result result[blank] ///
     foreign[1]#result)

// bold all the column headers
collect style cell cell_type[column-header], font(, bold)

// add hidden level to -colname- so we can bold its title
collect get blank = "⠀", tags(colname[_hide])
collect levels colname
collect style autolevels colname _hide `s(levels)', clear
collect label dim colname "Products", modify
collect style header colname, title(label)
collect style cell cell_type[row-header]#colname[_hide], font(, bold)

collect preview

I exported to HTML so we can see the bolding.

Click image for larger version

Name: Screenshot 2024-09-09 at 5.21.54 PM.png
Views: 1
Size: 39.3 KB
ID: 1763412

Comment

Sa Fe

Join Date: Jun 2020

Posts: 93
#5

12 Sep 2024, 07:55

Thanks Jeff Pitblado (StataCorp)! That's exactly what I wanted. I appreciate it.
Comment

Kevin Blaine

Join Date: Jul 2017
Posts: 66

15 Apr 2025, 18:56

Hi Jeff Pitblado (StataCorp) -- reviving this thread for some specific guidance. Trying to insert a blank column like above, but this time for etable estimate comparisons. Here's some code to get us to the same starting place, and fair warning, it's lifted from a command I'm writing, so I had to tweak a few things to get it to work in situ here:

Code:

clear all
collect clear
sysuse auto

collect clear
    
**    MODEL BUILDING
    capture est sto clear
    local m = 1
    
    foreach var in "i.rep78" "i.rep78 turn" {
                
        capture confirm existence `var'
        
        if !_rc {
            
            logistic foreign `var'
            est sto m`m'
            capture: lrtest m1
            scalar lrtest = r(p)
            collect r(area): lroc, nograph
            etable, varlabel mstat(aic) mstat(bic) mstat(auc=r(area)) mstat(lrtest=lrtest) showstars append
            local ++m
        }
    }

**    FORMAT COEFICIENT
    collect label levels result _r_b "OR", modify
    
**    CREATE AND FORMAT COMPOSITE
    collect composite define CI = _r_lb _r_ub, delimiter("-") trim
    collect style cell result[CI], nformat(%5.2f) sformat("[%s]")
    collect composite define or_CI = _r_b CI, delimiter(" ") trim
    collect composite define or_CI_star = or_CI stars, delimiter() trim override
    collect label levels result or_CI_star "OR [95% CI]", modify
    collect style cell result[or_CI_star], halign(center)

**    ADD MODEL COMPARISON STATS

    **    ADD TAGS
        collect addtags extra[aic], fortags(result[aic])
        collect addtags extra[bic], fortags(result[bic])
        collect addtags extra[auc], fortags(result[auc])
        collect addtags extra[lrtest], fortags(result[lrtest])

    **    LABEL LEVELS
        collect label levels extra aic "AIC" bic "BIC" auc "AUC" lrtest "LR Test", modify

    **    ALIGN
        collect recode result aic=_r_b bic=_r_b auc=_r_b lrtest=_r_b

    **    DECIMAL PLACES
        collect style cell extra[auc lrtest], nformat(%5.3f)


        
**    WHOLE TABLE LAYOUT
    
    **    LAYOUT LOCALS
        collect levelsof colname
        local names = s(levels)
        local remove _cons c1 c2 c3 c4
        local names: list names-remove            
        local last = word("`names'", `=wordcount("`names'")')
        collect style cell colname[`last'], border(bottom, pattern(single))
        
    **    NOTES
        collect notes "* p<0.05 ** p<0.01 *** p<0.001"
        collect notes "Lower AIC/BIC indicates a better fit."
        collect notes "Higher AUC indicates the model better predicts expected values."
        collect notes "LR Test <0.05 indicates the added predictors in the model fit the data better than Model 1."

**    UPDATED LAYOUT
    collect layout (coleq#colname[`names'] extra) (etable_depvar#result[or_CI_star])
        
**    FORMAT STARS
    collect stars _r_p .05 "*" .01 "**" .001 "***", result attach(stars)
    collect style cell result[stars], halign(left)
    collect style header result[stars], level(hide)

**    FORMAT COEFICIENT
    collect style cell result[_r_b], nformat(%5.2f) halign(left)


**    STYLE AND LABEL MODELS

    **    ALIGN
        capture collect style cell etable_depvar[1], halign(left)
        capture collect style cell etable_depvar[3], halign(left)
        capture collect style cell etable_depvar[5], halign(left)
        capture collect style cell etable_depvar[7], halign(left)
    
    **    LABEL
        capture collect label levels etable_depvar 1 "Model 1", modify
        capture collect label levels etable_depvar 3 "Model 2", modify
        capture collect label levels etable_depvar 5 "Model 3", modify
        capture collect label levels etable_depvar 7 "Model 4", modify        

**    ADD MODEL COMPARISON STATS

    **    ADD TAGS
        collect addtags extra[aic], fortags(result[aic])
        collect addtags extra[bic], fortags(result[bic])
        collect label levels extra aic "AIC" bic "BIC", modify
        collect style header extra result[_r_b], level(label)
        collect recode result aic=_r_b bic=_r_b

        collect style cell extra[aic bic auc lrtest], halign(center)


        
**    WHOLE TABLE LAYOUT
    
    **    ADD FONT AND BORDERS

        collect style cell, font(,size(8))
        collect style cell border_block, border(right, pattern(nil))
    
    **    TITLE
        collect title "`title'"
        collect style title, font(,bold)
    
    **    NOTES
        collect style notes, font(Garamond, size(7) italic)
        
    **    MISC
        collect style cell cell_type[column-header], halign(center)
        collect style cell cell_type[row-header], halign(left)
        collect style showbase all
        
**    FINAL LAYOUT
    collect style putdocx, layout (autofitc) halign(center)

collect preview

I've tried to incorporate your collect get blank = " ", tags(colname[1.rep78]), as well as modifying the collect layout statement, but to no avail. Any ideas?

Really, my existing solution works well, but I just wanted to create more separation between the different models, given that some will have stars attached, and upon quick look, the stars from a Model 1 may look like a prefix to the OR of Model 2 (and so on).

Thanks!

- Kevin

Comment

Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014
Posts: 700

16 Apr 2025, 10:51

Thanks for the working example, that helps immensely.

Here is the code snippet I added right after your call to collect layout. In this code I add a blank result tagged with extra[aic], which appears in the original layout's row specficiation, and for each existing level of etable_depvar, which appears in the original layout's column specification. Then a just need to add the blank result to the layout.

Code:

* add blank result to place between columns
collect levels etable_depvar
local levs = s(levels)
local first first
foreach l of local levs {
    if "`first'" == "first" {
        local first
        continue
    }
    * unicode character u2800 -- braille pattern blank
    * I'm adding 3 blanks to exaggerate the gap between columns
    collect get blank = "⠀⠀⠀", tags(extra[aic] etable_depvar[`l'])
}
collect style header result[blank], level(hide)

**    UPDATED LAYOUT with blanks
collect layout (coleq#colname[`names'] extra) (etable_depvar#result[blank or_CI_star])

Here is the resulting table.

Code:

. collect preview

---------------------------------------------------------------
                          Model 1               Model 2        
                        OR [95% CI]             OR [95% CI]    
---------------------------------------------------------------
Repair record 1978                                             
  3                0.02 [0.00-0.17] ***     0.08 [0.01-0.66] * 
  4                0.22 [0.04-1.33]         0.65 [0.08-5.33]   
Turn circle (ft.)                           0.61 [0.45-0.82] **
---------------------------------------------------------------
AIC                        60.89        ⠀⠀⠀        43.11       
BIC                        67.12                   51.42       
AUC                        0.823                   0.934       
LR Test                      .                     0.000       
---------------------------------------------------------------
* p<0.05 ** p<0.01 *** p<0.001
Lower AIC/BIC indicates a better fit.
Higher AUC indicates the model better predicts expected values.
LR Test <0.05 indicates the added predictors in the model fit the data better than Model 1.

Comment

Kevin Blaine

Join Date: Jul 2017

Posts: 66
#8

16 Apr 2025, 12:34

Beautiful. You work magic (as usual).

One last thing: Is there a way to put the "OR [95% CI]" label, which is for the or_CI_star result, directly above the dashed line and "Repair record 1978", appearing just once? I've toyed with putting it in the notes, but it doesn't look great visually, and I want to reduce repeating header clutter. No worries if not, and I'm also open to other suggestions!

- Kevin
Comment
Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014

Posts: 700
#9

16 Apr 2025, 13:19

I don't think it is possible to trick collect to center a single copy of "OR [95% CI]" under the levels of etable_depvar.

You could center it above the levels of etable_depvar, but then you'd have to use a different tagging scheme for the result[blank] items.
Comment
Kevin Blaine

Join Date: Jul 2017

Posts: 66
#10

23 Apr 2025, 11:45

Ah, makes sense. I had a feeling it wouldn't work. Thank you though!
Comment

Announcement

optimizing the dtable

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment