Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • optimizing the dtable

    I got the following code for generating a table of descriptive statistics. It has some downfalls that I want to address.


    Code:
    ....
    collect composite define blank = fvpercent, trim // this is just for adding an empty column between main columns.
    ...
    
    foreach x of local fvars {
        collect: svy: proportion `x', percent
    }
    collect composite define CI_tot = _r_lb _r_ub, trim delimiter("-")
    collect style cell result[CI_tot], sformat("(%s)") nformat(%2.1fc)
    collect style cell result[fvpercent], nformat(%2.1fc) sformat("%s")
    
    collect composite define col5 = fvpercent CI_tot
    collect composite define col6 = fvrawfrequency
    collect composite define col7 = fvfrequency
    
    collect style autolevels result col1 col2 col3 col5 col6 col7, clear
    collect label levels result ///
        col1 "Weighted % (95% CI)" ///
        col2 "Unweighted N" ///
        col3 "Weighted N" ///
        col5 "Weighted % (95% CI)" ///
        col6 "Unweighted N" ///
        col7 "Weighted N" ///
        , modify
    collect style header result [col1 col2 col3 blank col5 col6 col7 ], title(hide) level(label)
    collect layout (colname) (result[col5 col6 col7 blank] school_level#result[col1 col2 col3 blank] )
    
    collect style cell border_block, border(right, pattern(nil)) // remove vertical line
    collect style cell cell_type[column-header], font(, bold) // bold all the column headers
    collect style row stack, spacer nobinder
    collect preview

    This produces the following table.

    Click image for larger version

Name:	Screenshot 2024-09-05 at 3.20.47 PM.png
Views:	1
Size:	43.8 KB
ID:	1763141



    What I actually looking to generate is the following table:

    Click image for larger version

Name:	Screenshot 2024-09-05 at 3.24.19 PM.png
Views:	1
Size:	31.6 KB
ID:	1763142


    My question is how should I revise my code to be able to generate the latter table?

  • #2
    Hi statalist,

    Any idea on this? I added some annotation for a better clarity on the differences between the current code's output and what I'm actually looking to produce. Your help would be greatly appreciated.
    Click image for larger version

Name:	Screenshot 2024-09-05 at 3.24.19 PM copy.png
Views:	1
Size:	45.7 KB
ID:	1763207

    Last edited by Sa Fe; 06 Sep 2024, 08:07.

    Comment


    • #3
      Hi Jeff Pitblado (StataCorp)! Do you have any clue or potential solution for how I can make the changes happen in my dtable? Any insights would be appreciated. Thanks.

      Comment


      • #4
        Without a working example with data, it is very difficult to ensure any advice we give works.

        Here is what I came up with, starting with the auto data and adding some indicator variables for prod1 and prod2.
        Code:
        clear all
        
        set seed 18
        sysuse auto
        gen prod1 = runiformint(0,1)
        label var prod1 "Prod 1"
        gen prod2 = runiformint(0,1)
        label var prod2 "Prod 2"
        label define noyes 0 "no" 1 "yes"
        label values prod1 prod2 noyes
        svyset _n [pw=weight]
        
        dtable 1.prod1 1.prod2, ///
            by(foreign) ///
            svy ///
            nosample ///
            factor(, statistics(fvfrequency fvrawfrequency fvpercent))
        
        collect style header foreign, title(hide) level(label)
        collect style header prod1 prod2, title(label) level(label)
        
        unab fvars : prod?
        
        // see the levels of dimension foreign
        collect levels foreign
        // .m is for the Overall/Total sample
        local bylevels .m 0 1
        // change position and label of .m
        collect style autolevels foreign .m 0 1, clear
        collect label levels foreign .m "Overall", modify
        
        // collect CI for percents, overall and subpops
        foreach by of local bylevels {
            if "`by'" != ".m" {
                local subpop subpop(if foreign==`by')
            }
            foreach x of local fvars {
                collect, tag(foreign[`by']): ///
                    svy, `subpop': ///
                        proportion 1.`x', percent
            }
        }
        
        // use unicode U+2800 braille pattern blank to create a blank space result
        // where -collect- will respect the width and number of spaces -- here I'm
        // using the character 3 times
        collect get blank = "⠀⠀⠀", tags(colname[1.prod1])
        collect style header result[blank], level(hide) 
        
        collect label levels prod1 1 "(yes)", modify
        collect label levels prod2 1 "(yes)", modify
        
        collect composite define CI_tot = _r_lb _r_ub, trim delimiter("-")
        collect style cell result[CI_tot], sformat("(%s)") nformat(%5.1f)
        collect style cell result[fvpercent], nformat(%5.1f) sformat("%s")
        
        collect composite define col1 = fvpercent CI_tot
        collect composite define col2 = fvrawfrequency
        collect composite define col3 = fvfrequency
        
        collect label levels result ///
            col1 "Weighted % (95% CI)" ///
            col2 "Unweighted N" ///
            col3 "Weighted N" ///
            , modify
        collect style header result[col1 col2 col3], title(hide) level(label)
        
        collect style row stack, binder(" ") // regular ASCII space
        
        collect style autolevels result col1 col2 col3, clear
        
        collect style cell cell_type[column-header]#foreign, border(bottom)
        
        collect layout (colname) ///
            (foreign[.m]#result result[blank] ///
             foreign[0]#result result[blank] ///
             foreign[1]#result)
        
        // bold all the column headers
        collect style cell cell_type[column-header], font(, bold)
        
        // add hidden level to -colname- so we can bold its title
        collect get blank = "⠀", tags(colname[_hide])
        collect levels colname
        collect style autolevels colname _hide `s(levels)', clear
        collect label dim colname "Products", modify
        collect style header colname, title(label)
        collect style cell cell_type[row-header]#colname[_hide], font(, bold)
        
        collect preview
        I exported to HTML so we can see the bolding.


        Click image for larger version

Name:	Screenshot 2024-09-09 at 5.21.54 PM.png
Views:	1
Size:	39.3 KB
ID:	1763412

        Comment


        • #5
          Thanks Jeff Pitblado (StataCorp)! That's exactly what I wanted. I appreciate it.

          Comment


          • #6
            Hi Jeff Pitblado (StataCorp) -- reviving this thread for some specific guidance. Trying to insert a blank column like above, but this time for etable estimate comparisons. Here's some code to get us to the same starting place, and fair warning, it's lifted from a command I'm writing, so I had to tweak a few things to get it to work in situ here:

            Code:
            clear all
            collect clear
            sysuse auto
            
            collect clear
                
            **    MODEL BUILDING
                capture est sto clear
                local m = 1
                
                foreach var in "i.rep78" "i.rep78 turn" {
                            
                    capture confirm existence `var'
                    
                    if !_rc {
                        
                        logistic foreign `var'
                        est sto m`m'
                        capture: lrtest m1
                        scalar lrtest = r(p)
                        collect r(area): lroc, nograph
                        etable, varlabel mstat(aic) mstat(bic) mstat(auc=r(area)) mstat(lrtest=lrtest) showstars append
                        local ++m
                    }
                }
            
            **    FORMAT COEFICIENT
                collect label levels result _r_b "OR", modify
                
            **    CREATE AND FORMAT COMPOSITE
                collect composite define CI = _r_lb _r_ub, delimiter("-") trim
                collect style cell result[CI], nformat(%5.2f) sformat("[%s]")
                collect composite define or_CI = _r_b CI, delimiter(" ") trim
                collect composite define or_CI_star = or_CI stars, delimiter() trim override
                collect label levels result or_CI_star "OR [95% CI]", modify
                collect style cell result[or_CI_star], halign(center)
            
            **    ADD MODEL COMPARISON STATS
            
                **    ADD TAGS
                    collect addtags extra[aic], fortags(result[aic])
                    collect addtags extra[bic], fortags(result[bic])
                    collect addtags extra[auc], fortags(result[auc])
                    collect addtags extra[lrtest], fortags(result[lrtest])
            
                **    LABEL LEVELS
                    collect label levels extra aic "AIC" bic "BIC" auc "AUC" lrtest "LR Test", modify
            
                **    ALIGN
                    collect recode result aic=_r_b bic=_r_b auc=_r_b lrtest=_r_b
            
                **    DECIMAL PLACES
                    collect style cell extra[auc lrtest], nformat(%5.3f)
            
            
                    
            **    WHOLE TABLE LAYOUT
                
                **    LAYOUT LOCALS
                    collect levelsof colname
                    local names = s(levels)
                    local remove _cons c1 c2 c3 c4
                    local names: list names-remove            
                    local last = word("`names'", `=wordcount("`names'")')
                    collect style cell colname[`last'], border(bottom, pattern(single))
                    
                **    NOTES
                    collect notes "* p<0.05 ** p<0.01 *** p<0.001"
                    collect notes "Lower AIC/BIC indicates a better fit."
                    collect notes "Higher AUC indicates the model better predicts expected values."
                    collect notes "LR Test <0.05 indicates the added predictors in the model fit the data better than Model 1."
            
            **    UPDATED LAYOUT
                collect layout (coleq#colname[`names'] extra) (etable_depvar#result[or_CI_star])
                    
            **    FORMAT STARS
                collect stars _r_p .05 "*" .01 "**" .001 "***", result attach(stars)
                collect style cell result[stars], halign(left)
                collect style header result[stars], level(hide)
            
            **    FORMAT COEFICIENT
                collect style cell result[_r_b], nformat(%5.2f) halign(left)
            
            
            **    STYLE AND LABEL MODELS
            
                **    ALIGN
                    capture collect style cell etable_depvar[1], halign(left)
                    capture collect style cell etable_depvar[3], halign(left)
                    capture collect style cell etable_depvar[5], halign(left)
                    capture collect style cell etable_depvar[7], halign(left)
                
                **    LABEL
                    capture collect label levels etable_depvar 1 "Model 1", modify
                    capture collect label levels etable_depvar 3 "Model 2", modify
                    capture collect label levels etable_depvar 5 "Model 3", modify
                    capture collect label levels etable_depvar 7 "Model 4", modify        
            
            **    ADD MODEL COMPARISON STATS
            
                **    ADD TAGS
                    collect addtags extra[aic], fortags(result[aic])
                    collect addtags extra[bic], fortags(result[bic])
                    collect label levels extra aic "AIC" bic "BIC", modify
                    collect style header extra result[_r_b], level(label)
                    collect recode result aic=_r_b bic=_r_b
            
                    collect style cell extra[aic bic auc lrtest], halign(center)
            
            
                    
            **    WHOLE TABLE LAYOUT
                
                **    ADD FONT AND BORDERS
            
                    collect style cell, font(,size(8))
                    collect style cell border_block, border(right, pattern(nil))
                
                **    TITLE
                    collect title "`title'"
                    collect style title, font(,bold)
                
                **    NOTES
                    collect style notes, font(Garamond, size(7) italic)
                    
                **    MISC
                    collect style cell cell_type[column-header], halign(center)
                    collect style cell cell_type[row-header], halign(left)
                    collect style showbase all
                    
            **    FINAL LAYOUT
                collect style putdocx, layout (autofitc) halign(center)
            
            collect preview
            I've tried to incorporate your collect get blank = " ", tags(colname[1.rep78]), as well as modifying the collect layout statement, but to no avail. Any ideas?

            Really, my existing solution works well, but I just wanted to create more separation between the different models, given that some will have stars attached, and upon quick look, the stars from a Model 1 may look like a prefix to the OR of Model 2 (and so on).

            Thanks!

            - Kevin

            Comment


            • #7
              Thanks for the working example, that helps immensely.

              Here is the code snippet I added right after your call to collect layout. In this code I add a blank result tagged with extra[aic], which appears in the original layout's row specficiation, and for each existing level of etable_depvar, which appears in the original layout's column specification. Then a just need to add the blank result to the layout.
              Code:
              * add blank result to place between columns
              collect levels etable_depvar
              local levs = s(levels)
              local first first
              foreach l of local levs {
                  if "`first'" == "first" {
                      local first
                      continue
                  }
                  * unicode character u2800 -- braille pattern blank
                  * I'm adding 3 blanks to exaggerate the gap between columns
                  collect get blank = "⠀⠀⠀", tags(extra[aic] etable_depvar[`l'])
              }
              collect style header result[blank], level(hide)
              
              **    UPDATED LAYOUT with blanks
              collect layout (coleq#colname[`names'] extra) (etable_depvar#result[blank or_CI_star])
              Here is the resulting table.
              Code:
              . collect preview
              
              ---------------------------------------------------------------
                                        Model 1               Model 2        
                                      OR [95% CI]             OR [95% CI]    
              ---------------------------------------------------------------
              Repair record 1978                                             
                3                0.02 [0.00-0.17] ***     0.08 [0.01-0.66] * 
                4                0.22 [0.04-1.33]         0.65 [0.08-5.33]   
              Turn circle (ft.)                           0.61 [0.45-0.82] **
              ---------------------------------------------------------------
              AIC                        60.89        ⠀⠀⠀        43.11       
              BIC                        67.12                   51.42       
              AUC                        0.823                   0.934       
              LR Test                      .                     0.000       
              ---------------------------------------------------------------
              * p<0.05 ** p<0.01 *** p<0.001
              Lower AIC/BIC indicates a better fit.
              Higher AUC indicates the model better predicts expected values.
              LR Test <0.05 indicates the added predictors in the model fit the data better than Model 1.

              Comment


              • #8
                Beautiful. You work magic (as usual).

                One last thing: Is there a way to put the "OR [95% CI]" label, which is for the or_CI_star result, directly above the dashed line and "Repair record 1978", appearing just once? I've toyed with putting it in the notes, but it doesn't look great visually, and I want to reduce repeating header clutter. No worries if not, and I'm also open to other suggestions!

                - Kevin

                Comment


                • #9
                  I don't think it is possible to trick collect to center a single copy of "OR [95% CI]" under the levels of etable_depvar.

                  You could center it above the levels of etable_depvar, but then you'd have to use a different tagging scheme for the result[blank] items.

                  Comment


                  • #10
                    Ah, makes sense. I had a feeling it wouldn't work. Thank you though!

                    Comment

                    Working...
                    X