Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • dtable, formatting problem

    I am trying to use dtable to produce a table but I am unable to get the values in the columns to align:
    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str2 Case double(a1 a2 a3) byte digital double(d1 d2 d3)
    "1"  4572 13461 18033  1 4433  6111 10544
    "2"  4613 18754 23367  8 4459  6030 10489
    "3"  4527 19667 24194  9 4437  6922 11359
    "4"  4342 18756 23098 10 4759  6808 11567
    "5"  2085  7154  9239 11 4419  7059 11478
    "6"  4155 17563 21718 12 4839  9106 13945
    "7"  4760 20071 24831 13 4404  6508 10912
    "8"  4530 18416 22946 14 4610 10274 14884
    "9"  4777 17825 22602 15 4910  1536  6446
    "10" 2090  6764  8854  2 4915  4887  9802
    "11" 4175 19312 23487  3 5037  2313  7350
    "12" 4388 18059 22447  4 5286  8900 14186
    "13" 2231  6858  9089  5 4683  7610 12293
    "14" 2178  6112  8290  6 5255  5560 10815
    "15" 2197  6559  8756  7 4616  7431 12047
    end
     dtable a1-a3 d1-d3, continuous(a1-a3 d1-d3, statistics( mean sd min max))       ///
    column( summary(Summary Statistics of time (A/D)) ) nformat(%7.0f)
    collect notes "Mean, (SD), Min, Max. Times rounded to seconds"
     collect preview
    
    ------------------------------------------------
                    Summary Statistics of time (A/D)
    ------------------------------------------------
    N                                             15
    Fabrication / s            3708 (1150) 2085 4777
    Ancillary / s            14355 (5806) 6112 20071
    Total / s                18063 (6911) 8290 24831
    Fabrication / s             4737 (297) 4404 5286
    Ancillary / s             6470 (2322) 1536 10274
    Total / s                11208 (2279) 6446 14884
    ------------------------------------------------
    Mean, (SD), Min, Max. Times rounded to seconds
    There seems to be similar behaviour in the tables in the blog The Stata Blog ยป Creating tables of descriptive statistics in Stata 18: The new dtable command.

    Is it possible to align the data in the columns?
    Thank you.
    Julie

  • #2

    dtable uses composite results to facilitate stacking
    variable-specific results.

    If you use collect layout instead of collect preview you
    will see what dimensions are used to construct the table.

    Code:
    . collect layout
    
    Collection: DTable
          Rows: var
       Columns: result
       Table 1: 7 x 1
    
    ------------------------------------------------
                    Summary Statistics of time (A/D)
    ------------------------------------------------
    N                                             15
    Fabrication / s            3708 (1150) 2085 4777
    Ancillary / s            14355 (5806) 6112 20071
    Total / s                18063 (6911) 8290 24831
    Fabrication / s             4737 (297) 4404 5286
    Ancillary / s             6470 (2322) 1536 10274
    Total / s                11208 (2279) 6446 14884
    ------------------------------------------------
    Mean, (SD), Min, Max. Times rounded to seconds
    You can query the autolevels for dimension result to see it is
    using a composite result named _dtable_stats.

    Code:
    . collect query autolevels result
    
    Automatic dimension levels
    Collection: DTable
     Dimension: result
        Levels: _dtable_stats
    If you query the composite definition of _dtable_stats you get

    Code:
    . collect query composite _dtable_stats
    
    Composite definition
    Collection: DTable
     Composite: _dtable_stats
      Elements: frequency
                mean
                sd
                min
                max
     Delimiter: " "
          Trim: on
      Override: off
    You can define your own composite results and redefine the result
    autolevels to organize your results into separate columns. You only
    need one composite result to allow you to stack the N frequency
    above the max values.

    Code:
    collect composite define col4 = frequency max
    Here is how you redefine the autolevels of result so your results
    of interest are composed in separate columns.

    Code:
    collect style autolevels result mean sd min col4, clear
    The only thing left is to fix the column header. In the prior layout, the
    column was composed from result[_dtable_stats], and dtable
    attached your custom column title as a label to _dtable_stats. In our
    new layout, we want to hide the labels of the individual results and use your
    custom label. Here is how I did this.

    Code:
    collect label dim result `"Summary Statistics of time (A/D)"', modify
    collect style header result, title(label) level(hide)
    Here is the resulting table

    Code:
    . collect preview
    
    ------------------------------------------------
                    Summary Statistics of time (A/D)
    ------------------------------------------------
    N                                             15
    Fabrication / s     3708   (1150)   2085    4777
    Ancillary / s      14355   (5806)   6112   20071
    Total / s          18063   (6911)   8290   24831
    Fabrication / s     4737    (297)   4404    5286
    Ancillary / s       6470   (2322)   1536   10274
    Total / s          11208   (2279)   6446   14884
    ------------------------------------------------
    Mean, (SD), Min, Max. Times rounded to seconds
    BTW, I added variable labels in my copy of the original example so my row
    titles look like the above table. Here is the do-file I ran to get the above
    table.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str2 Case double(a1 a2 a3) byte digital double(d1 d2 d3)
    "1"  4572 13461 18033  1 4433  6111 10544
    "2"  4613 18754 23367  8 4459  6030 10489
    "3"  4527 19667 24194  9 4437  6922 11359
    "4"  4342 18756 23098 10 4759  6808 11567
    "5"  2085  7154  9239 11 4419  7059 11478
    "6"  4155 17563 21718 12 4839  9106 13945
    "7"  4760 20071 24831 13 4404  6508 10912
    "8"  4530 18416 22946 14 4610 10274 14884
    "9"  4777 17825 22602 15 4910  1536  6446
    "10" 2090  6764  8854  2 4915  4887  9802
    "11" 4175 19312 23487  3 5037  2313  7350
    "12" 4388 18059 22447  4 5286  8900 14186
    "13" 2231  6858  9089  5 4683  7610 12293
    "14" 2178  6112  8290  6 5255  5560 10815
    "15" 2197  6559  8756  7 4616  7431 12047
    end
    
    label variable a1 "Fabrication / s"
    label variable a2 "Ancillary / s"
    label variable a3 "Total / s"
    label variable d1 "Fabrication / s"
    label variable d2 "Ancillary / s"
    label variable d3 "Total / s"
    
    dtable a1-a3 d1-d3, ///
        continuous(a1-a3 d1-d3, statistics( mean sd min max)) ///
        column( summary(Summary Statistics of time (A/D)) ) ///
        nformat(%7.0f)
    collect notes "Mean, (SD), Min, Max. Times rounded to seconds"
    
    collect layout
    collect query autolevels result
    collect query composite _dtable_stats
    collect query header result[_dtable_stats]
    
    collect composite define col4 = frequency max
    collect style autolevels result mean sd min col4, clear
    collect label dim result `"Summary Statistics of time (A/D)"', modify
    collect style header result, title(label) level(hide)
    collect preview

    Comment


    • #3

      Thank you Jeff for your clear explanation. I had not apprciated the information that collect layout would give me
      Julie

      Comment


      • #4
        Hi Jeff Pitblado (StataCorp) -- How would I modify the above code to remove (vs add) a column from my dtable? For example, the code below (which you wrote and I modified!) adds a star next to the P-Value. How would I remove the p-value itself, leaving only the stars?

        Code:
            collect query autolevels result
            * the p-values are in composite _dtable_test: pearson and regress
            collect query composite _dtable_test
            * define stars rules using the p-value results
            collect stars pearson fisher regress .01 "*" .001 "**" .0001 "***"
            * add stars to the result autolevels
            collect style autolevels result stars
            collect style cell result[stars], halign(left)
        Thanks!

        Comment


        • #5
          Hi Kevin,

          Your post seems out of place in this thread. Julie's example does not use dtable's by() option. So we need a new example using by() to make sense of your question. Here is a simple example using the auto data. Note the pieces of code highlighted in blue.
          Code:
          sysuse auto
          gen high = price > 9999
          dtable mpg turn i.for i.rep, by(high, test)
          
          collect query autolevels result
          * the p-values are in composite _dtable_test: pearson and regress
          collect query composite _dtable_test
          * define stars rules using the p-value results
          collect stars pearson fisher regress .1 "*" .01 "**" .001 "***", dimension
          * -collect stars- above added a new dimension, attaching -stars[value]-
          * to every existing item, and adding new items for the stars labels
          * tagged with -stars[label]-; but we don't need to see it's label or
          * levels in the header
          collect style header stars, title(hide) level(hide)
          * change the layout to show the stars instead of the p-values
          collect layout (var) (high#result[_dtable_stats]#stars[value] high#stars[label])
          Here is the resulting table.
          Code:
          --------------------------------------------------------------------
                                                    high                      
                                    0              1            Total     Test
          --------------------------------------------------------------------
          N                      64 (86.5%)     10 (13.5%)    74 (100.0%)    
          Mileage (mpg)      22.281 (5.522) 15.000 (2.667) 21.297 (5.786) ***
          Turn circle (ft.)  39.078 (4.180) 43.300 (4.191) 39.649 (4.399) **  
          Car origin                                                          
            Domestic             44 (68.8%)      8 (80.0%)     52 (70.3%)    
            Foreign              20 (31.2%)      2 (20.0%)     22 (29.7%)    
          Repair record 1978                                                  
            1                      2 (3.3%)       0 (0.0%)       2 (2.9%)    
            2                     7 (11.7%)      1 (11.1%)      8 (11.6%)    
            3                    23 (38.3%)      7 (77.8%)     30 (43.5%)    
            4                    18 (30.0%)       0 (0.0%)     18 (26.1%)    
            5                    10 (16.7%)      1 (11.1%)     11 (15.9%)    
          --------------------------------------------------------------------
          Last edited by Jeff Pitblado (StataCorp); 22 Jul 2024, 19:40.

          Comment


          • #6
            This is excellent. I'm looking forward to trialing out on some code tomorrow. Many thanks!

            Comment

            Working...
            X