  • Remapping dtable's test with by using collect layout

    Hello,
    I have been setting up a custom descriptive statistics table for personal data use. Reading through the forums, I have been able to get the formatting close to my liking. However, I am having trouble remapping (?) the p-value of dichotomous variables, so that the value can be seen when excluding the 0.var level.

    The layout is set using (roweq#var#`depvar') to provide a vertical format. This is necessary, as a horizontal format becomes difficult to read and compare when the depvar has more than two levels.

    Using example code:
    Code:
    clear all
    webuse nhanes2
    svyset 
    
    gen index = inrange(hlthstat,1,3)
        label var index "Index Subpopulation"
    
    *----------------------
    * macro for continuous variables 
    local convars age weight
    
    * factor variables, all
    local factorvars  diabetes rural race region
    
    * Dependent variable
    local depvar sex
    
    * variables specified in options -continuous()- and -factor()- do not
    * need to be specified in the varlist unless you want a special variable
    * order that is otherwise too difficult to get using the options
    dtable, by(`depvar', tests nomissing) svy subpop(index) column(by(hide))   ///
        continuous(`convars', statistics(total mean sd semean p50 p25 p75))    ///
        factor(`factorvars',                ///
            statistics(fvfrequency fvpercent)        ///
            test(svywald)                ///
        )                        
    
    *------------------------------------------------------------
    * Add -roweq- tag to nest vars into groups; -roweq- is a special
    * dimension that will grab variable labels for its levels that match
    * variable names in the current frame
    foreach c of local convars {
        collect addtags roweq[`c'], fortags(var[`c'])
        * hide this continuous variable's title since we plan to use this
        * -roweq- level to title this variable in the header
        collect style header var[`c'], title(hide) level(hide) // hides the unbolded title of the variable 
        local ccall `ccall' `c'
    }
    
    foreach var of varlist `factorvars' {
        levelsof `var'    // get levels of each variable
    
        * Dichotomous variables 
        if r(r) == 2 {
            collect addtags roweq[`var'], fortags(var[i.`var'])
            * hide this factor variable's title since we plan to use this
            * -roweq- level to title this variable in the header
            collect style header `var', title(hide) level(hide)
            local fcall `fcall' 1.`var'
        }
        *  All other factor variables
        if r(r) > 2 {
            collect addtags roweq[`var'], fortags(var[i.`var'])
            collect style header `var', title(hide) level(label)
            local fcall `fcall' i.`var'
        }
    }
    
    *-------------------------------
    * Shows the bolded title
    collect style header roweq, title(hide) level(label)
    * bold all the column headers
    collect style cell cell_type[column-header], font(arial, bold)
    * bold the levels of -roweq-
    collect style cell roweq#cell_type[row-header], font(arial, bold)
    * unbold the variable names/labels
    collect style cell var#cell_type[row-header], font(arial, nobold)
    
    collect style row stack, truncate(head)
    
    // Composite results
    *------------------------------- 
    * stack non-missing counts and factor level frequencies
    collect composite define col1 = total fvfrequency, trim
    collect label levels result col1 "Total"
    * stack means and factor level percentages
    collect composite define col2 = mean fvpercent, trim
    collect label levels result col2 "Mean/%"
    * stack the test p-values for continuous and factor variables
    collect composite define col3 = regress svywald, trim
    collect label levels result col3 "Test"
    
    // Format
    *-------------------------------
    * Formatting: show custom label for results in the header
    collect label levels result total "Total (N)"  sd "SD" semean "SE Mean" p50 "p50" p25 "p25" p75 "p75", modify
    collect style header result, level(label)  title(hide)
    
    * Changing the format of result cells
    collect style cell result[total sd p50 p25 p75], nformat(%12.2gc)
    collect style cell result[mean fvpercent], nformat(%9.2fc)
    
    *------------------------------- 
    * Setting autolevels for ease of use
    collect style autolevels result, clear // removes existing levels for results
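    * note: -seamean- is not a level of result (the statistic is -semean-),
    * so the "SE Mean" column is left out of the table
    collect style autolevels result col1 col2 sd seamean frequency  percent p50 p25 p75  col3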
    
    *------------------------------- 
    * Add notes to the bottom using collect
    collect notes 1: "Continuous variables tested via Regress"
    collect notes 2: "Factor variables tested via Adjusted Wald Test"
    
    
    // Layout 
    *------------------------------- 
    * uses only the present status (ie var == 1) of dichotomous variables, but missing p-value
    collect layout (roweq#var[`ccall'  `fcall']#`depvar') (result)
    
    * shows all, including absent status (var == 0) of dichotomous variables
    collect layout (roweq#var#`depvar') (result)
    
    
    * publish our table to MS Excel 
    collect export ztable1.xlsx, replace
    In the "Layout" section, the second layout (which shows all levels) displays the p-value, but the first layout, which drops the "absent" level (0.var) of dichotomous variables, loses the p-value for those variables.

    Ideally, the "Test" value would be aligned with the roweq level (bolded title), though setting it just below the last value of i.var would also work. I have tried the code below without success.
    Code:
    collect remap `depvar'[_dtable_test]=`depvar'[0.`var'] // causes the p-values to be lost on preview 
    collect remap sex[_dtable_test]=roweq // error, as already has tag at roweq
    collect remap sex[_dtable_test]=sex[.m] // only moves within the levels of sex, not by var/roweq

  • #2
    Thank you for providing a working example with data.

    Note that if you have any factor variables with 2 levels that are not 0 and 1, then the "Dichotomous variables" block is not going to work as intended. Here is one way you could change your code to make identifying dichotomous variables more specific.
    Code:
    collect levelsof `var'
    if `"`s(levels)'"' == "0 1" {
        ...
    }
    else {
        ...
    }
    In the following, my changes are the switch to collect levelsof for identifying dichotomous variables and the additions that put the p-values in line with the roweq headers (the lines under the "inject a hidden level" comments and the autolevels block after the layout).
    Code:
    clear all
    webuse nhanes2
    svyset
    
    gen index = inrange(hlthstat,1,3)
        label var index "Index Subpopulation"
    
    *----------------------
    * macro for continuous variables
    local convars age weight
    
    * factor variables, all
    local factorvars  diabetes rural race region
    
    * Dependent variable
    local depvar sex
    
    * variables specified in options -continuous()- and -factor()- do not
    * need to be specified in the varlist unless you want a special variable
    * order that is otherwise too difficult to get using the options
    dtable, by(`depvar', tests nomissing) svy subpop(index) column(by(hide))   ///
        continuous(`convars', statistics(total mean sd semean p50 p25 p75))    ///
        factor(`factorvars',                ///
            statistics(fvfrequency fvpercent)        ///
            test(svywald)                ///
        )
    
    *------------------------------------------------------------
    * Add -roweq- tag to nest vars into groups; -roweq- is a special
    * dimension that will grab variable labels for its levels that match
    * variable names in the current frame
    foreach c of local convars {
        collect addtags roweq[`c'], fortags(var[`c'])
        * hide this continuous variable's title since we plan to use this
        * -roweq- level to title this variable in the header
        collect style header var[`c'], title(hide) level(hide) // hides the unbolded title of the variable
        local ccall `ccall' `c'
    }
    
    foreach var of varlist `factorvars' {
        collect levelsof `var'    // get levels of each variable
    
        * Dichotomous variables
        if `"`s(levels)'"' == "0 1" {
            collect addtags roweq[`var'], fortags(var[i.`var'])
            * hide this factor variable's title since we plan to use this
            * -roweq- level to title this variable in the header
            collect style header `var', title(hide) level(hide)
            * inject a hidden level for this variable's test
            collect remap var[0.`var'] = var[_h_`var'], ///
                    fortags(var[0.`var']#result[svywald])
            collect style header var[_h_`var'], level(hide)
            local fcall `fcall' _h_`var' 1.`var'
        }
        else {
            *  All other factor variables
            collect addtags roweq[`var'], fortags(var[i.`var'])
            collect style header `var', title(hide) level(label)
            * inject a hidden level for this variable's test
            collect levelsof `var'
            local levels = s(levels)
            gettoken first : levels
            collect remap var[`first'.`var'] = var[_h_`var'], ///
                    fortags(var[`first'.`var']#result[svywald])
            collect style header var[_h_`var'], level(hide)
            local fcall `fcall' _h_`var' i.`var'
        }
    }
    
    *-------------------------------
    * Shows the bolded title
    collect style header roweq, title(hide) level(label)
    * bold all the column headers
    collect style cell cell_type[column-header], font(arial, bold)
    * bold the levels of -roweq-
    collect style cell roweq#cell_type[row-header], font(arial, bold)
    * unbold the variable names/labels
    collect style cell var#cell_type[row-header], font(arial, nobold)
    
    collect style row stack, truncate(head)
    
    // Composite results
    *-------------------------------
    * stack non-missing counts and factor level frequencies
    collect composite define col1 = total fvfrequency, trim
    collect label levels result col1 "Total"
    * stack means and factor level percentages
    collect composite define col2 = mean fvpercent, trim
    collect label levels result col2 "Mean/%"
    * stack the test p-values for continuous and factor variables
    collect composite define col3 = regress svywald, trim
    collect label levels result col3 "Test"
    
    // Format
    *-------------------------------
    * Formatting: show custom label for results in the header
    collect label levels result total "Total (N)"  sd "SD" semean "SE Mean" p50 "p50" p25 "p25" p75 "p75", modify
    collect style header result, level(label)  title(hide)
    
    * Changing the format of result cells
    collect style cell result[total sd p50 p25 p75], nformat(%12.2gc)
    collect style cell result[mean fvpercent], nformat(%9.2fc)
    
    *-------------------------------
    * Setting autolevels for ease of use
    collect style autolevels result, clear // removes existing levels for results
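    * note: -seamean- is not a level of result (the statistic is -semean-),
    * so the "SE Mean" column is left out of the table
    collect style autolevels result col1 col2 sd seamean frequency  percent p50 p25 p75  col3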
    
    *-------------------------------
    * Add notes to the bottom using collect
    collect notes 1: "Continuous variables tested via Regress"
    collect notes 2: "Factor variables tested via Adjusted Wald Test"
    
    
    // Layout
    *-------------------------------
    * uses only the present status (i.e., var == 1) of dichotomous variables;
    * the p-values now come from the hidden _h_ levels injected above
    collect layout (roweq#var[`ccall'  `fcall']#`depvar') (result)
    
    * make `depvar'[_dtable_test] show up first, then hide its label
    collect query autolevels `depvar'
    collect style autolevels `depvar' _dtable_test `s(levels)', clear
    collect style header `depvar'[_dtable_test], level(hide)
    
    collect preview
    Here is the resulting table.
    Code:
    -----------------------------------------------------------
                       Total    Mean/%   SD  p50 p25 p75  Test
    -----------------------------------------------------------
    Age (years)                                           0.105
      Male          1888636417    39.91 (14)  37  27  51
      Female        2016337020    40.52 (15)  37  27  52
      Total         3904973437    40.22 (14)  37  27  52
    Weight (kg)                                          <0.001
      Male          3727866905    78.78 (12)  78  70  86
      Female        3225742206    64.83 (13)  62  55  71
      Total         6953609111    71.63 (15)  70  60  81
    Diabetes status                                       0.042
        Male           774,607  (1.64%)
        Female       1,087,465  (2.19%)
        Total        1,862,072  (1.92%)
    Rural                                                 0.001
        Male        15,194,101 (32.11%)
        Female      14,145,717 (28.43%)
        Total       29,339,818 (30.22%)
    Race                                                  0.847
      White
        Male        42,175,598 (89.13%)
        Female      44,498,965 (89.43%)
        Total       86,674,563 (89.28%)
      Black
        Male         3,814,338  (8.06%)
        Female       4,054,595  (8.15%)
        Total        7,868,933  (8.11%)
      Other
        Male         1,330,756  (2.81%)
        Female       1,204,788  (2.42%)
        Total        2,535,544  (2.61%)
    Region                                                0.552
      NE
        Male        10,617,461 (22.44%)
        Female      10,559,571 (21.22%)
        Total       21,177,032 (21.81%)
      MW
        Male        11,824,970 (24.99%)
        Female      12,794,116 (25.71%)
        Total       24,619,086 (25.36%)
      S
        Male        11,574,661 (24.46%)
        Female      12,634,377 (25.39%)
        Total       24,209,038 (24.94%)
      W
        Male        13,303,600 (28.11%)
        Female      13,770,284 (27.67%)
        Total       27,073,884 (27.89%)
    -----------------------------------------------------------
    Continuous variables tested via Regress
    Factor variables tested via Adjusted Wald Test

    • #3
      Thank you so much! That solution works perfectly.
