dtable kendall BUG

Jonathan Afilalo

Join Date: Nov 2016

Posts: 41
#1

08 May 2025, 09:28

Hi, I believe there is a bug in dtable whereby test(kendall) within factor() returns Kendall's tau-b instead of Prob > |z| (i.e., the p-value). All of the other tests return p-values, not coefficients.

Code:

dtable, by(ambulance_yesno) factor(triage_code, statistics(fvfrequency fvpercent) test(kendall))

Returns a test value of -0.10.

Code:

ktau ambulance_yesno triage_code

Returns a Kendall's tau-b of -0.10 and a p-value of 0.6047.

Can this bug be fixed so that kendall within dtable returns the p-value?

Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014
Posts: 698

08 May 2025, 16:54

dtable documents that it reports Kendall's τ_b, not a p-value.
This test statistic corresponds with the value reported by tabulate with option taub.
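
As a quick check of that correspondence, here is a minimal sketch using the auto dataset. It assumes that tabulate leaves the coefficient in r(taub) and that ktau leaves its results in r(tau_b) and r(p); the scalar names tau_tab, tau_ktau, and p_ktau are made up for illustration.

```stata
* Sketch: compare the statistic dtable reports with what -ktau- stores
sysuse auto, clear

* Kendall's tau-b as computed by -tabulate- with option taub
quietly tabulate foreign rep78, taub
scalar tau_tab = r(taub)

* -ktau- stores both the coefficient and its p-value
quietly ktau foreign rep78, stats(taub)
scalar tau_ktau = r(tau_b)
scalar p_ktau   = r(p)

display "tau-b (tabulate) = " %6.3f tau_tab
display "tau-b (ktau)     = " %6.3f tau_ktau
display "p-value (ktau)   = " %6.4f p_ktau
```

If the two coefficients agree, the only thing missing from the dtable "Test" column is the p-value that ktau stores alongside its coefficient.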

We will consider updating dtable to support a new ftest name for the p-value that corresponds with Kendall's τ_b.

In the meantime, if you want the p-values from ktau in your tables, you can add them to your collection and update the collection to show your p-value instead of Kendall's τ_b.

Here is an example of how you can do this.

Code:

sysuse auto

dtable, by(foreign, test) ///
    factor(rep78, statistics(fvfrequency fvpercent) test(kendall))

* show what dimensions are used in the layout
collect layout
* see all levels of the -by()- variable/dimension;
* note that '_dtable_test' identifies the column labeled "Test"
collect levels foreign
* see autolevels of the result dimension;
* _dtable_stats -- are the continuous and factor variable descriptive stats
* _dtable_test -- are the continuous and factor variable test stats
collect query autolevels result
* keep a copy of these levels, we will want to restore them after
* collecting the p-values
local auto_results = s(levels)
* see definition of composite result for tests;
* we are going to redefine this to use our p-value result instead of 'kendall'
collect query composite _dtable_test

* compute p-value
ktau foreign rep78, stats(taub)
* collect p-value, make sure to tag it properly
collect get p_taub=(r(p)), tags(foreign[_dtable_test] var[1.rep78])

* restore the result autolevels;
* -collect get- added p_taub to the autolevels, but we want it to be
* referenced in place of -kendall- within composite result
* '_dtable_test' 
collect style autolevels result `auto_results', clear

* style new p-value as you like
collect style cell result[p_taub], nformat(%6.3f) minimum(.001)

* redefine composite result using new p-value result
collect composite define _dtable_test = p_taub, trim replace

* replay layout
collect layout

Here is the resulting table.

Code:

-----------------------------------------------------------
                                  Car origin
                    Domestic    Foreign     Total     Test
-----------------------------------------------------------
N                  52 (70.3%) 22 (29.7%) 74 (100.0%)
Repair record 1978
  1                  2 (4.2%)   0 (0.0%)    2 (2.9%) <0.001
  2                 8 (16.7%)   0 (0.0%)   8 (11.6%)
  3                27 (56.2%)  3 (14.3%)  30 (43.5%)
  4                 9 (18.8%)  9 (42.9%)  18 (26.1%)
  5                  2 (4.2%)  9 (42.9%)  11 (15.9%)
-----------------------------------------------------------

Jonathan Afilalo

Join Date: Nov 2016

Posts: 41
#3

13 May 2025, 06:54

Thanks Jeff, the code you shared works, but it breaks down when my dtable contains multiple factor vars or a mix of factor vars and continuous vars. I tried modifying it as follows, without success:

Code:

collect levels `colvar'
collect query autolevels result
local autoresults = s(levels)
collect query composite _dtable_test
foreach facvar of local fvars {
    ktau `colvar' `facvar', stats(taub)
    collect get p_taub=(r(p)), tags(`colvar'[_dtable_test] var[1.`facvar'])
}
collect style autolevels result `autoresults', clear
collect style cell result[p_taub], nformat(%6.3f) minimum(.001)
collect composite define _dtable_test = p_taub, trim replace
collect layout

It also breaks down when my dtable structure is ever so slightly altered, for example:

Code:

sample(N, statistics(frequency) place(seplabels))

does not display the p-value after your code, whereas

Code:

sample(N, statistics(frequency) place(inlabels))

works well.

Is there a more robust solution? Better yet, please consider this a plea for updating dtable sooner rather than later to natively report Kendall's p-value. That is truly the expected statistic here, and I worry that many people will miss this nuance and falsely report the Kendall tau value in their papers instead of the p-value.

Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014
Posts: 698

13 May 2025, 08:58

How does it break down? Did you keep the regress result when you redefined composite result _dtable_test?

Here is an example where I add 2 more factor variables and some continuous variables. I highlight my edits to the original example.

Code:

sysuse auto

gen odd = mod(_n,2)
gen mod3 = mod(_n,3)

unab fvars : rep odd mod
unab colvar : for

dtable mpg turn trunk, by(`colvar', test) ///
    factor(`fvars', statistics(fvfrequency fvpercent) test(kendall))

* show what dimensions are used in the layout
collect layout
* see all levels of the -by()- variable/dimension;
* note that '_dtable_test' identifies the column labeled "Test"
collect levels `colvar'
* see autolevels of the result dimension;
* _dtable_stats -- are the continuous and factor variable descriptive stats
* _dtable_test -- are the continuous and factor variable test stats
collect query autolevels result
* keep a copy of these levels, we will want to restore them after
* collecting the p-values
local auto_results = s(levels)
* see definition of composite result for tests;
* we are going to redefine this to use our p-value result instead of 'kendall'
collect query composite _dtable_test

foreach facvar of local fvars {
    * get first level of facvar
    collect levels `facvar'
    local levels `"`s(levels)'"'
    gettoken first : levels
    * compute p-value
    ktau `colvar' `facvar', stats(taub)
    * collect p-value, make sure to tag it properly
    collect get p_taub=(r(p)), ///
        tags(`colvar'[_dtable_test] var[`first'.`facvar'])
}

* restore the result autolevels;
* -collect get- added p_taub to the autolevels, but we want it to be
* referenced in place of -kendall- within composite result
* '_dtable_test'
collect style autolevels result `auto_results', clear

* style new p-value as you like
collect style cell result[p_taub], nformat(%6.3f) minimum(.001)

* redefine composite result using new p-value result
collect composite define _dtable_test = p_taub regress, trim replace

* replay layout
collect layout

Here is the resulting table.

Code:

-------------------------------------------------------------------------
                                           Car origin
                         Domestic        Foreign         Total      Test
-------------------------------------------------------------------------
N                         52 (70.3%)     22 (29.7%)    74 (100.0%)
Mileage (mpg)         19.827 (4.743) 24.773 (6.611) 21.297 (5.786) <0.001
Turn circle (ft.)     41.442 (3.968) 35.409 (1.501) 39.649 (4.399) <0.001
Trunk space (cu. ft.) 14.750 (4.306) 11.409 (3.217) 13.757 (4.277)  0.002
Repair record 1978
  1                         2 (4.2%)       0 (0.0%)       2 (2.9%) <0.001
  2                        8 (16.7%)       0 (0.0%)      8 (11.6%)
  3                       27 (56.2%)      3 (14.3%)     30 (43.5%)
  4                        9 (18.8%)      9 (42.9%)     18 (26.1%)
  5                         2 (4.2%)      9 (42.9%)     11 (15.9%)
odd
  0                       26 (50.0%)     11 (50.0%)     37 (50.0%)  1.000
  1                       26 (50.0%)     11 (50.0%)     37 (50.0%)
mod3
  0                       17 (32.7%)      7 (31.8%)     24 (32.4%)  0.831
  1                       18 (34.6%)      7 (31.8%)     25 (33.8%)
  2                       17 (32.7%)      8 (36.4%)     25 (33.8%)
-------------------------------------------------------------------------

Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014
Posts: 698

13 May 2025, 08:59

When you change the placement of the sample statistics with option place(seplabels), this changes the layout to include a new dimension in the column specification. This new dimension is named _dtable_sample_dim, and its levels are labeled with the sample statistics. You need to make sure to tag your custom p-values with the _hide level.

Here is an example, based on the above (most recent), that places the sample statistics in the column headers with option place(seplabels). I highlight my edits to the original example.

Code:

sysuse auto

gen odd = mod(_n,2)
gen mod3 = mod(_n,3)

unab fvars : rep odd mod
unab colvar : for

dtable mpg turn trunk, by(`colvar', test) ///
    sample(N, statistics(frequency) place(seplabels)) ///
    factor(`fvars', statistics(fvfrequency fvpercent) test(kendall))

* show what dimensions are used in the layout
collect layout
* see all levels of the -by()- variable/dimension;
* note that '_dtable_test' identifies the column labeled "Test"
collect levels `colvar'
* with -place(seplabels)-, you get a new dimension in the column specification;
* the level "_hide" is the one used for the "Test" column
collect levels _dtable_sample_dim
* see autolevels of the result dimension;
* _dtable_stats -- are the continuous and factor variable descriptive stats
* _dtable_test -- are the continuous and factor variable test stats
collect query autolevels result
* keep a copy of these levels, we will want to restore them after
* collecting the p-values
local auto_results = s(levels)
* see definition of composite result for tests;
* we are going to redefine this to use our p-value result instead of 'kendall'
collect query composite _dtable_test

foreach facvar of local fvars {
    * get first level of facvar
    collect levels `facvar'
    local levels `"`s(levels)'"'
    gettoken first : levels
    * compute p-value
    ktau `colvar' `facvar', stats(taub)
    * collect p-value, make sure to tag it properly
    collect get p_taub=(r(p)), ///
        tags(`colvar'[_dtable_test] ///
         var[`first'.`facvar'] ///
         _dtable_sample_dim[_hide] ///
    )
}

* restore the result autolevels;
* -collect get- added p_taub to the autolevels, but we want it to be
* referenced in place of -kendall- within composite result
* '_dtable_test'
collect style autolevels result `auto_results', clear

* style new p-value as you like
collect style cell result[p_taub], nformat(%6.3f) minimum(.001)

* redefine composite result using new p-value result
collect composite define _dtable_test = p_taub regress, trim replace

* replay layout
collect layout

Here is the resulting table.

Code:

-------------------------------------------------------------------------
                                           Car origin
                         Domestic        Foreign         Total      Test
                            52             22             74
-------------------------------------------------------------------------
Mileage (mpg)         19.827 (4.743) 24.773 (6.611) 21.297 (5.786) <0.001
Turn circle (ft.)     41.442 (3.968) 35.409 (1.501) 39.649 (4.399) <0.001
Trunk space (cu. ft.) 14.750 (4.306) 11.409 (3.217) 13.757 (4.277)  0.002
Repair record 1978
  1                         2 (4.2%)       0 (0.0%)       2 (2.9%) <0.001
  2                        8 (16.7%)       0 (0.0%)      8 (11.6%)
  3                       27 (56.2%)      3 (14.3%)     30 (43.5%)
  4                        9 (18.8%)      9 (42.9%)     18 (26.1%)
  5                         2 (4.2%)      9 (42.9%)     11 (15.9%)
odd
  0                       26 (50.0%)     11 (50.0%)     37 (50.0%)  1.000
  1                       26 (50.0%)     11 (50.0%)     37 (50.0%)
mod3
  0                       17 (32.7%)      7 (31.8%)     24 (32.4%)  0.831
  1                       18 (34.6%)      7 (31.8%)     25 (33.8%)
  2                       17 (32.7%)      8 (36.4%)     25 (33.8%)
-------------------------------------------------------------------------

Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014

Posts: 698
#6

13 May 2025, 09:01

... consider this a plea for updating dtable sooner than later to natively report Kendall's p-value (which is truly the expected statistic here, I worry that many people will miss this nuance and falsely report the Kendall tau value in their papers instead of the p-value).

We hear you and will do our best.

Jonathan Afilalo

Join Date: Nov 2016
Posts: 41

14 May 2025, 06:22

Thanks again Jeff, your code works beautifully! One remaining issue if you would be so kind... My custom dtable ado automates the tedious task of parsing a large list of inputted vars into cvars and fvars and it further parses fvars to isolate the binary vars (to automate the task of typing "1." before binary vars so that they are displayed on only 1 line in the table). That said, your code above fails to display the p-value for 1.binary vars such as "female" - see below:

Code:

----------------------------------------------------------------
                           (firstnm) adm_prov_cat               
                0           1           2         Total     Test
             N=2,113       N=23       N=149      N=2,285        
----------------------------------------------------------------
age        74.6 ± 13.7 74.5 ± 14.5 76.3 ± 12.3 74.7 ± 13.7 0.343
urg_triage                                                      
  1            67 (3%)      0 (0%)      0 (0%)     67 (3%) 0.356
  2        1,023 (49%)      0 (0%)     1 (33%) 1,024 (49%)      
  3          883 (42%)    1 (100%)     2 (67%)   886 (42%)      
  4            78 (4%)      0 (0%)      0 (0%)     78 (4%)      
  5            35 (2%)      0 (0%)      0 (0%)     35 (2%)      
female     1,059 (50%)    10 (43%)    82 (55%) 1,151 (50%)      
----------------------------------------------------------------

More generally, I hoped that the automated parsing of vars could be facilitated by (a) some empiric rules - which I have coded, and (b) some user defined prefixes such as "c." before continuous vars, "i." before non-binary factor vars, and "i1." before binary factor vars - which does not work because "factor-variables and time-series operators not allowed". Here is my ado code:

Code:

capture program drop mydtable
program define mydtable, rclass
    version 17.0
    
    /***************************************************************************
    1) Parse syntax
       - varlist: the row variables
       - by(varname): grouping variable
       - np/trend/missing/nototals: your existing flags
       - pdf/docx/xlsx/csv: new flags for export
    ***************************************************************************/
    syntax varlist(min=1) [ , ///
        BY(varname) NP TREND MISSING NOTOTALS ///
        PDF DOCX XLSX CSV ]

    /***************************************************************************
    2) Defaults
    ***************************************************************************/
    local cstats "msd"
    local ctest "regress"
    local ftest "pearson"
    local byopts "tests"

    if "`np'" != "" {
        local cstats "q2 iqi"
        local ctest "kwallis"
    }
    if "`trend'" != "" {
        local ftest "kendall"
    }
    if "`missing'" != "" {
        local byopts "`byopts' missing"
    }
    if "`nototals'" != "" {
        local byopts "`byopts' nototals"
    }
    
    local rowvars "`varlist'"
    local colvar  "`by'"

    /***************************************************************************
    3) Check how many export flags are set
    ***************************************************************************/
    local n_export = ("`pdf'"  != "") + ("`docx'" != "") + ("`xlsx'" != "") + ("`csv'"  != "")
    if `n_export' > 1 {
        di as err "You may specify only one of pdf/docx/xlsx/csv."
        exit 198
    }

    local format ""
    if "`pdf'"  !="" local format "pdf"
    if "`docx'" !="" local format "docx"
    if "`xlsx'" !="" local format "xlsx"
    if "`csv'"  !="" local format "csv"

    /***************************************************************************
    4) Classification macros
    ***************************************************************************/
    local cvars ""
    local fvars ""
    local bvars ""
    local b1vars ""

    foreach v of local rowvars {
        local prefix   = substr("`v'",1,2)
        local stripped = substr("`v'",3,.)

        if inlist("`prefix'","c.","i.","i1.") {
            if "`prefix'"=="c." {
                local cvars "`cvars' `stripped'"
            }
            else if "`prefix'"=="i." {
                local fvars "`fvars' `stripped'"
            }
            else if "`prefix'"=="i1." {
                local bvars "`bvars' `stripped'"
                local b1vars "`b1vars' 1.`stripped'"
            }
        }
        else {
            quietly levelsof `v', local(levels)
            local nlevels = wordcount("`levels'")

            if `nlevels' == 2 {
                local bvars "`bvars' `v'"
                local b1vars "`b1vars' 1.`v'"
            }
            else if `nlevels' >= 3 & `nlevels' <= 9 {
                local fvars "`fvars' `v'"
            }
            else {
                local cvars "`cvars' `v'"
            }
        }
    }

    /***************************************************************************
    5) Debug display (optional)
    ***************************************************************************/
    di "Grouping var: `colvar'"
    di "Continuous  : `cvars'"
    di "Categorical : `fvars'"
    di "Binary      : `bvars'"

    /***************************************************************************
    6) dtable command
    ***************************************************************************/

    dtable, by(`colvar', `byopts') ///
        continuous(`cvars', statistics(`cstats') test(`ctest')) ///
        factor(`fvars' `b1vars', statistics(fvfrequency fvpercent) test(`ftest')) ///
        sample(N, statistics(frequency) place(seplabels)) ///
        define(msd = mean sd, delimiter(" ± ")) ///
        define(iqi = q1 q3, delimiter("-")) ///
        sformat("N=%s" frequency) ///
        sformat("%s" mean sd msd) ///
        sformat("(%s)" iqi) ///
        nformat(%2.0f fvpercent) ///
        nformat(%9.1fc msd q2 iqi) ///
        halign(right)

    // Example: hide raw binary columns
    collect style header `bvars', level(hide)

    // Show preview
    // collect preview

    collect layout
    collect levels `colvar'
    collect levels _dtable_sample_dim
    collect query autolevels result
    local auto_results = s(levels)
    collect query composite _dtable_test
    foreach facvar of local fvars {
        collect levels `facvar'
        local levels `"`s(levels)'"'
        gettoken first : levels
        ktau `colvar' `facvar', stats(taub)
        collect get p_taub=(r(p)), tags(`colvar'[_dtable_test] var[`first'.`facvar'] _dtable_sample_dim[_hide])
    }
    collect style autolevels result `auto_results', clear
    collect style cell result[p_taub], nformat(%6.3f) minimum(.001)
    collect composite define _dtable_test = p_taub regress, trim replace
    collect layout

    /***************************************************************************
    7) If user requested pdf/docx/xlsx/csv, do collect export
    ***************************************************************************/
    if `"`format'"' != "" {
        collect export "test.`format'", replace
    }
end

Nick Cox

Join Date: Mar 2014

Posts: 35685
#8

14 May 2025, 08:37

Interesting to learn that there is some risk of confusing rank correlations with P-values.

Let's hope that negative correlations particularly don't get reported that way.

I suppose that in fields where these measures are popular there is a real risk that a correlation of 0.01 will be taken as a strong result.

(Rhetorical question: Don't people look at graphs too?)

Jonathan Afilalo

Join Date: Nov 2016

Posts: 41
#9

14 May 2025, 12:49

Exactly. The confusion arises because regress and other tests in dtable report the p-value in the test column. Kendall reports the tau in the test column. If the dtable contains a combination of cvars and fvars, then the test column will contain a mix of p-values and rank correlations. Novice researchers may not pick up on this or question it especially if the tau is positive and "looks like" a plausible p-value.

Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014
Posts: 698

#10

14 May 2025, 14:20

Your program loops over the list of variables in fvars but not the ones in bvars.

In your loop over rowvars, I see your references to factor-variable operators c. and i., but your syntax command is missing the fv modifier in your varlist() specification. When you add fv, the c. operator will not be preserved in the parsed variables list, and you will need to check for interactions. To simplify the loop over rowvars, I suggest you use fvexpand to create the rowvars list, then loop over the elements to look for the dot operator and handle the base b. operator.

fvexpand will place the factor levels in numeric order, so a 0/1 variable f will be expanded from i.f to 0b.f 1.f, thus you can loop over the individual factor level variables in rowvars and let the 1. operator move f from fvars to bvars (and b1vars) and any other level value move it back to fvars.
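
To see the expansion described above before changing the program, a small sketch along these lines may help. It uses the auto dataset with a generated 0/1 variable odd; the local names v, dot, op, and name are only illustrative.

```stata
sysuse auto, clear
generate byte odd = mod(_n, 2)

* i.odd expands to one element per level; the base level carries a b
fvexpand i.odd
display "`r(varlist)'"

* a restricted level list, as used in the second example further down
fvexpand i(1 4).rep78
display "`r(varlist)'"

* split one expanded element into its operator and variable name
local v    "0b.odd"
local dot  = strpos("`v'", ".")
local op   = substr("`v'", 1, `dot' - 1)
local name = substr("`v'", `dot' + 1, .)
display "operator: `op'   variable: `name'"
```

The last three lines mirror the parsing step in the revised program: everything before the dot is the level operator (possibly with a trailing b for the base level), and everything after it is the variable name.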

Here is a list of the changes I would make to your mydtable program:

- change your program's version to 18.0, when dtable was introduced
- add fv modifier to varlist() in the syntax command
- make option by() required, your code assumes it
- check for interactions, exit with error if found
- use fvexpand to construct the rowvars list
- use the level (op) of operated variables in rowvars to assign them to lists fvars and bvars/b1vars
- use levelsof only for non-negative integer-valued variables; only put 0/1 valued variables in the bvars/b1vars lists
- require option trend for the p_taub loop over the fvars
- add the p_taub loop for the bvars
- fix composite result _dtable_test definition to keep the continuous test p-value result

In the following I highlight my changes in blue and usually with an edit: comment.

Code:

program define mydtable, rclass
    version 18.0    // edit: dtable introduced in Stata 18

    /***************************************************************************
    1) Parse syntax
       - varlist: the row variables
       - by(varname): grouping variable
       - np/trend/missing/nototals: your existing flags
       - pdf/docx/xlsx/csv: new flags for export
    ***************************************************************************/
    syntax varlist(min=1 fv) , /// edit: allow factor variables notation
        BY(varname) [ NP TREND MISSING NOTOTALS /// edit: by() is required
        PDF DOCX XLSX CSV ]

    * edit: disallow interactions
    local sharp : subinstr local varlist "#" "?", all count(local k_sharp)
    if `k_sharp' {
        di as err "interactions not allowed"
        exit 198
    }

    /***************************************************************************
    2) Defaults
    ***************************************************************************/
    local cstats "msd"
    local ctest "regress"
    local ftest "pearson"
    local byopts "tests"

    if "`np'" != "" {
        local cstats "q2 iqi"
        local ctest "kwallis"
    }
    if "`trend'" != "" {
        local ftest "kendall"
    }
    if "`missing'" != "" {
        local byopts "`byopts' missing"
    }
    if "`nototals'" != "" {
        local byopts "`byopts' nototals"
    }

    * edit: expand factor variables to their factor-level form
    fvexpand `varlist'
    local rowvars "`r(varlist)'"
    local colvar  "`by'"

    /***************************************************************************
    3) Check how many export flags are set
    ***************************************************************************/
    local n_export = ("`pdf'"  != "") + ("`docx'" != "") + ("`xlsx'" != "") + ("`csv'"  != "")
    if `n_export' > 1 {
        di as err "You may specify only one of pdf/docx/xlsx/csv."
        exit 198
    }

    local format ""
    if "`pdf'"  !="" local format "pdf"
    if "`docx'" !="" local format "docx"
    if "`xlsx'" !="" local format "xlsx"
    if "`csv'"  !="" local format "csv"

    /***************************************************************************
    4) Classification macros
    ***************************************************************************/
    local cvars ""
    local fvars ""
    local bvars ""
    local b1vars ""

    foreach v of local rowvars {
        * edit: handle FV operator
        local dot = strpos("`v'", ".")
        if `dot' {
            local op = substr("`v'", 1, `dot'-1)
            if strmatch("`op'","*b") {
                // ignore base operator
                local op = substr("`v'", 1, `dot'-2)
            }
            local name = substr("`v'", `dot'+1, .)
            if `op' == 1 {
                local bvars : list bvars | name
                local b1vars "`b1vars' 1.`name'"
                local fvars : list fvars - name
            }
            else {
                local fvars : list fvars | name
                local bvars : list bvars - name
                local 1name 1.`name'
                local b1vars : list b1vars - 1name
            }
        }
        else {
            * edit: check for non-negative integer value variables
            capture assert `v' == floor(`v') & `v' >= 0
            if c(rc) {
                local cvars "`cvars' `v'"
            }
            else {
                quietly levelsof `v', local(levels)
                local nlevels = wordcount("`levels'")
    
                * edit: identify 0/1 variables
                if `"`levels'"' == "0 1" {
                    local bvars "`bvars' `v'"
                    local b1vars "`b1vars' 1.`v'"
                }
                else if `nlevels' <= 9 {
                    local fvars "`fvars' `v'"
                }
                else {
                    local cvars "`cvars' `v'"
                }
            }
        }
    }

    /***************************************************************************
    5) Debug display (optional)
    ***************************************************************************/
    di "Grouping var: `colvar'"
    di "Continuous  : `cvars'"
    di "Categorical : `fvars'"
    di "Binary      : `bvars'"

    /***************************************************************************
    6) dtable command
    ***************************************************************************/

    dtable, by(`colvar', `byopts') ///
        continuous(`cvars', statistics(`cstats') test(`ctest')) ///
        factor(`fvars' `b1vars', statistics(fvfrequency fvpercent) test(`ftest')) ///
        sample(N, statistics(frequency) place(seplabels)) ///
        define(msd = mean sd, delimiter(" ± ")) ///
        define(iqi = q1 q3, delimiter("-")) ///
        sformat("N=%s" frequency) ///
        sformat("%s" mean sd msd) ///
        sformat("(%s)" iqi) ///
        nformat(%2.0f fvpercent) ///
        nformat(%9.1fc msd q2 iqi) ///
        halign(right)

    // Example: hide raw binary columns
    collect style header `bvars', level(hide)

    // Show preview
    // collect preview

    * edit: replace kendall with p_taub
    if "`trend'" != "" {
        *collect layout
        *collect levels `colvar'
        *collect levels _dtable_sample_dim
        quietly collect query autolevels result
        local auto_results = s(levels)
        *collect query composite _dtable_test
        foreach facvar of local fvars {
            quietly collect levels `facvar'
            local levels `"`s(levels)'"'
            gettoken first : levels
            quietly ktau `colvar' `facvar', stats(taub)
            collect get p_taub=(r(p)), ///
                tags(`colvar'[_dtable_test] var[`first'.`facvar'] _dtable_sample_dim[_hide])
        }
        * edit: add loop for 0/1 variables
        foreach binvar of local bvars {
            quietly ktau `colvar' `binvar', stats(taub)
            collect get p_taub=(r(p)), ///
                tags(`colvar'[_dtable_test] var[1.`binvar'] _dtable_sample_dim[_hide])
        }
        collect style autolevels result `auto_results', clear
        collect style cell result[p_taub], nformat(%6.3f) minimum(.001)
        * edit: keep continuous test result
        collect composite define _dtable_test = p_taub `ctest', trim replace
    }
    collect layout

    /***************************************************************************
    7) If user requested pdf/docx/xlsx/csv, do collect export
    ***************************************************************************/
    if `"`format'"' != "" {
        collect export "test.`format'", replace
    }
end

Here is the command in action without factor variable operators. In the specified variables list:

- the 0/1 variables are expensive and odd
- headroom has 8 unique values, but is not integer valued
- rep78 and mod2 meet the factor variable requirements
- all other variables are recognized as continuous
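
The integer check behind this classification can be tried on its own. The following is a rough sketch of the same test the program applies, run on headroom and rep78; the loop and the local names rc and nlevels are illustrative and not part of mydtable.

```stata
sysuse auto, clear

foreach v in headroom rep78 {
    * rc is 0 only when every value is a non-negative integer
    * (missing values pass this check)
    capture assert `v' == floor(`v') & `v' >= 0
    local rc = _rc
    * count the distinct nonmissing values
    quietly levelsof `v', local(levels)
    local nlevels : word count `levels'
    display "`v': assert rc = `rc', distinct values = `nlevels'"
}
```

A nonzero return code sends the variable to the continuous list regardless of how few distinct values it has, which is why headroom is treated as continuous here.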

Code:

sysuse auto
gen odd = mod(_n,2)
gen mod2 = mod(_n,3)
gen expensive = price > 12000

mydtable expensive mpg rep78 headroom trunk turn gear odd mod2, by(for) trend

Here is the resulting table.

Code:

-------------------------------------------------------------
                                     Car origin              
                       Domestic    Foreign     Total    Test 
                         N=52       N=22       N=74          
-------------------------------------------------------------
Mileage (mpg)         19.8 ± 4.7 24.8 ± 6.6 21.3 ± 5.8 <0.001
Headroom (in.)         3.2 ± 0.9  2.6 ± 0.5  3.0 ± 0.8  0.011
Trunk space (cu. ft.) 14.8 ± 4.3 11.4 ± 3.2 13.8 ± 4.3  0.002
Turn circle (ft.)     41.4 ± 4.0 35.4 ± 1.5 39.6 ± 4.4 <0.001
Gear ratio             2.8 ± 0.3  3.5 ± 0.3  3.0 ± 0.5 <0.001
Repair record 1978                                           
  1                       2 (4%)     0 (0%)     2 (3%) <0.001
  2                      8 (17%)     0 (0%)    8 (12%)       
  3                     27 (56%)    3 (14%)   30 (43%)       
  4                      9 (19%)    9 (43%)   18 (26%)       
  5                       2 (4%)    9 (43%)   11 (16%)       
mod2                                                         
  0                     17 (33%)    7 (32%)   24 (32%)  0.831
  1                     18 (35%)    7 (32%)   25 (34%)       
  2                     17 (33%)    8 (36%)   25 (34%)       
expensive                 4 (8%)     1 (5%)     5 (7%)  0.634
odd                     26 (50%)   11 (50%)   37 (50%)  1.000
-------------------------------------------------------------

Here is an example with factor variable operators. In the specified variables list:

- use 1. on variable expensive, which saves us from running levelsof for this variable
- use i(1 4). on variable rep78; the 4 causes rep78 to be put in the fvars list, otherwise the levels are ignored
- use i. on odd, which expands to 0b.odd 1.odd, so odd will be put in the bvars/b1vars lists
- use 1. on mod2, which forces it to be treated as if it were a 0/1 variable, so it will be put in the bvars/b1vars lists; however, all its values will be used to compute the p_taub p-value

Code:

mydtable 1.expensive mpg i(1 4).rep78 headroom trunk turn gear i.odd 1.mod2, by(for) trend

Here is the resulting table.

Code:

-------------------------------------------------------------
                                     Car origin
                       Domestic    Foreign     Total    Test
                         N=52       N=22       N=74
-------------------------------------------------------------
Mileage (mpg)         19.8 ± 4.7 24.8 ± 6.6 21.3 ± 5.8 <0.001
Headroom (in.)         3.2 ± 0.9  2.6 ± 0.5  3.0 ± 0.8  0.011
Trunk space (cu. ft.) 14.8 ± 4.3 11.4 ± 3.2 13.8 ± 4.3  0.002
Turn circle (ft.)     41.4 ± 4.0 35.4 ± 1.5 39.6 ± 4.4 <0.001
Gear ratio             2.8 ± 0.3  3.5 ± 0.3  3.0 ± 0.5 <0.001
Repair record 1978
  1                       2 (4%)     0 (0%)     2 (3%) <0.001
  2                      8 (17%)     0 (0%)    8 (12%)
  3                     27 (56%)    3 (14%)   30 (43%)
  4                      9 (19%)    9 (43%)   18 (26%)
  5                       2 (4%)    9 (43%)   11 (16%)
expensive                 4 (8%)     1 (5%)     5 (7%)  0.634
odd                     26 (50%)   11 (50%)   37 (50%)  1.000
mod2                    18 (35%)    7 (32%)   25 (34%)  0.831
-------------------------------------------------------------

Jonathan Afilalo

Join Date: Nov 2016

Posts: 41
#11

15 May 2025, 10:06

WOW! It works like a charm, and more so than the good feeling of working code, I am grateful for your generous help and super expertise! Hopefully this ado will be useful to others...