
  • dtable kendall BUG

    Hi, I believe there is a bug in dtable whereby ftest(kendall) returns Kendall's tau-b instead of the Prob > |z| (i.e. p-value). All of the other tests return p-values not coefficients.

    Code:
    dtable, by(ambulance_yesno) factor(triage_code, statistics(fvfrequency fvpercent) test(kendall))
    Returns a test value of -0.10.

    Code:
    ktau ambulance_yesno triage_code
    Returns a Kendall's tau-b of -0.10 and a p-value of 0.6047.

    Can this bug be fixed so that kendall within dtable returns the p-value?

  • #2
    dtable documents that it reports Kendall's τ_b, not a p-value.
    This test statistic corresponds with the value reported by tabulate with option taub.
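
    As a rough check of that correspondence, here is a minimal sketch that uses the auto data rather than the original poster's variables (the local name tab_taub is only a placeholder). It compares the τ_b stored by tabulate with the τ_b and p-value that ktau leaves behind in r().
    Code:
    sysuse auto, clear
    
    * tau-b as reported by -tabulate- with option -taub-
    tabulate foreign rep78, taub
    local tab_taub = r(taub)
    
    * tau-b and its p-value as reported by -ktau-
    ktau foreign rep78, stats(taub)
    display "tabulate tau-b = " %6.4f `tab_taub'
    display "ktau tau-b     = " %6.4f r(tau_b)
    display "ktau p-value   = " %6.4f r(p)
    The two τ_b values should agree; only ktau supplies the p-value.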

    We will consider updating dtable to support a new ftest name for the p-value that corresponds with Kendall's τ_b.

    In the meantime, if you want the p-values from ktau in your tables, you can add them to your collection and update the collection to show your p-value instead of Kendall's τ_b.

    Here is an example of how you can do this.
    Code:
    sysuse auto
    
    dtable, by(foreign, test) ///
        factor(rep78, statistics(fvfrequency fvpercent) test(kendall))
    
    * show what dimensions are used in the layout
    collect layout
    * see all levels of the -by()- variable/dimension;
    * note that '_dtable_test' identifies the column labeled "Test"
    collect levels foreign
    * see autolevels of the result dimension;
    * _dtable_stats -- are the continuous and factor variable descriptive stats
    * _dtable_test -- are the continuous and factor variable test stats
    collect query autolevels result
    * keep a copy of these levels, we will want to restore them after
    * collecting the p-values
    local auto_results = s(levels)
    * see definition of composite result for tests;
    * we are going to redefine this to use our p-value result instead of 'kendall'
    collect query composite _dtable_test
    
    * compute p-value
    ktau foreign rep78, stats(taub)
    * collect p-value, make sure to tag it properly
    collect get p_taub=(r(p)), tags(foreign[_dtable_test] var[1.rep78])
    
    * restore the result autolevels;
    * -collect get- added p_taub to the autolevels, but we want it to be
    * referenced in place of -kendall- within composite result
    * '_dtable_test' 
    collect style autolevels result `auto_results', clear
    
    * style new p-value as you like
    collect style cell result[p_taub], nformat(%6.3f) minimum(.001)
    
    * redefine composite result using new p-value result
    collect composite define _dtable_test = p_taub, trim replace
    
    * replay layout
    collect layout
    Here is the resulting table.
    Code:
    -----------------------------------------------------------
                                      Car origin
                        Domestic    Foreign     Total     Test
    -----------------------------------------------------------
    N                  52 (70.3%) 22 (29.7%) 74 (100.0%)
    Repair record 1978
      1                  2 (4.2%)   0 (0.0%)    2 (2.9%) <0.001
      2                 8 (16.7%)   0 (0.0%)   8 (11.6%)
      3                27 (56.2%)  3 (14.3%)  30 (43.5%)
      4                 9 (18.8%)  9 (42.9%)  18 (26.1%)
      5                  2 (4.2%)  9 (42.9%)  11 (15.9%)
    -----------------------------------------------------------



    • #3
      Thanks Jeff, the code you shared works, but it breaks down when my dtable contains multiple factor vars or a mix of factor vars and continuous vars. I tried modifying it as follows, without success:

      Code:
      collect levels `colvar'
      collect query autolevels result
      local autoresults = s(levels)
      collect query composite _dtable_test
      foreach facvar of local fvars{
          ktau `colvar' `facvar', stats(taub)
          collect get p_taub=(r(p)), tags(`colvar'[_dtable_test] var[1.`facvar'])
      }
      collect style autolevels result `autoresults', clear
      collect style cell result[p_taub], nformat(%6.3f) minimum(.001)
      collect composite define _dtable_test = p_taub, trim replace
      collect layout
      It also breaks down when my dtable structure is ever so slightly altered, for example:

      Code:
      sample(N, statistics(frequency) place(seplabels))
      does not display the p-value after your code, whereas
      Code:
      sample(N, statistics(frequency) place(inlabels))
      works well.

      Is there a more robust solution, or better yet, consider this a plea for updating dtable sooner than later to natively report Kendall's p-value (which is truly the expected statistic here, I worry that many people will miss this nuance and falsely report the Kendall tau value in their papers instead of the p-value).



      • #4
        How does it break down? Did you keep the regress result when you redefined composite result _dtable_test?

        Here is an example where I add 2 more factor variables and some continuous variables. I highlight my edits to the original example.
        Code:
        sysuse auto
        
        gen odd = mod(_n,2)
        gen mod3 = mod(_n,3)
        
        unab fvars : rep odd mod
        unab colvar : for
        
        dtable mpg turn trunk, by(`colvar', test) ///
            factor(`fvars', statistics(fvfrequency fvpercent) test(kendall))
        
        * show what dimensions are used in the layout
        collect layout
        * see all levels of the -by()- variable/dimension;
        * note that '_dtable_test' identifies the column labeled "Test"
        collect levels `colvar'
        * see autolevels of the result dimension;
        * _dtable_stats -- are the continuous and factor variable descriptive stats
        * _dtable_test -- are the continuous and factor variable test stats
        collect query autolevels result
        * keep a copy of these levels, we will want to restore them after
        * collecting the p-values
        local auto_results = s(levels)
        * see definition of composite result for tests;
        * we are going to redefine this to use our p-value result instead of 'kendall'
        collect query composite _dtable_test
        
        foreach facvar of local fvars {
            * get first level of facvar
            collect levels `facvar'
            local levels `"`s(levels)'"'
            gettoken first : levels
            * compute p-value
            ktau `colvar' `facvar', stats(taub)
            * collect p-value, make sure to tag it properly
            collect get p_taub=(r(p)), ///
                tags(`colvar'[_dtable_test] var[`first'.`facvar'])
        }
        
        * restore the result autolevels;
        * -collect get- added p_taub to the autolevels, but we want it to be
        * referenced in place of -kendall- within composite result
        * '_dtable_test'
        collect style autolevels result `auto_results', clear
        
        * style new p-value as you like
        collect style cell result[p_taub], nformat(%6.3f) minimum(.001)
        
        * redefine composite result using new p-value result
        collect composite define _dtable_test = p_taub regress, trim replace
        
        * replay layout
        collect layout
        Here is the resulting table.
        Code:
        -------------------------------------------------------------------------
                                                   Car origin
                                 Domestic        Foreign         Total      Test
        -------------------------------------------------------------------------
        N                         52 (70.3%)     22 (29.7%)    74 (100.0%)
        Mileage (mpg)         19.827 (4.743) 24.773 (6.611) 21.297 (5.786) <0.001
        Turn circle (ft.)     41.442 (3.968) 35.409 (1.501) 39.649 (4.399) <0.001
        Trunk space (cu. ft.) 14.750 (4.306) 11.409 (3.217) 13.757 (4.277)  0.002
        Repair record 1978
          1                         2 (4.2%)       0 (0.0%)       2 (2.9%) <0.001
          2                        8 (16.7%)       0 (0.0%)      8 (11.6%)
          3                       27 (56.2%)      3 (14.3%)     30 (43.5%)
          4                        9 (18.8%)      9 (42.9%)     18 (26.1%)
          5                         2 (4.2%)      9 (42.9%)     11 (15.9%)
        odd
          0                       26 (50.0%)     11 (50.0%)     37 (50.0%)  1.000
          1                       26 (50.0%)     11 (50.0%)     37 (50.0%)
        mod3
          0                       17 (32.7%)      7 (31.8%)     24 (32.4%)  0.831
          1                       18 (34.6%)      7 (31.8%)     25 (33.8%)
          2                       17 (32.7%)      8 (36.4%)     25 (33.8%)
        -------------------------------------------------------------------------



        • #5
          When you change the placement of the sample statistics with option place(seplabels), this changes the layout to include a new dimension in the column specification. This new dimension is named _dtable_sample_dim, and its levels are labeled with the sample statistics. You need to make sure to tag your custom p-values with the _hide level.

          Here is an example, based on the above (most recent), that places the sample statistics in the column headers with option place(seplabels). I highlight my edits to the original example.
          Code:
          sysuse auto
          
          gen odd = mod(_n,2)
          gen mod3 = mod(_n,3)
          
          unab fvars : rep odd mod
          unab colvar : for
          
          dtable mpg turn trunk, by(`colvar', test) ///
              sample(N, statistics(frequency) place(seplabels)) ///
              factor(`fvars', statistics(fvfrequency fvpercent) test(kendall))
          
          * show what dimensions are used in the layout
          collect layout
          * see all levels of the -by()- variable/dimension;
          * note that '_dtable_test' identifies the column labeled "Test"
          collect levels `colvar'
          * with -place(seplabels)-, you get a new dimension in the column specification;
          * the level "_hide" is the one used for the "Test" column
          collect levels _dtable_sample_dim
          * see autolevels of the result dimension;
          * _dtable_stats -- are the continuous and factor variable descriptive stats
          * _dtable_test -- are the continuous and factor variable test stats
          collect query autolevels result
          * keep a copy of these levels, we will want to restore them after
          * collecting the p-values
          local auto_results = s(levels)
          * see definition of composite result for tests;
          * we are going to redefine this to use our p-value result instead of 'kendall'
          collect query composite _dtable_test
          
          foreach facvar of local fvars {
              * get first level of facvar
              collect levels `facvar'
              local levels `"`s(levels)'"'
              gettoken first : levels
              * compute p-value
              ktau `colvar' `facvar', stats(taub)
              * collect p-value, make sure to tag it properly
              collect get p_taub=(r(p)), ///
                  tags(`colvar'[_dtable_test] ///
                   var[`first'.`facvar'] ///
                   _dtable_sample_dim[_hide] ///
              )
          }
          
          * restore the result autolevels;
          * -collect get- added p_taub to the autolevels, but we want it to be
          * referenced in place of -kendall- within composite result
          * '_dtable_test'
          collect style autolevels result `auto_results', clear
          
          * style new p-value as you like
          collect style cell result[p_taub], nformat(%6.3f) minimum(.001)
          
          * redefine composite result using new p-value result
          collect composite define _dtable_test = p_taub regress, trim replace
          
          * replay layout
          collect layout
          Here is the resulting table.
          Code:
          -------------------------------------------------------------------------
                                                     Car origin
                                   Domestic        Foreign         Total      Test
                                      52             22             74
          -------------------------------------------------------------------------
          Mileage (mpg)         19.827 (4.743) 24.773 (6.611) 21.297 (5.786) <0.001
          Turn circle (ft.)     41.442 (3.968) 35.409 (1.501) 39.649 (4.399) <0.001
          Trunk space (cu. ft.) 14.750 (4.306) 11.409 (3.217) 13.757 (4.277)  0.002
          Repair record 1978
            1                         2 (4.2%)       0 (0.0%)       2 (2.9%) <0.001
            2                        8 (16.7%)       0 (0.0%)      8 (11.6%)
            3                       27 (56.2%)      3 (14.3%)     30 (43.5%)
            4                        9 (18.8%)      9 (42.9%)     18 (26.1%)
            5                         2 (4.2%)      9 (42.9%)     11 (15.9%)
          odd
            0                       26 (50.0%)     11 (50.0%)     37 (50.0%)  1.000
            1                       26 (50.0%)     11 (50.0%)     37 (50.0%)
          mod3
            0                       17 (32.7%)      7 (31.8%)     24 (32.4%)  0.831
            1                       18 (34.6%)      7 (31.8%)     25 (33.8%)
            2                       17 (32.7%)      8 (36.4%)     25 (33.8%)
          -------------------------------------------------------------------------



          • #6
            "... consider this a plea for updating dtable sooner than later to natively report Kendall's p-value (which is truly the expected statistic here, I worry that many people will miss this nuance and falsely report the Kendall tau value in their papers instead of the p-value)."
            We hear you and will do our best.



            • #7
              Thanks again Jeff, your code works beautifully! One remaining issue if you would be so kind... My custom dtable ado automates the tedious task of parsing a large list of inputted vars into cvars and fvars and it further parses fvars to isolate the binary vars (to automate the task of typing "1." before binary vars so that they are displayed on only 1 line in the table). That said, your code above fails to display the p-value for 1.binary vars such as "female" - see below:

              Code:
              ----------------------------------------------------------------
                                         (firstnm) adm_prov_cat               
                              0           1           2         Total     Test
                           N=2,113       N=23       N=149      N=2,285        
              ----------------------------------------------------------------
              age        74.6 ± 13.7 74.5 ± 14.5 76.3 ± 12.3 74.7 ± 13.7 0.343
              urg_triage                                                      
                1            67 (3%)      0 (0%)      0 (0%)     67 (3%) 0.356
                2        1,023 (49%)      0 (0%)     1 (33%) 1,024 (49%)      
                3          883 (42%)    1 (100%)     2 (67%)   886 (42%)      
                4            78 (4%)      0 (0%)      0 (0%)     78 (4%)      
                5            35 (2%)      0 (0%)      0 (0%)     35 (2%)      
              female     1,059 (50%)    10 (43%)    82 (55%) 1,151 (50%)      
              ----------------------------------------------------------------
              More generally, I hoped that the automated parsing of vars could be facilitated by (a) some empiric rules - which I have coded, and (b) some user defined prefixes such as "c." before continuous vars, "i." before non-binary factor vars, and "i1." before binary factor vars - which does not work because "factor-variables and time-series operators not allowed". Here is my ado code:

              Code:
              capture program drop mydtable
              program define mydtable, rclass
                  version 17.0
                  
                  /***************************************************************************
                  1) Parse syntax
                     - varlist: the row variables
                     - by(varname): grouping variable
                     - np/trend/missing/nototals: your existing flags
                     - pdf/docx/xlsx/csv: new flags for export
                  ***************************************************************************/
                  syntax varlist(min=1) [ , ///
                      BY(varname) NP TREND MISSING NOTOTALS ///
                      PDF DOCX XLSX CSV ]
              
                  /***************************************************************************
                  2) Defaults
                  ***************************************************************************/
                  local cstats "msd"
                  local ctest "regress"
                  local ftest "pearson"
                  local byopts "tests"
              
                  if "`np'" != "" {
                      local cstats "q2 iqi"
                      local ctest "kwallis"
                  }
                  if "`trend'" != "" {
                      local ftest "kendall"
                  }
                  if "`missing'" != "" {
                      local byopts "`byopts' missing"
                  }
                  if "`nototals'" != "" {
                      local byopts "`byopts' nototals"
                  }
                  
                  local rowvars "`varlist'"
                  local colvar  "`by'"
              
                  /***************************************************************************
                  3) Check how many export flags are set
                  ***************************************************************************/
                  local n_export = ("`pdf'"  != "") + ("`docx'" != "") + ("`xlsx'" != "") + ("`csv'"  != "")
                  if `n_export' > 1 {
                      di as err "You may specify only one of pdf/docx/xlsx/csv."
                      exit 198
                  }
              
                  local format ""
                  if "`pdf'"  !="" local format "pdf"
                  if "`docx'" !="" local format "docx"
                  if "`xlsx'" !="" local format "xlsx"
                  if "`csv'"  !="" local format "csv"
              
                  /***************************************************************************
                  4) Classification macros
                  ***************************************************************************/
                  local cvars ""
                  local fvars ""
                  local bvars ""
                  local b1vars ""
              
                  foreach v of local rowvars {
                      local prefix   = substr("`v'",1,2)
                      local stripped = substr("`v'",3,.)
              
                      if inlist("`prefix'","c.","i.","i1.") {
                          if "`prefix'"=="c." {
                              local cvars "`cvars' `stripped'"
                          }
                          else if "`prefix'"=="i." {
                              local fvars "`fvars' `stripped'"
                          }
                          else if "`prefix'"=="i1." {
                              local bvars "`bvars' `stripped'"
                              local b1vars "`b1vars' 1.`stripped'"
                          }
                      }
                      else {
                          quietly levelsof `v', local(levels)
                          local nlevels = wordcount("`levels'")
              
                          if `nlevels' == 2 {
                              local bvars "`bvars' `v'"
                              local b1vars "`b1vars' 1.`v'"
                          }
                          else if `nlevels' >= 3 & `nlevels' <= 9 {
                              local fvars "`fvars' `v'"
                          }
                          else {
                              local cvars "`cvars' `v'"
                          }
                      }
                  }
              
                  /***************************************************************************
                  5) Debug display (optional)
                  ***************************************************************************/
                  di "Grouping var: `colvar'"
                  di "Continuous  : `cvars'"
                  di "Categorical : `fvars'"
                  di "Binary      : `bvars'"
              
                  /***************************************************************************
                  6) dtable command
                  ***************************************************************************/
              
                  dtable, by(`colvar', `byopts') ///
                      continuous(`cvars', statistics(`cstats') test(`ctest')) ///
                      factor(`fvars' `b1vars', statistics(fvfrequency fvpercent) test(`ftest')) ///
                      sample(N, statistics(frequency) place(seplabels)) ///
                      define(msd = mean sd, delimiter(" ± ")) ///
                      define(iqi = q1 q3, delimiter("-")) ///
                      sformat("N=%s" frequency) ///
                      sformat("%s" mean sd msd) ///
                      sformat("(%s)" iqi) ///
                      nformat(%2.0f fvpercent) ///
                      nformat(%9.1fc msd q2 iqi) ///
                      halign(right)
              
                  // Example: hide raw binary columns
                  collect style header `bvars', level(hide)
              
                  // Show preview
                  // collect preview
              
              collect layout
              collect levels `colvar'
              collect levels _dtable_sample_dim
              collect query autolevels result
              local auto_results = s(levels)
              collect query composite _dtable_test
              foreach facvar of local fvars {
                  collect levels `facvar'
                  local levels `"`s(levels)'"'
                  gettoken first : levels
                  ktau `colvar' `facvar', stats(taub)
                  collect get p_taub=(r(p)), tags(`colvar'[_dtable_test] var[`first'.`facvar'] _dtable_sample_dim[_hide])
              }
              collect style autolevels result `auto_results', clear
              collect style cell result[p_taub], nformat(%6.3f) minimum(.001)
              collect composite define _dtable_test = p_taub regress, trim replace
              collect layout
              
                  /***************************************************************************
                  7) If user requested pdf/docx/xlsx/csv, do collect export
                  ***************************************************************************/
                  if `"`format'"' != "" {
                      collect export "test.`format'", replace
                  }
              end



              • #8
                Interesting to learn that there is some risk of confusing rank correlations with P-values.

                Let's hope that negative correlations particularly don't get reported that way.

                I suppose that in fields where these measures are popular there is a real risk that a correlation of 0.01 will be taken as a strong result.

                (Rhetorical question: Don't people look at graphs too?)



                • #9
                  Exactly. The confusion arises because regress and the other tests in dtable report a p-value in the test column, whereas kendall reports the tau. If the dtable contains a combination of cvars and fvars, the test column will contain a mix of p-values and rank correlations. Novice researchers may not pick up on this or question it, especially if the tau is positive and "looks like" a plausible p-value.



                  • #10
                    Your program loops over the list of variables in fvars but not the ones in bvars.

                    In your loop over rowvars, I see your references to factor-variable operators c. and i., but your syntax command is missing the fv modifier in your varlist() specification. When you add fv, the c. operator will not be preserved in the parsed variables list, and you will need to check for interactions. To simplify the loop over rowvars, I suggest you use fvexpand to create the rowvars list, then loop over the elements to look for the dot operator and handle the base b. operator.

                    fvexpand will place the factor levels in numeric order, so a 0/1 variable f will be expanded from i.f to 0b.f 1.f, thus you can loop over the individual factor level variables in rowvars and let the 1. operator move f from fvars to bvars (and b1vars) and any other level value move it back to fvars.
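
                    To make the expansion concrete, here is a small sketch of what fvexpand returns and how each element can be split at the dot. It uses the auto data purely for illustration, and the local names expanded, op, and name are placeholders rather than anything your program needs to adopt.
                    Code:
                    sysuse auto, clear
                    
                    * expand a mixed list; the 0/1 variable becomes 0b.foreign 1.foreign
                    fvexpand mpg i.foreign
                    local expanded "`r(varlist)'"
                    display "`expanded'"
                    
                    foreach v of local expanded {
                        local dot = strpos("`v'", ".")
                        if `dot' {
                            * text before the dot is the operator, for example 0b or 1
                            local op   = substr("`v'", 1, `dot'-1)
                            local name = substr("`v'", `dot'+1, .)
                            display "operator `op' on variable `name'"
                        }
                        else {
                            display "`v' has no factor-variable operator"
                        }
                    }
                    A variable without an operator (mpg here) passes through unchanged, so it still needs the levelsof-style check to decide whether it is continuous, binary, or categorical.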

                    Here is a list of the changes I would make to your mydtable program:
                    1. change your program's version to 18.0, when dtable was introduced
                    2. add fv modifier to varlist() in the syntax command
                    3. make option by() required, your code assumes it
                    4. check for interactions, exit with error if found
                    5. use fvexpand to construct the rowvars list
                    6. use the level (op) of operated variables in rowvars to assign them to lists fvars and bvars/b1vars
                    7. use levelsof only for non-negative integer-valued variables; only put 0/1 valued variables in the bvars/b1vars lists
                    8. require option trend for the p_taub loop over the fvars
                    9. add the p_taub loop for the bvars
                    10. fix composite result _dtable_test definition to keep the continuous test p-value result

                    In the following I highlight my changes in blue and usually with an edit: comment.
                    Code:
                    program define mydtable, rclass
                        version 18.0    // edit: dtable introduced in Stata 18
                    
                        /***************************************************************************
                        1) Parse syntax
                           - varlist: the row variables
                           - by(varname): grouping variable
                           - np/trend/missing/nototals: your existing flags
                           - pdf/docx/xlsx/csv: new flags for export
                        ***************************************************************************/
                        syntax varlist(min=1 fv) , /// edit: allow factor variables notation
                            BY(varname) [ NP TREND MISSING NOTOTALS /// edit: by() is required
                            PDF DOCX XLSX CSV ]
                    
                        * edit: disallow interactions
                        local sharp : subinstr local varlist "#" "?", all count(local k_sharp)
                        if `k_sharp' {
                            di as err "interactions not allowed"
                            exit 198
                        }
                    
                        /***************************************************************************
                        2) Defaults
                        ***************************************************************************/
                        local cstats "msd"
                        local ctest "regress"
                        local ftest "pearson"
                        local byopts "tests"
                    
                        if "`np'" != "" {
                            local cstats "q2 iqi"
                            local ctest "kwallis"
                        }
                        if "`trend'" != "" {
                            local ftest "kendall"
                        }
                        if "`missing'" != "" {
                            local byopts "`byopts' missing"
                        }
                        if "`nototals'" != "" {
                            local byopts "`byopts' nototals"
                        }
                    
                        * edit: expand factor variables to their factor-level form
                        fvexpand `varlist'
                        local rowvars "`r(varlist)'"
                        local colvar  "`by'"
                    
                        /***************************************************************************
                        3) Check how many export flags are set
                        ***************************************************************************/
                        local n_export = ("`pdf'"  != "") + ("`docx'" != "") + ("`xlsx'" != "") + ("`csv'"  != "")
                        if `n_export' > 1 {
                            di as err "You may specify only one of pdf/docx/xlsx/csv."
                            exit 198
                        }
                    
                        local format ""
                        if "`pdf'"  !="" local format "pdf"
                        if "`docx'" !="" local format "docx"
                        if "`xlsx'" !="" local format "xlsx"
                        if "`csv'"  !="" local format "csv"
                    
                        /***************************************************************************
                        4) Classification macros
                        ***************************************************************************/
                        local cvars ""
                        local fvars ""
                        local bvars ""
                        local b1vars ""
                    
                        foreach v of local rowvars {
                            * edit: handle FV operator
                            local dot = strpos("`v'", ".")
                            if `dot' {
                                local op = substr("`v'", 1, `dot'-1)
                                if strmatch("`op'","*b") {
                                    // ignore base operator
                                    local op = substr("`v'", 1, `dot'-2)
                                }
                                local name = substr("`v'", `dot'+1, .)
                                if `op' == 1 {
                                    local bvars : list bvars | name
                                    local b1vars "`b1vars' 1.`name'"
                                    local fvars : list fvars - name
                                }
                                else {
                                    local fvars : list fvars | name
                                    local bvars : list bvars - name
                                    local 1name 1.`name'
                                    local b1vars : list b1vars - 1name
                                }
                            }
                            else {
                                * edit: check for non-negative integer value variables
                                capture assert `v' == floor(`v') & `v' >= 0
                                if c(rc) {
                                    local cvars "`cvars' `v'"
                                }
                                else {
                                    quietly levelsof `v', local(levels)
                                    local nlevels = wordcount("`levels'")
                        
                                    * edit: identify 0/1 variables
                                    if `"`levels'"' == "0 1" {
                                        local bvars "`bvars' `v'"
                                        local b1vars "`b1vars' 1.`v'"
                                    }
                                    else if `nlevels' <= 9 {
                                        local fvars "`fvars' `v'"
                                    }
                                    else {
                                        local cvars "`cvars' `v'"
                                    }
                                }
                            }
                        }
                    
                        /***************************************************************************
                        5) Debug display (optional)
                        ***************************************************************************/
                        di "Grouping var: `colvar'"
                        di "Continuous  : `cvars'"
                        di "Categorical : `fvars'"
                        di "Binary      : `bvars'"
                    
                        /***************************************************************************
                        6) dtable command
                        ***************************************************************************/
                    
                        dtable, by(`colvar', `byopts') ///
                            continuous(`cvars', statistics(`cstats') test(`ctest')) ///
                            factor(`fvars' `b1vars', statistics(fvfrequency fvpercent) test(`ftest')) ///
                            sample(N, statistics(frequency) place(seplabels)) ///
                            define(msd = mean sd, delimiter(" ± ")) ///
                            define(iqi = q1 q3, delimiter("-")) ///
                            sformat("N=%s" frequency) ///
                            sformat("%s" mean sd msd) ///
                            sformat("(%s)" iqi) ///
                            nformat(%2.0f fvpercent) ///
                            nformat(%9.1fc msd q2 iqi) ///
                            halign(right)
                    
                        // Example: hide raw binary columns
                        collect style header `bvars', level(hide)
                    
                        // Show preview
                        // collect preview
                    
                        * edit: replace kendall with p_taub
                        if "`trend'" != "" {
                            *collect layout
                            *collect levels `colvar'
                            *collect levels _dtable_sample_dim
                            quietly collect query autolevels result
                            local auto_results = s(levels)
                            *collect query composite _dtable_test
                            foreach facvar of local fvars {
                                quietly collect levels `facvar'
                                local levels `"`s(levels)'"'
                                gettoken first : levels
                                quietly ktau `colvar' `facvar', stats(taub)
                                collect get p_taub=(r(p)), ///
                                    tags(`colvar'[_dtable_test] var[`first'.`facvar'] _dtable_sample_dim[_hide])
                            }
                            * edit: add loop for 0/1 variables
                            foreach binvar of local bvars {
                                quietly ktau `colvar' `binvar', stats(taub)
                                collect get p_taub=(r(p)), ///
                                    tags(`colvar'[_dtable_test] var[1.`binvar'] _dtable_sample_dim[_hide])
                            }
                            collect style autolevels result `auto_results', clear
                            collect style cell result[p_taub], nformat(%6.3f) minimum(.001)
                            * edit: keep continuous test result
                            collect composite define _dtable_test = p_taub `ctest', trim replace
                        }
                        collect layout
                    
                        /***************************************************************************
                        7) If user requested pdf/docx/xlsx/csv, do collect export
                        ***************************************************************************/
                        if `"`format'"' != "" {
                            collect export "test.`format'", replace
                        }
                    end
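The classification rules in section 4 can be tried by hand on a single variable. The following is only a minimal sketch, assuming the auto dataset used in the examples; it repeats the program's integer check and levelsof call outside of mydtable and is not part of the ado-file.

```stata
sysuse auto, clear

* headroom: the non-negative integer check should fail (nonzero return code),
* which is what sends a variable to the continuous list
capture assert headroom == floor(headroom) & headroom >= 0
display "headroom: rc = " c(rc)

* rep78: the check should pass, so the number of levels decides the list
capture assert rep78 == floor(rep78) & rep78 >= 0
display "rep78: rc = " c(rc)
quietly levelsof rep78, local(levels)
display "rep78 levels: `levels' (count = " wordcount("`levels'") ")"
```

A variable that passes the check and has exactly the levels 0 1 goes to the binary list; one with at most 9 levels goes to the factor list; anything else is treated as continuous.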
                    Here is the command in action without factor variable operators. In the specified variable list:
                    1. the 0/1 variables are expensive and odd
                    2. headroom has 8 unique values, but is not integer valued
                    3. rep78 and mod2 meet the factor variable requirements
                    4. all other variables are recognized as continuous
                    Code:
                    sysuse auto
                    gen odd = mod(_n,2)
                    gen mod2 = mod(_n,3)
                    gen expensive = price > 12000
                    
                    mydtable expensive mpg rep78 headroom trunk turn gear odd mod2, by(for) trend
                    Here is the resulting table.
                    Code:
                    -------------------------------------------------------------
                                                         Car origin              
                                           Domestic    Foreign     Total    Test 
                                             N=52       N=22       N=74          
                    -------------------------------------------------------------
                    Mileage (mpg)         19.8 ± 4.7 24.8 ± 6.6 21.3 ± 5.8 <0.001
                    Headroom (in.)         3.2 ± 0.9  2.6 ± 0.5  3.0 ± 0.8  0.011
                    Trunk space (cu. ft.) 14.8 ± 4.3 11.4 ± 3.2 13.8 ± 4.3  0.002
                    Turn circle (ft.)     41.4 ± 4.0 35.4 ± 1.5 39.6 ± 4.4 <0.001
                    Gear ratio             2.8 ± 0.3  3.5 ± 0.3  3.0 ± 0.5 <0.001
                    Repair record 1978                                           
                      1                       2 (4%)     0 (0%)     2 (3%) <0.001
                      2                      8 (17%)     0 (0%)    8 (12%)       
                      3                     27 (56%)    3 (14%)   30 (43%)       
                      4                      9 (19%)    9 (43%)   18 (26%)       
                      5                       2 (4%)    9 (43%)   11 (16%)       
                    mod2                                                         
                      0                     17 (33%)    7 (32%)   24 (32%)  0.831
                      1                     18 (35%)    7 (32%)   25 (34%)       
                      2                     17 (33%)    8 (36%)   25 (34%)       
                    expensive                 4 (8%)     1 (5%)     5 (7%)  0.634
                    odd                     26 (50%)   11 (50%)   37 (50%)  1.000
                    -------------------------------------------------------------
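An entry in the Test column for a factor variable can be cross-checked directly with ktau, since the program takes r(p) from the same call. A minimal sketch, assuming the auto dataset as above:

```stata
sysuse auto, clear

* same call the program makes for each factor variable
ktau foreign rep78, stats(taub)

* p-value that the program tags as p_taub
display %6.3f r(p)
```

Because the program sets minimum(.001) on the p_taub cells, any value below that threshold appears as <0.001 in the table rather than as the number displayed here.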
                    Here is an example with factor variable operators. In the specified variable list:
                    1. using 1. on expensive saves the program from running levelsof on this variable
                    2. using i(1 4). on rep78 puts it in the fvars list because of the 4; otherwise the specified levels are ignored
                    3. using i. on odd expands to 0b.odd 1.odd, so odd is put in the bvars/b1vars lists
                    4. using 1. on mod2 forces it to be treated as if it were a 0/1 variable, so it is put in the bvars/b1vars lists; however, all of its values are used to compute the p_taub p-value
                    Code:
                    mydtable 1.expensive mpg i(1 4).rep78 headroom trunk turn gear i.odd 1.mod2, by(for) trend
                    Here is the resulting table.
                    Code:
                    -------------------------------------------------------------
                                                         Car origin
                                           Domestic    Foreign     Total    Test
                                             N=52       N=22       N=74
                    -------------------------------------------------------------
                    Mileage (mpg)         19.8 ± 4.7 24.8 ± 6.6 21.3 ± 5.8 <0.001
                    Headroom (in.)         3.2 ± 0.9  2.6 ± 0.5  3.0 ± 0.8  0.011
                    Trunk space (cu. ft.) 14.8 ± 4.3 11.4 ± 3.2 13.8 ± 4.3  0.002
                    Turn circle (ft.)     41.4 ± 4.0 35.4 ± 1.5 39.6 ± 4.4 <0.001
                    Gear ratio             2.8 ± 0.3  3.5 ± 0.3  3.0 ± 0.5 <0.001
                    Repair record 1978
                      1                       2 (4%)     0 (0%)     2 (3%) <0.001
                      2                      8 (17%)     0 (0%)    8 (12%)
                      3                     27 (56%)    3 (14%)   30 (43%)
                      4                      9 (19%)    9 (43%)   18 (26%)
                      5                       2 (4%)    9 (43%)   11 (16%)
                    expensive                 4 (8%)     1 (5%)     5 (7%)  0.634
                    odd                     26 (50%)   11 (50%)   37 (50%)  1.000
                    mod2                    18 (35%)    7 (32%)   25 (34%)  0.831
                    -------------------------------------------------------------
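To see how a factor-variable term is split into operator and variable name before it reaches the classification loop, the expansion can be inspected interactively. This is only a sketch: it assumes the program expands the terms in a way similar to fvexpand, and the local name terms is made up for illustration.

```stata
sysuse auto, clear
gen odd = mod(_n,2)

* expand the terms into individual levels (assumed to mirror the program)
fvexpand i.odd i(1 4).rep78
local terms `r(varlist)'
display "`terms'"

* split each term at the dot, as the classification loop does
foreach v of local terms {
    local dot  = strpos("`v'", ".")
    local op   = substr("`v'", 1, `dot'-1)
    local name = substr("`v'", `dot'+1, .)
    display "operator: `op'   variable: `name'"
}
```

The loop in the program additionally strips a trailing b from the operator before comparing it with 1, which is why a base level such as 0b.odd is handled like any other level.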



                    • #11
                      WOW! It works like a charm, and more so than the good feeling of working code, I am grateful for your generous help and super expertise! Hopefully this ado will be useful to others...
