  • #16

    I'm going to split my response across multiple posts. This post is for
    the first request. The next post is for the second request. Then I'll
    post the entire program with an example showing off both new features.

    ---

    You can add the zeros with something like
    Code:
        collect get fvfrequency=0 fvpercent=0, tags(...)
    It is just a matter of specifying option tags(). To add non-integer
    numeric categories, you must use the bracket notation.
    For example,
    Code:
        collect get fvfrequency=0 fvpercent=0, tags(cat2[3.5] female[0])
    First we change the syntax specification for option
    categorical() from
    Code:
            CATegorical(varlist)    ///
    to
    Code:
            CATegorical(string)    ///
    and call a custom subroutine that parses the contents of the option into the
    macro categorical containing the categorical variables and a new
    macro that specifies where to add the zero results among the categorical
    variables.
    Code:
        ParseCategorical `categorical'
        local categorical `"`s(varlist)'"'
        local FVzeros `"`s(zeros)'"'
    Here is how I coded this subroutine.
    Code:
    program ParseCategorical, sclass
        syntax [varlist(default=none)] [, ZEROs(string)]
    
        if `:list sizeof varlist' == 0 {
            if `:list sizeof zeros' {
                di as err ///
                "option {bf:zeros()} requires categorical variables"
                exit 198
            }
        }
    
        gettoken spec zeros : zeros , parse(" []")
        while `:length local spec' {
            capture noisily unab names : `spec'
            if c(rc) {
                di as err "in option {bf:zeros()}"
                exit c(rc)
            }
            foreach name of local names {
                if `:list posof "`name'" in varlist' == 0 {
                    di as err "invalid {bf:zeros()} option"
                    di as err ///
            "variable {bf:`name'} not found in list of categorical variables"
                    exit 198
                }
            }
            gettoken open zeros : zeros , parse(" []")
            if `"`open'"' == "" {
                di as err "invalid {bf:zeros()} option"
                di as err `"nothing found where {bf:[} expected"'
                exit 198
            }
            if `"`open'"' != "[" {
                di as err "invalid {bf:zeros()} option"
                di as err `"{bf:`open'} found where {bf:[} expected"'
                exit 198
            }
            gettoken tok zeros : zeros , parse(" []")
            while !inlist(`"`tok'"', "", "]") {
                capture noisily confirm number `tok'
                if c(rc) {
                    di as err "in option {bf:zeros()}"
                    exit c(rc)
                }
                foreach name of local names {
                    local ZEROS `ZEROS' `name'[`tok']
                }
                gettoken tok zeros : zeros , parse(" []")
            }
            if `"`tok'"' != "]" {
                di as err "invalid {bf:zeros()} option"
                di as err `"closing square bracket '{bf:]}' not found"'
                exit 198
            }
            gettoken spec zeros : zeros , parse(" []")
        }
        sreturn local varlist `"`varlist'"'
        sreturn local zeros `"`ZEROS'"'
    end
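    As a rough illustration of what this subroutine returns, you could call
    it directly and list the s() results. This is only a sketch: it assumes
    ParseCategorical has been defined in the current session (for instance
    by running its definition from a do-file) and that variables cat1, cat2,
    and cat3 exist in memory, as in the example data.

    ```stata
    // hypothetical direct call; inside mw_table the argument is the
    // contents of option categorical()
    ParseCategorical cat1 cat2 cat3, zeros(cat1[2.5] cat3[9 16])
    sreturn list
    ```

    Following the parsing loop above, s(varlist) should hold the three
    variable names, and s(zeros) should hold one element per variable and
    level, here cat1[2.5] cat3[9] cat3[16].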
    After calling this parsing code and calling table we can add the
    specified zero results. Here is a code snippet for how I did this.
    Code:
        quietly collect levelsof `by'
        local by_levels = s(levels)
        foreach l of local by_levels {
            foreach z of local FVzeros {
                collect get fvfrequency=0 fvpercent=0, ///
                    tags(`z' `by'[`l'])
            }
        }
    I noticed a flaw in the autolevels logic for categorical variables. The
    call to
    Code:
             collect style autolevels `x' _hide `s(levels)', clear
    could yield odd level orders when the levels do not all have the same
    number of digits. To fix this I added a short Mata function that sorts
    the levels returned by collect levelsof numerically. Here is the
    definition of the Mata function I added at the end of the ado-file
    Code:
    mata:
    
    void mw_table_sort_cat_levels()
    {
        string  vector    levels
        real    vector    order
    
        // drop _hide before sorting so the sort indices match the levels
        levels = tokens(st_global("s(levels)"))
        levels = select(levels, levels :!= "_hide")
        order = order(strtoreal(levels)', 1)
        levels = levels[order]
        st_global("s(levels)", invtokens(levels))
    }
    
    end
    and here is how I modified the loop over the categorical variables
    Code:
        foreach x of local categorical {
            quietly tabulate `x' `by', chi2
            collect get nobs=(r(N)) p=(r(p)), tag(`x'[_hide])
            collect style header `x', title(label)
            quietly collect levelsof `x'
            mata: mw_table_sort_cat_levels()
            collect style autolevels `x' _hide `s(levels)', clear
        }
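    To see the kind of ordering problem this guards against, here is a small
    Mata sketch. It assumes the levels end up compared as strings somewhere
    along the way, which is an assumption about the cause of the symptom;
    the level values are made up for illustration.

    ```stata
    mata:
    // hypothetical levels with differing numbers of digits
    levels = ("1", "10", "2", "9")

    // string sort: "10" lands before "2"
    sort(levels', 1)'

    // numeric ordering via strtoreal(), as in mw_table_sort_cat_levels()
    levels[order(strtoreal(levels)', 1)]
    end
    ```

    The first expression displays the levels in string order and the second
    in numeric order, which is the order we want for the row headers.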

    Comment


    • #17

      Adding a second row for continuous results is somewhat trickier, but it
      is possible with composite results and some extra tags.

      As with the zeros, we change the syntax specification for option
      continuous() from
      Code:
              CONTinuous(varlist)    ///
      to
      Code:
              CONTinuous(string)    ///
      and call a custom subroutine that parses the contents of the option into
      the macro continuous containing the continuous variables and a
      new macro that indicates a second row of range statistics is requested.
      Code:
          ParseContinuous `continuous'
          local continuous `"`s(varlist)'"'
          local CVranges `"`s(ranges)'"'
      Here is how I coded this subroutine.
      Code:
      program ParseContinuous, sclass
          syntax [varlist(default=none)] [, RANGEs]
      
          if `:list sizeof varlist' == 0 {
              if `:list sizeof ranges' {
                  di as err ///
                  "option {bf:ranges} requires continuous variables"
                  exit 198
              }
          }
      
          sreturn local varlist `"`varlist'"'
          sreturn local ranges `"`ranges'"'
      end
      After calling this parsing code we need to add the range statistics in the
      call to table, provided the ranges were requested. Here is how I
      did this.
      Code:
          if `:list sizeof continuous' {
              local CVopts    statistic(mean `continuous') ///
                      statistic(sd `continuous')
              if `:list sizeof CVranges' {
                  local CVopts `CVopts'    ///
                      statistic(min `continuous') ///
                      statistic(max `continuous') ///
                      statistic(iqr `continuous')
              }
          }
      Adding a separate row of results for the continuous variables posed some
      interesting challenges. I decided to add a new dimension named
      __CONT, which holds the row header information and tags
      the continuous variable results so that they can show up in separate
      rows while still being specified as columns in the layout. I did this
      by replacing
      Code:
          collect composite define col1 = mean fvfrequency
          collect composite define col2 = sd fvpercent
      with
      Code:
          if `:length local CVranges' {
              collect composite define rangei = min max, trim
              collect style cell result[rangei], sformat("[%s]")
              collect style cell result[iqr], sformat("(%s)")
              local i 0
              foreach v of local continuous {
                  local ++i
                  quietly collect addtags __CONT[v`i'name], ///
                      fortags(var[`v']#result[mean sd p nobs])
                  quietly collect addtags __CONT[v`i'ranges], ///
                      fortags(var[`v']#result[min max iqr])
      
                  local lab : variable label `v'
                  if `"`lab'"' == "" {
                      local lab `v'
                  }
      
                  collect label levels __CONT ///
                      v`i'name `"`lab'"' ///
                      v`i'ranges "Range (IQR)"
              }
              collect style header __CONT, title(hide)
              local contspec __CONT
              local col1extra rangei
              local col2extra iqr
          }
          collect composite define col1 = mean fvfrequency `col1extra'
          collect composite define col2 = sd fvpercent `col2extra'
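      For readers who want to try the composite idea in isolation, here is a
      minimal sketch outside of mw_table. It assumes Stata 17 or newer and
      uses the auto dataset shipped with Stata; the variable choices (mpg,
      foreign) are only for illustration and are not part of the program
      above.

      ```stata
      sysuse auto, clear

      // collect the minimum and maximum of mpg for each level of foreign
      quietly table () (foreign result), ///
          statistic(min mpg) statistic(max mpg) nformat(%6.1f)

      // combine min and max into one composite result, then wrap the
      // combined string in square brackets
      collect composite define rangei = min max, trim
      collect style cell result[rangei], sformat("[%s]")

      // show the composite as a single column per level of foreign
      collect layout (var) (foreign#result[rangei])
      ```

      Because the composite is built from results that table already
      formatted, the numeric formats only need to be specified once.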



      • #18

        Here is the fully modified program. You may notice I also changed how I
        collect results from anova. This new version collects only the
        results of interest.
        Code:
        *! version 1.0.1  23jul2023
        program mw_table
            version 17
        
            syntax ,            ///
                by(string asis)        ///
            [                ///
                BINary(varlist)        ///
                CATegorical(string)    ///
                CONTinuous(string)    ///
                GROUPed(string asis)    ///
                *            ///
            ]
        
            capture noisily ParseByOption `by'
            if c(rc) {
                di as err "in option {bf:by()}"
                exit c(rc)
            }
            local by = s(by)
            local byfirst "`s(first)'"
            local bylabel "`s(label)'"
        
            // Do not allow variables to be specified in more than one
            // option.
        
            ParseCategorical `categorical'
            local categorical `"`s(varlist)'"'
            local FVzeros `"`s(zeros)'"'
        
            ParseContinuous `continuous'
            local continuous `"`s(varlist)'"'
            local CVranges `"`s(ranges)'"'
        
            local duplist binary categorical continuous
            local k_duplist : list sizeof duplist
        
            forval i = 1/`k_duplist' {
                local opt1 : word `i' of `duplist'
                forval j = `=`i'+1'/`k_duplist' {
                    local opt2 : word `j' of `duplist'
                    local both : list `opt1' & `opt2'
                    CheckDupVars "`both'" `opt1'() `opt2'()
                }
            }
        
            // Parse -grouped()- options.
            // Check that grouped variables are not specified in the other
            // options.
            // Check that grouped names are not specified in
            // the other options.
            // Remaining options go to -table-.
        
            local gid 0
            while `:length local grouped' {
                local ++gid
                capture noisily ParseGroupOption `grouped'
                if c(rc) {
                    di as err "in option {bf:grouped()}"
                    exit c(rc)
                }
                local group`gid'vars = s(varlist)
                local group`gid'name = s(name)
                local group`gid'label = s(label)
                forval i = 1/`k_duplist' {
                    local opt1 : word `i' of `duplist'
                    local both : list `opt1' & group`gid'vars
                    CheckDupVars "`both'" `opt1'() group()
                }
                local both : list allgroupvars & group`gid'vars
                CheckDupVars "`both'" group() group()
                CheckDupNames `group`gid'name' `allgroupnames'
                local allgroupvars `allgroupvars' `group`gid'vars'
                local allgroupnames `allgroupnames' `group`gid'name'
                local 0 `", `options'"'
                syntax [, GROUPed(string asis) * ]
            }
            local k_grouped = `gid'
        
            // Check that grouped names are not also being used as
            // variables.
        
            local both : list allgroupnames & binary
            CheckNameVarConflict "`both'" binary()
            local both : list allgroupnames & categorical
            CheckNameVarConflict "`both'" categorical()
            local both : list allgroupnames & continuous
            CheckNameVarConflict "`both'" continuous()
            local both : list allgroupnames & allgroupvars
            CheckNameVarConflict "`both'" group()
        
            // Build the call to -table-.
        
            if `:list sizeof continuous' {
                local CVopts    statistic(mean `continuous') ///
                        statistic(sd `continuous')
                if `:list sizeof CVranges' {
                    local CVopts `CVopts'    ///
                        statistic(min `continuous') ///
                        statistic(max `continuous') ///
                        statistic(iqr `continuous')
                }
            }
            local fvlist `binary' `categorical' `allgroupvars'
            if `:list sizeof fvlist' {
                local FVopts    statistic(fvfrequency `fvlist') ///
                        statistic(fvpercent `fvlist')
            }
        
            quietly table () (`by' result), `CVopts' `FVopts' `options'
        
            quietly collect levelsof `by'
            local by_levels = s(levels)
            foreach l of local by_levels {
                foreach z of local FVzeros {
                    collect get fvfrequency=0 fvpercent=0, ///
                        tags(`z' `by'[`l'])
                }
            }
        
            if `:list sizeof continuous' {
                // -anova- needs a numerical by variable.
                local bytype : type `by'
                if substr("`bytype'",1,3) == "str" {
                    tempvar numby
                    encode `by', generate(`numby')
                }
                else {
                    local numby `by'
                }
        
                // continuous variables layout specification
                local contspec var
                collect style autolevels var `continuous', clear
            }
        
            foreach x of local continuous {
                quietly anova `x' `numby'
                collect get nobs=(e(N)) p=Ftail(e(df_m),e(df_r),e(F)) ///
                    , tag(var[`x'])
            }
        
            foreach x of local categorical {
                quietly tabulate `x' `by', chi2
                collect get nobs=(r(N)) p=(r(p)), tag(`x'[_hide])
                collect style header `x', title(label)
                quietly collect levelsof `x'
                mata: mw_table_sort_cat_levels()
                collect style autolevels `x' _hide `s(levels)', clear
            }
        
            foreach x of local binary {
                quietly tabulate `x' `by', chi2
                collect get nobs=(r(N)) p=(r(p)), tag(var[1.`x'])
                collect style header `x'[1], title(label) level(hide)
                collect style autolevels `x' 1, clear
            }
        
            foreach x of local allgroupvars {
                quietly tabulate `x' `by', chi2
                collect get nobs=(r(N)) p=(r(p)), tag(var[1.`x'])
                collect style autolevels `x' 1, clear
            }
        
            forval i = 1/`k_grouped' {
                local vars : copy local group`i'vars
                local name : copy local group`i'name
                local label : copy local group`i'label
                local j 0
                foreach x of local vars {
                    local ++j
                    local lab : variable label `x'
                    if `"`lab'"' == "" {
                        local lab `x'
                    }
                    quietly collect remap `x'[1] = `name'[`j']
                    collect label levels `name' `j' `"`lab'"', modify
                }
                collect label dim `name' `"`label'"', modify
            }
        
            // Define some composites for the columns. This allows us to
            // let -table- handle result formats.
        
            if `:length local CVranges' {
                collect composite define rangei = min max, trim
                collect style cell result[rangei], sformat("[%s]")
                collect style cell result[iqr], sformat("(%s)")
                local i 0
                foreach v of local continuous {
                    local ++i
                    quietly collect addtags __CONT[v`i'name], ///
                        fortags(var[`v']#result[mean sd p nobs])
                    quietly collect addtags __CONT[v`i'ranges], ///
                        fortags(var[`v']#result[min max iqr])
        
                    local lab : variable label `v'
                    if `"`lab'"' == "" {
                        local lab `v'
                    }
        
                    collect label levels __CONT ///
                        v`i'name `"`lab'"' ///
                        v`i'ranges "Range (IQR)"
                }
                collect style header __CONT, title(hide)
                local contspec __CONT
                local col1extra rangei
                local col2extra iqr
            }
            collect composite define col1 = mean fvfrequency `col1extra'
            collect composite define col2 = sd fvpercent `col2extra'
        
            // p-value styles
        
            collect style cell result[p], nformat(%6.2f)
            collect label levels result p "p-value", modify
        
            // sample size styles
        
            collect style cell result[nobs], nformat(%18.0fc)
            collect label levels result nobs "N", modify
        
            // header styles
        
            collect style header `by', title(hide)
            collect style header result[col1 col2], level(hide)
            collect style row stack, nobinder spacer
        
            // border style
        
            collect style cell border_block, border(right, pattern(nil))
        
            // handle by Total styles
        
            if "`byfirst'" != "" {
                quietly collect levels `by'
                collect style autolevels `by' .m `s(levels)', clear
            }
            if `"`bylabel'"' != "" {
                collect label levels `by' .m `"`bylabel'"', modify
            }
        
            collect layout ///
                (`contspec' `categorical' `allgroupnames' `binary') ///
                (`by'#result[col1 col2] result[p nobs])
        end
        
        program ParseCategorical, sclass
            syntax [varlist(default=none)] [, ZEROs(string)]
        
            if `:list sizeof varlist' == 0 {
                if `:list sizeof zeros' {
                    di as err ///
                    "option {bf:zeros()} requires categorical variables"
                    exit 198
                }
            }
        
            gettoken spec zeros : zeros , parse(" []")
            while `:length local spec' {
                capture noisily unab names : `spec'
                if c(rc) {
                    di as err "in option {bf:zeros()}"
                    exit c(rc)
                }
                foreach name of local names {
                    if `:list posof "`name'" in varlist' == 0 {
                        di as err "invalid {bf:zeros()} option"
                        di as err ///
                "variable {bf:`name'} not found in list of categorical variables"
                        exit 198
                    }
                }
                gettoken open zeros : zeros , parse(" []")
                if `"`open'"' == "" {
                    di as err "invalid {bf:zeros()} option"
                    di as err `"nothing found where {bf:[} expected"'
                    exit 198
                }
                if `"`open'"' != "[" {
                    di as err "invalid {bf:zeros()} option"
                    di as err `"{bf:`open'} found where {bf:[} expected"'
                    exit 198
                }
                gettoken tok zeros : zeros , parse(" []")
                while !inlist(`"`tok'"', "", "]") {
                    capture noisily confirm number `tok'
                    if c(rc) {
                        di as err "in option {bf:zeros()}"
                        exit c(rc)
                    }
                    foreach name of local names {
                        local ZEROS `ZEROS' `name'[`tok']
                    }
                    gettoken tok zeros : zeros , parse(" []")
                }
                if `"`tok'"' != "]" {
                    di as err "invalid {bf:zeros()} option"
                    di as err `"closing square bracket '{bf:]}' not found"'
                    exit 198
                }
                gettoken spec zeros : zeros , parse(" []")
            }
            sreturn local varlist `"`varlist'"'
            sreturn local zeros `"`ZEROS'"'
        end
        
        program ParseContinuous, sclass
            syntax [varlist(default=none)] [, RANGEs]
        
            if `:list sizeof varlist' == 0 {
                if `:list sizeof ranges' {
                    di as err ///
                    "option {bf:ranges} requires continuous variables"
                    exit 198
                }
            }
        
            sreturn local varlist `"`varlist'"'
            sreturn local ranges `"`ranges'"'
        end
        
        program ParseByOption, sclass
            syntax varname [, first label(string)]
            sreturn local by "`varlist'"
            sreturn local first `"`first'"'
            sreturn local label `"`label'"'
        end
        
        program ParseGroupOption, sclass
            syntax varlist , name(name) [label(string)]
            sreturn local varlist `"`varlist'"'
            sreturn local name `"`name'"'
            sreturn local label `"`label'"'
        end
        
        program CheckDupVars
            args vars opt1 opt2
        
            local k : list sizeof vars
            if `k' == 0 {
                exit
            }
            if `k' > 1 {
                local s s
            }
            if "`opt1'" == "`opt2'" {
                di as err ///
                "variable`s' duplicated in separate {bf:`opt1'} options"
            }
            else {
                di as err ///
                "variable`s' duplicated in options {bf:`opt1'} and {bf:`opt2'}"
            }
            di as err "{p}offending variable`s': {bf:`vars'}{p_end}"
            exit 198
        end
        
        program CheckDupNames
            gettoken first rest : 0
            if `:list posof "`first'" in rest' == 0 {
                exit
            }
            di as err "name {bf:`first'} used in more than one {bf:group()} option"
            exit 198
        end
        
        program CheckNameVarConflict
            args found opt
        
            local k : list sizeof found
            if `k' == 0 {
                exit
            }
            gettoken first : found
            di as err "{p}"
            di as err "{bf:grouped()} suboption {bf:name(`first')} is not allowed;{break}"
            di as err "variable {bf:`first'} was specified in option {bf:`opt'}"
            di as err "{p_end}"
            exit 198
        end
        
        mata:
        
        void mw_table_sort_cat_levels()
        {
            string  vector    levels
            real    vector    order
        
            // drop _hide before sorting so the sort indices match the levels
            levels = tokens(st_global("s(levels)"))
            levels = select(levels, levels :!= "_hide")
            order = order(strtoreal(levels)', 1)
            levels = levels[order]
            st_global("s(levels)", invtokens(levels))
        }
        
        end
        
        exit
        Using the data from my last example, I composed a do-file that runs some
        syntax and error message checks, adds some variable labels to verify that
        the new __CONT dimension picks them up, and calls mw_table with
        the new options for categorical variable levels with zero observations
        and ranges for continuous variables.

        Code:
        rcof "noisily mw_table, by(female) categorical(, zero(cat1[0])) " == 198
        rcof "noisily mw_table, by(female) categorical(cat?, zero(dude[0])) " == 111
        rcof "noisily mw_table, by(female) categorical(cat2, zero(cat1[0])) " == 198
        rcof "noisily mw_table, by(female) categorical(cat?, zero(cat1)) " == 198
        rcof "noisily mw_table, by(female) categorical(cat?, zero(cat1 foo)) " == 198
        rcof "noisily mw_table, by(female) categorical(cat?, zero(cat1[a])) " == 7
        rcof "noisily mw_table, by(female) categorical(cat?, zero(cat1[1)) " == 198
        
        label variable cont1 "Measure 1"
        label variable cont2 "Measure 2"
        label variable cont3 "Measure 3"
        
        mw_table, ///
            by(female, first label(Overall)) ///
            binary(bin?) ///
            categorical(cat?, ///
                zeros(cat1[2.5] cat2[0] cat3[9 16]) ///
            ) ///
            continuous(cont?, ranges) ///
            grouped(nib?, name(nibs) label(Group1 indicators)) ///
            grouped(bib?, name(bibs) label(Group2 indicators)) ///
            nformat(%6.1f mean sd min max iqr) ///
            nformat(%6.1f fvpercent percent) ///
            sformat("%s%%" fvpercent percent) ///
            sformat("(%s)" sd) ///
            name(mytable)
        Here is the resulting table. You'll notice that I enclose the range
        values in square brackets instead of using a dash. I'm reluctant to use a
        dash since continuous variables could have negative values (as in the
        example).
        Code:
        ---------------------------------------------------------------------------------------------------
                                 Overall                Male                 Female         p-value       N
        ---------------------------------------------------------------------------------------------------
        Measure 1                  5.0   (2.1)          4.5   (2.1)           5.5   (2.0)      0.00     956
        Range (IQR)          [1.0 9.0]   (3.5)    [1.0 8.0]   (3.6)     [2.0 9.0]   (3.5)
        Measure 2                  0.4   (3.1)         -0.1   (3.0)           0.9   (3.1)      0.00   1,000
        Range (IQR)        [-8.6 12.7]   (4.2)   [-8.2 8.6]   (4.2)   [-8.6 12.7]   (4.0)
        Measure 3                  0.0   (1.0)          0.1   (1.0)          -0.0   (1.1)      0.20   1,000
        Range (IQR)         [-3.0 3.0]   (1.4)   [-2.4 3.0]   (1.3)    [-3.0 2.8]   (1.4)
        
        cat1                                                                                   0.00   1,000
          1                         65    6.5%           65   13.0%             0    0.0%
          2                        414   41.4%          208   41.5%           206   41.3%
          2.5                        0    0.0%            0    0.0%             0    0.0%
          3                        521   52.1%          228   45.5%           293   58.7%
        
        cat2                                                                                   0.05   1,000
          0                          0    0.0%            0    0.0%             0    0.0%
          1                        272   27.2%          143   28.5%           129   25.9%
          2                        228   22.8%           99   19.8%           129   25.9%
          3                        235   23.5%          113   22.6%           122   24.4%
          4                        265   26.5%          146   29.1%           119   23.8%
        
        cat3                                                                                   0.24   1,000
          9                          0    0.0%            0    0.0%             0    0.0%
          11                       213   21.3%          104   20.8%           109   21.8%
          12                       194   19.4%           88   17.6%           106   21.2%
          13                       194   19.4%           98   19.6%            96   19.2%
          14                       198   19.8%           97   19.4%           101   20.2%
          15                       201   20.1%          114   22.8%            87   17.4%
          16                         0    0.0%            0    0.0%             0    0.0%
        
        Group1 indicators
          nib1                     290   29.0%          152   30.3%           138   27.7%      0.35   1,000
          nib2                     215   21.5%          143   28.5%            72   14.4%      0.00   1,000
          nib3                     264   26.4%          130   25.9%           134   26.9%      0.75   1,000
        
        Group2 indicators
          bib1                     710   71.0%          349   69.7%           361   72.3%      0.35   1,000
          bib2                     785   78.5%          358   71.5%           427   85.6%      0.00   1,000
          bib3                     736   73.6%          371   74.1%           365   73.1%      0.75   1,000
        
        bin1                       645   64.5%          284   56.7%           361   72.3%      0.00   1,000
        
        bin2                       496   49.6%          257   51.3%           239   47.9%      0.28   1,000
        ---------------------------------------------------------------------------------------------------



        • #19
          Thanks so much, Jeff. I've got all of this working perfectly.
