  • Customizing table labels with 'collect label levels results' works as desired with summary statistics but not with ratio statistics

    Hello.

    I am using Stata 18 on macOS 10.15.7. The collect command is supposed to allow table labels to be modified without changing the variable labels in the dataset. This is described in the Stata blog post Customizing Tables part 2.

    Following the commands from the blog, which generate summary statistics (mean, sd) and ratio statistics (percent), I am unable to produce a table in which only the modified labels appear. Instead, I get the modified label plus the variable label from the dataset.
    However, if I restrict the code to summary statistics such as mean and SD, the customized labels work. Once I add a ratio statistic such as proportion or percent, both the customized labels and the variable labels appear.

    I am not sure if I am using collect incorrectly or it is not responding as it should.

    This is the code from the blog, slightly modified.

    To create the table :

    Code:
    webuse nhanes2l , clear
    collect clear
    collect dims
    
    table (sex) (highbp), ///
    statistic(frequency) ///
    statistic(percent) ///
    statistic(mean age) ///
    statistic(sd age) ///
    nototals
    This is the table as generated
    Code:
    -----------------------------------------------
                           |   High blood pressure
                           |          0           1
    -----------------------+-----------------------
    Sex                    |                       
      Male                 |                       
        Frequency          |      2,611       2,304
        Percent            |      25.22       22.26
        Mean               |                       
          Age (years)      |    42.8625    52.59288
        Standard deviation |                       
          Age (years)      |    16.9688    15.88326
      Female               |                       
        Frequency          |      3,364       2,072
        Percent            |      32.50       20.02
        Mean               |                       
          Age (years)      |   41.62366    57.61921
        Standard deviation |                       
          Age (years)      |   16.59921    13.25577
    -----------------------------------------------
    I can modify the labels by :

    Code:
    collect label list result, all
    collect label levels result frequency "Freq." ///
        mean      "Mean (Age)" ///
        percent   "Percent" ///
        sd         "SD (Age)" ///
        , modify
    This shows the desired labels correctly :

    Code:
    collect label list result, all
    The resulting table is not as expected, because each label becomes the collect label plus the variable label: Mean (Age) Age (years) and SD (Age) Age (years).

    Code:
    . collect label list result
    
      Collection: Table
       Dimension: result
           Label: Result
    Level labels:
            mean  Mean (Age)
         percent  Percent
              sd  SD (Age)
    
    . collect preview
    
    ------------------------------------------
                      |   High blood pressure
                      |          0           1
    ------------------+-----------------------
    Sex               |                       
      Male            |                       
        Percent       |      25.22       22.26
        Mean (Age)    |                       
          Age (years) |    42.8625    52.59288
        SD (Age)      |                       
          Age (years) |    16.9688    15.88326
      Female          |                       
        Percent       |      32.50       20.02
        Mean (Age)    |                       
          Age (years) |   41.62366    57.61921
        SD (Age)      |                       
          Age (years) |   16.59921    13.25577
    ------------------------------------------
    The variable label is seen here

    Code:
    describe age
    
    Variable      Storage   Display    Value
        name         type    format    label      Variable label
    ------------------------------------------------------------------------------------------------------------------------------------
    age             byte    %9.0g                 Age (years)
    If I restrict the code to only summary statistics and not ratio statistics, I get the desired labels : Mean (Age) and SD (Age)

    Code:
    webuse nhanes2l , clear
    collect clear
    collect dims
    
    qui table (sex) (highbp), ///
        statistic(mean age) ///
        statistic(sd age) ///
        nototals
    
    collect label levels result ///
        mean      "Mean (Age)" ///
        sd         "SD (Age)" , modify
    The labels are correctly modified and the resulting table has the correct labels.

    Code:
    . collect label list result
      Collection: Table
       Dimension: result
           Label: Result
    Level labels:
            mean  Mean (Age)
              sd  SD (Age)
    
    . collect preview
    
    ---------------------------------------
                   |   High blood pressure
                   |          0           1
    ---------------+-----------------------
    Sex            |                       
      Male         |                       
        Mean (Age) |    42.8625    52.59288
        SD (Age)   |    16.9688    15.88326
      Female       |                       
        Mean (Age) |   41.62366    57.61921
        SD (Age)   |   16.59921    13.25577
    ---------------------------------------
    I can reproduce the problem by adding any ratio statistic, this time statistic(proportion).
    The combined collect labels and variable labels, Mean (Age) Age (years) and SD (Age) Age (years), appear again in the table.

    Code:
    webuse nhanes2l , clear
    collect clear
    collect dims
    
    qui table (sex) (highbp), ///
        statistic(proportion) /// /*THIS WAS ADDED*/
        statistic(mean age) ///
        statistic(sd age) ///
        nototals
    
    collect label levels result ///
        mean      "Mean (Age)" ///
        sd         "SD (Age)" , modify
    You can see that the labels are modified correctly.
    Code:
    collect label list result
    
      Collection: Table
       Dimension: result
           Label: Result
    Level labels:
            mean  Mean (Age)
      proportion  Proportion
              sd  SD (Age)
    However, the table once again has both the collect labels and the appended variable labels.

    Code:
    collect preview
    
    ------------------------------------------
                      |   High blood pressure
                      |          0           1
    ------------------+-----------------------
    Sex               |                       
      Male            |                       
        Proportion    |      .2522       .2226
        Mean (Age)    |                       
          Age (years) |    42.8625    52.59288
        SD (Age)      |                       
          Age (years) |    16.9688    15.88326
      Female          |                       
        Proportion    |       .325       .2002
        Mean (Age)    |                       
          Age (years) |   41.62366    57.61921
        SD (Age)      |                       
          Age (years) |   16.59921    13.25577
    ------------------------------------------
    I don't see why it should behave differently depending on the type of statistic requested. Is there a way to get only the desired collect labels, without the appended variable labels?

  • #2
    The table command shows the variable names (labels) in the header when more than one variable is specified, or when you combine a statistic that requires a variable with a statistic that does not. In your example, the statistics mean and sd require a variable name, but the statistic proportion is specified without one.
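
    The rule above can be checked with a short sketch. It assumes the nhanes2l dataset used in this thread; bmi is a second variable from that dataset that does not appear in the original post and is used here only for illustration.

```stata
webuse nhanes2l, clear

* Case 1: one variable, and every statistic requires a variable
table (sex) (highbp), statistic(mean age) statistic(sd age) nototals

* Case 2: more than one variable specified
table (sex) (highbp), statistic(mean age bmi) nototals

* Case 3: a statistic without a variable combined with one that requires a variable
table (sex) (highbp), statistic(frequency) statistic(mean age) nototals
```

    If the rule holds as described, only the first table should leave the variable labels out of its row header.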

    You can hide the levels of dimension var from the header.
    Code:
    collect style header var, level(hide)
    Here is the resulting table
    Code:
    ---------------------------------------
                   |   High blood pressure 
                   |          0           1
    ---------------+-----------------------
    Sex            |                       
      Male         |                       
        Freq.      |      2,611       2,304
        Percent    |      25.22       22.26
        Mean (Age) |    42.8625    52.59288
        SD (Age)   |    16.9688    15.88326
      Female       |                       
        Freq.      |      3,364       2,072
        Percent    |      32.50       20.02
        Mean (Age) |   41.62366    57.61921
        SD (Age)   |   16.59921    13.25577
    ---------------------------------------
    This is the full set of commands I used to produce the above table.
    Code:
    webuse nhanes2l , clear
    
    table (sex) (highbp), ///
    statistic(frequency) ///
    statistic(percent) ///
    statistic(mean age) ///
    statistic(sd age) ///
    nototals
    
    collect label list result, all
    collect label levels result frequency "Freq." ///
        mean      "Mean (Age)" ///
        percent   "Percent" ///
        sd         "SD (Age)" ///
        , modify
    collect label list result, all
    
    * use this command to see what dimensions are specified in the layout
    collect layout
    
    * hide the levels of var from the header
    collect style header var, level(hide)
    collect preview
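
    To see where the extra header rows come from, the var dimension can be inspected directly. This is a minimal sketch, assuming it is run right after the commands above, so that the current collection is the one that table created.

```stata
* list every dimension in the current collection; var should be among them
collect dims

* list the levels of the var dimension
collect levelsof var

* list the labels attached to the levels of var
collect label list var, all
```

    The labels listed for var should match the variable labels in the dataset, which would explain why relabeling the levels of result leaves them unchanged.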


    • #3
      Thank you Jeff Pitblado (StataCorp) for the explanation and pointing that out. I am able to recreate that table now following the code you provided.
