  • grouping variables/dims using collect

    I'm experimenting with Stata 17's collect and am stuck on grouping together results that come from the same variable. I took a basic example from the manual that uses an NHANES dataset to create a summary table, and tried adding the frequency of non-missing values for each variable in addition to the overall frequency.

    Code:
    use https://www.stata-press.com/data/r17/nhanes2l, clear
    collect clear
    table (var) (sex), ///
        statistic(frequency) ///
        statistic(fvfrequency diabetes) statistic(fvpercent diabetes) ///
        statistic(mean age bmi) statistic(sd age bmi) ///
        statistic(fvfrequency hlthstat) statistic(fvpercent hlthstat) ///
        statistic(mean bpsystol) statistic(sd bpsystol) ///
        statistic(count diabetes age bmi hlthstat bpsystol) ///
        nformat(%6.2f mean sd) miss
    collect style header result, level(hide)
    collect style row stack, nobinder spacer
    collect style cell border_block, border(right, pattern(nil))
    collect recode result fvfrequency=mean fvpercent=sd
    collect recode result count=frequency
    collect layout (var) (sex[1 2]#result)
    collect style cell result[sd]#var[age bmi bpsystol], sformat("(%s)")
    collect style cell result[sd]#var[diabetes hlthstat], sformat("%s%%")
    collect style cell result[mean]#var[diabetes hlthstat], nformat(%4.0f)
    collect preview
    The result is the following table (copied from Tables Builder):
                                                       Sex
                                         Male                      Female
                                4,915                       5,436
    Diabetes status
      Not diabetic                        4698    95.58%              4698    94.81%
      Diabetic                             217     4.42%               282     5.19%
    Age (years)                 4,915    47.42   (17.17)    5,436    47.72   (17.26)
    Body mass index (BMI)       4,915    25.51    (4.02)    5,436    25.56    (5.60)
    Health status
      Excellent                           1252    25.50%              1155    21.29%
      Very good                           1213    24.71%              1378    25.40%
      Good                                1340    27.30%              1598    29.45%
      Fair                                 722    14.71%               948    17.47%
      Poor                                 382     7.78%               347     6.40%
    Systolic blood pressure     4,915   132.89   (20.99)    5,436   129.07   (25.13)
    Diabetes status             4,915                       5,434
    Health status               4,909                       5,426

    The count statistics for the continuous variables are grouped with their other statistics, but those for the categorical variables (e.g., diabetes and health status) are treated as separate entities. Does anyone know of an easy solution to this? Also, the statistic(frequency) option creates a _hide that I can't seem to label. I'm still confused about how to handle the various dims in collect.
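
    One way to see how collect has organized these results is to list the dimensions and their levels before setting a layout. The following is only a minimal sketch, assuming the collection created by the table command above is still the current collection; it inspects the collection and changes nothing.

    ```stata
    * List every dimension in the current collection
    collect dims

    * List the levels of the two dimensions used in the layout above
    collect levelsof var
    collect levelsof result

    * List the labels attached to those levels
    * (the all option also lists levels that have no label)
    collect label list var, all
    collect label list result, all
    ```

    Comparing the levels of var reported for the continuous variables with those reported for the factor variables may help show why the rows end up where they do.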

  • #2
    I haven't found a solution to this yet, and I'm wondering whether anyone from StataCorp can chime in.

    • #3
      I'm not at my computer right now, but I think the following lines are the source of your problem. I'm not clear on exactly what layout you want but I assume you want all counts to be in the same column. Unfortunately, when you recoded the tags, you forced them into two distinct levels.

      Code:
       
       collect recode result fvfrequency=mean fvpercent=sd
       collect recode result count=frequency
      Here the counts from continuous variables are tagged with -frequency- while those from factor variables are tagged -mean-. Aligning them requires them to be tagged the same for layout purposes.

      Also _hide is a directive about whether to show level labels or not, and not something to label per se.
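
      A rough way to check which tags each cell carries, offered as a sketch rather than a tested fix: restrict the layout to one group of result levels at a time and see which rows of var remain. This assumes the two collect recode commands from post #1 have already been run, so the result levels are frequency, mean, and sd.

      ```stata
      * Show only the cells tagged result[frequency] (the counts)
      collect layout (var) (sex[1 2]#result[frequency])

      * Show only the cells tagged result[mean] or result[sd]
      collect layout (var) (sex[1 2]#result[mean sd])

      * Restore the layout used in post #1
      collect layout (var) (sex[1 2]#result)
      ```

      Rows that appear in one preview but not the other are levels of var that carry only one kind of result, which is where the table would be expected to split.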

      • #4
        Leonardo, thank you for the reply. I believe the first recode line takes the frequency counts of the factor levels and places them under mean, so that they are grouped in the same column. So the third and fourth columns in the table above show the mean and sd for continuous variables, and the frequency and % for categorical variables. So at least presentation-wise, it is doing what I want. The second recode takes the frequency of non-missing values of each variable and groups it under the count column, which it is also doing.

        What's weird is that for the continuous variables (age, BMI), the count (recoded to frequency), mean, and sd are all grouped together on the same row. For the categorical variables (diabetes, health status), the counts are not grouped with the mean and sd.
