I'm playing with Stata 17's collect and I'm stumped on how to group together results from the same variable. I took a basic example from the manual that uses an NHANES dataset to create a summary table, and tried adding the frequency of non-missing values per variable in addition to the overall frequency.
The result is the following table (copied from Tables Builder):
Code:
use https://www.stata-press.com/data/r17/nhanes2l, clear
collect clear
table (var) (sex), statistic(frequency) ///
    statistic(fvfrequency diabetes) ///
    statistic(fvpercent diabetes) ///
    statistic(mean age bmi) ///
    statistic(sd age bmi) ///
    statistic(fvfrequency hlthstat) ///
    statistic(fvpercent hlthstat) ///
    statistic(mean bpsystol) ///
    statistic(sd bpsystol) ///
    statistic(count diabetes age bmi hlthstat bpsystol) ///
    nformat(%6.2f mean sd) miss
collect style header result, level(hide)
collect style row stack, nobinder spacer
collect style cell border_block, border(right, pattern(nil))
collect recode result fvfrequency=mean fvpercent=sd
collect recode result count=frequency
collect layout (var) (sex[1 2]#result)
collect style cell result[sd]#var[age bmi bpsystol], sformat("(%s)")
collect style cell result[sd]#var[diabetes hlthstat], sformat("%s%%")
collect style cell result[mean]#var[diabetes hlthstat], nformat(%4.0f)
collect preview
|                         | Sex   |        |         |        |        |         |
|                         | Male  |        |         | Female |        |         |
|                         | 4,915 |        |         | 5,436  |        |         |
| Diabetes status         |       |        |         |        |        |         |
|   Not diabetic          |       | 4698   | 95.58%  |        | 5152   | 94.81%  |
|   Diabetic              |       | 217    | 4.42%   |        | 282    | 5.19%   |
| Age (years)             | 4,915 | 47.42  | (17.17) | 5,436  | 47.72  | (17.26) |
| Body mass index (BMI)   | 4,915 | 25.51  | (4.02)  | 5,436  | 25.56  | (5.60)  |
| Health status           |       |        |         |        |        |         |
|   Excellent             |       | 1252   | 25.50%  |        | 1155   | 21.29%  |
|   Very good             |       | 1213   | 24.71%  |        | 1378   | 25.40%  |
|   Good                  |       | 1340   | 27.30%  |        | 1598   | 29.45%  |
|   Fair                  |       | 722    | 14.71%  |        | 948    | 17.47%  |
|   Poor                  |       | 382    | 7.78%   |        | 347    | 6.40%   |
| Systolic blood pressure | 4,915 | 132.89 | (20.99) | 5,436  | 129.07 | (25.13) |
| Diabetes status         | 4,915 |        |         | 5,434  |        |         |
| Health status           | 4,909 |        |         | 5,426  |        |         |
The count statistics for the continuous variables are grouped together with the other statistics, but those for the categorical variables (e.g. diabetes status and health status) are treated as separate entities and end up as extra rows at the bottom of the table. Does anyone know of an easy solution to this? Also, the statistic(frequency) option creates a _hide level that I can't seem to label. I'm still confused about how to handle the various dimensions in collect.
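In case it helps anyone diagnose this, here is a minimal sketch of how the collection could be inspected. It assumes the collection built by the code above is still in memory. My guess (an assumption, not something I have confirmed) is that the factor-variable statistics are stored under levels such as 1.diabetes while count is stored under plain diabetes, which would explain the separate rows; listing the levels should confirm or refute that.

```stata
* Sketch only: run after the table and collect commands above.
* List every dimension in the current collection.
collect dims
* List the levels of var; check whether diabetes and 1.diabetes
* both appear, and whether _hide is among them.
collect levelsof var
* List the levels of result that remain after the collect recode commands.
collect levelsof result
* Show the labels currently attached to each level of var.
collect label list var, all
```

These commands only report what is stored in the collection; they do not change the layout.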