Table with count, percent for categories and mean & standard deviation for continuous data using Stata17 collect commands & Table Builder

George Brogmus

Join Date: Sep 2017
Posts: 9

Table with count, percent for categories and mean & standard deviation for continuous data using Stata17 collect commands & Table Builder

02 Oct 2021, 13:52

I just purchased Stata 17 and was very excited to start using the Table Builder. It's very straightforward for simple tables, but I'd like to create tables that mix both count/percent data for categorical variables with mean and standard deviation for continuous data. If I can successfully create a table that looks like the following, I can create all the tables I need to. Obviously, I could create the simple table below by running sum or tab for each variable and then manually copying the data into a Word or Excel table, but I'd really like to use the flexibility that seems to be available in the collect command and corresponding Table Builder. Unfortunately, I'm a novice user and would appreciate some help to create the correct structure of commands for my do file. I've read through about the first 30 pages of the STATACUSTOMIZABLE TABLES ANDCOLLECTED RESULTS REFERENCE MANUAL RELEASE 17, but still need help. Here's the example table I'd like to create (instead of values in the table, I've just used N, %, Mean, and SD):

[survey]	Baseline		Year 1		Year 2
[group]	Lifting	Bending	Lifting	Bending	Lifting	Bending
Females	N (%)	N (%)	N (%)	N (%)	N (%)	N (%)
Males	N (%)	N (%)	N (%)	N (%)	N (%)	N (%)
Total	N (%)	N (%)	N (%)	N (%)	N (%)	N (%)
Age	Mean (SD)	Mean (SD)	Mean (SD)	Mean (SD)	Mean (SD)	Mean (SD)
Language
English	N (%)	N (%)	N (%)	N (%)	N (%)	N (%)
Spanish	N (%)	N (%)	N (%)	N (%)	N (%)	N (%)
Hours Sleep	Mean (SD)	Mean (SD)	Mean (SD)	Mean (SD)	Mean (SD)	Mean (SD)
Stress Past Week
None	N (%)	N (%)	N (%)	N (%)	N (%)	N (%)
Some	N (%)	N (%)	N (%)	N (%)	N (%)	N (%)
Most	N (%)	N (%)	N (%)	N (%)	N (%)	N (%)

If I can create the code needed in a Do file, I think I'd be able to use the Table Builder to modify borders, alignment, fonts, etc.

A sample of my data (in long form for this longitudinal study):

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input int idnum byte(group sex survey language stress) double sleep byte age
 1 2 2 0 1 0    5 56
 1 2 2 1 . .    .  .
 1 2 2 2 . .    .  .
 2 2 1 0 1 0    9 45
 2 2 1 1 1 0  6.5 47
 2 2 1 2 1 1    . 48
 3 2 1 0 1 0  6.5 49
 3 2 1 1 . .    .  .
 3 2 1 2 1 0    6 52
11 2 1 0 1 0    6 57
11 2 1 1 . .    .  .
11 2 1 2 . .    .  .
12 2 2 0 1 1    . 53
12 2 2 1 1 1    5 54
12 2 2 2 1 1  4.5 55
18 2 1 0 1 1    6 49
18 2 1 1 1 0    6 50
18 2 1 2 1 0    5 51
19 2 1 0 1 0    6 55
19 2 1 1 1 0    5 57
19 2 1 2 1 0    6 57
30 2 1 0 1 1    4 32
30 2 1 1 1 1    5 33
30 2 1 2 1 1    7 34
31 2 2 0 2 0    . 39
31 2 2 1 2 0    7 20
31 2 2 2 1 0    7 42
32 2 1 0 2 0    7 47
32 2 1 1 2 .    8 48
32 2 1 2 2 0    7 49
33 2 1 0 2 1    5 40
33 2 1 1 1 2    6 41
33 2 1 2 1 0    1 42
34 2 1 0 1 1  5.5 29
34 2 1 1 1 0    6 30
34 2 1 2 . .    .  .
35 2 2 0 2 .    7 62
35 2 2 1 2 0    7 62
35 2 2 2 2 0    8 63
36 2 2 0 2 0    4 40
36 2 2 1 2 0  6.5 43
36 2 2 2 2 .    6 43
37 2 1 0 2 2  7.5 40
37 2 1 1 1 1    8 43
37 2 1 2 1 1    8 43
38 2 2 0 1 1    7 24
38 2 2 1 1 0    5 25
38 2 2 2 1 0    6 26
39 2 2 0 1 1  6.5 40
39 2 2 1 1 0    7 41
39 2 2 2 1 0    6 42
40 2 2 0 1 1    6 30
40 2 2 1 . .    .  .
40 2 2 2 . .    .  .
41 2 1 0 2 .    7 47
41 2 1 1 2 0  8.5 48
41 2 1 2 2 .    . 49
42 2 1 0 1 0    7 48
42 2 1 1 1 0    . 49
42 2 1 2 1 0    . 50
43 2 2 0 2 .   10 51
43 2 2 1 . .    .  .
43 2 2 2 2 .    8 53
44 2 2 0 1 1    7 39
44 2 2 1 1 0  6.5 61
44 2 2 2 1 0    6 61
45 2 1 0 1 0  7.5 29
45 2 1 1 . .    .  .
45 2 1 2 1 2    7 31
46 2 2 0 1 1    8 52
46 2 2 1 . .    .  .
46 2 2 2 . .    .  .
47 2 1 0 1 0    8 21
47 2 1 1 . .    .  .
47 2 1 2 . .    .  .
48 2 2 0 2 .    8 47
48 2 2 1 2 0    8 48
48 2 2 2 2 0    2 49
49 2 1 0 1 0    6 54
49 2 1 1 2 0    7 55
49 2 1 2 1 0    5 56
50 2 2 0 2 0    . 48
50 2 2 1 2 .    . 49
50 2 2 2 2 0    . 50
59 2 1 0 1 1    7 37
59 2 1 1 1 1    7 38
59 2 1 2 1 1    7 39
65 2 1 0 1 0  6.5 34
65 2 1 1 1 0    9 35
65 2 1 2 . .    .  .
66 2 1 0 1 1    7 56
66 2 1 1 . .    .  .
66 2 1 2 1 1    5 58
67 2 1 0 1 2 6.75 46
67 2 1 1 . .    .  .
67 2 1 2 1 0    8 48
68 2 1 0 1 0   10 76
68 2 1 1 1 0    7 77
68 2 1 2 1 0    8 78
end
label values group GroupLabel
label def GroupLabel 2 "Bending", modify
label values sex MaleFemaleLabel
label def MaleFemaleLabel 1 "Male", modify
label def MaleFemaleLabel 2 "Female", modify
label values survey SurveyLabel
label def SurveyLabel 0 "Baseline Survey", modify
label def SurveyLabel 1 "Year1 Survey", modify
label def SurveyLabel 2 "Year2 Survey", modify
label values language LanguageLabel
label def LanguageLabel 1 "English", modify
label def LanguageLabel 2 "Spanish", modify
label values stress PastWkStressLabel
label def PastWkStressLabel 0 "None", modify
label def PastWkStressLabel 1 "Some", modify
label def PastWkStressLabel 2 "Most", modify

Note: If language is missing (.) then that subject did not complete the Survey for that year. All subjects completed (at least) the Baseline survey.

I started with the Do file code:

collect clear
sort survey group sex
collect: by survey group: tab sex if language!=.
collect: by survey group: sum age if language!=.
collect: by survey group: tab language if language!=.
collect: by survey group: sum sleep if language!=.
collect: by survey group: tab stress if language!=.

But that's as far as I got.

Any help would be appreciated!

Kind Regards,
George Brogmus

Tags: table, Table Builder

George Brogmus

Join Date: Sep 2017
Posts: 9

02 Oct 2021, 18:23

I did a bit more reading (Chuck Huber's Blog articles). The following gets me close, but I can't figure out how to get the N and % and means and sd's next to each other.

table (var) (survey group) if language!=., statistic(fvfrequency sex) statistic(fvpercent sex) statistic(mean age) statistic(sd age) statistic(fvfrequency language) statistic(fvpercent language) statistic(mean sleep) statistic(sd sleep) statistic(fvfrequency stress) statistic(fvpercent stress) nototals nformat(%9.0fc fvfrequency fvpercent) sformat("%s%%" fvpercent) nformat(%6.2f mean sd) sformat("(%s)" sd)

collect label levels result fvfrequency "N" fvpercent "Percent" sd "SD", modify

collect style putdocx, layout(autofitcontents) title("Table 1: Descriptive Statistics")
collect export MyTableClose1.docx, as(docx) replace

The result:

Table 1: Descriptive Statistics
	Survey 0(Baseline), 1(Year1), 2(Year2)
	Baseline Survey		Year1 Survey		Year2 Survey
	Intervention-Sham Group		Intervention-Sham Group		Intervention-Sham Group
	Lifting	Bending	Lifting	Bending	Lifting	Bending
Sex=Male
N	48	40	28	23	33	27
Percent	62%	50%	62%	47%	59%	47%
Sex=Female
N	29	40	17	26	23	30
Percent	38%	50%	38%	53%	41%	53%
Age (yrs)
Mean	48.74	46.24	49.20	46.22	51.70	49.28
SD	(9.33)	(10.33)	(9.41)	(10.74)	(8.78)	(9.11)
Language 1(English), 2(Spainish)=English
N	47	52	33	33	36	37
Percent	61%	65%	73%	67%	64%	65%
Language 1(English), 2(Spainish)=Spainish
N	30	28	12	16	20	20
Percent	39%	35%	27%	33%	36%	35%
Sleep Time (hrs) on Average per 24hrs
Mean	6.43	6.73	7.22	6.76	6.81	6.51
SD	(1.55)	(1.74)	(1.94)	(1.47)	(1.64)	(1.61)
Stress level during past week=None
N	46	43	30	30	34	30
Percent	68%	58%	68%	65%	63%	59%
Stress level during past week=Some
N	16	25	9	13	13	17
Percent	24%	34%	20%	28%	24%	33%
Stress level during past week=Most
N	6	6	5	3	7	4
Percent	9%	8%	11%	7%	13%	8%

Thank you!
George Brogmus

Comment

George Brogmus

Join Date: Sep 2017

Posts: 9
#3

03 Oct 2021, 00:54

This got me a bit closer, but still don't understand how to get the N & % and Mean & SD on the same line.

table (var) (survey group) if language!=., statistic(fvfrequency sex) statistic(fvpercent sex) statistic(mean age) statistic(sd age) statistic(fvfrequency language) statistic(fvpercent language) statistic(mean sleep) statistic(sd sleep) statistic(fvfrequency stress) statistic(fvpercent stress) nototals nformat(%9.0fc fvfrequency fvpercent) sformat("%s%%" fvpercent) nformat(%6.2f mean sd) sformat("(%s)" sd)

collect label levels result fvfrequency "N" fvpercent "Percent" sd "SD", modify

collect style header survey, title(hide)

collect style header group, title(hide)

collect label levels survey 0 "Baseline" 1 "Year1" 2 "Year2", modify

collect label levels group 1 "Sham" 2 "Treatment", modify
Comment
Rich Goldstein

Join Date: Mar 2014

Posts: 4466
#4

03 Oct 2021, 06:18

I'm not sure I completely follow all of the above but I think that what you want is shown in Chuck's 3rd blog and relies on using "collect recode result" where you follow that with something like "mean = column 1" and "sd = column 2" - as I say, a full example is in Chuck's third blog
"the classic table 1"
Comment
George Brogmus

Join Date: Sep 2017

Posts: 9
#5

03 Oct 2021, 11:08

Thank you Rich. I'll read over that in more detail. The first time I tried the column1, column2 code it didn't work. I think I've got to have a better understanding of Dimensions, Levels and Tags and then perhaps I'll be able to structure the code properly. I'll work on that later today and when I get it to work I'll post it.
Comment

Announcement

Table with count, percent for categories and mean & standard deviation for continuous data using Stata17 collect commands & Table Builder

Comment

Comment

Comment

Comment