How to use -fsum- with different groups

Zhang_Lu

Join Date: Oct 2014

Posts: 155
#1

15 Dec 2014, 02:05

The title may be unclear, but the idea is simple. The built-in command -tabstat- cannot display variable labels. -fsum- can, but it has no by() option, which in -tabstat- "specifies that the statistics be displayed separately for each unique value of varname". -fsum- does have an option named cat(), which confused me. Is it the answer to my problem?
I tried the following commands:
char a2018[tlabel] "a2018 (% local_Hukou)"
fsum a2018 a3012 a3020 a3022,stat(mean sd) cat(a2018) uselabel
Here Local_Hukou is a previously defined value label for a2018. But the result is not what I want: Local_Hukou appears in the first column like the other labels, rather than in the first row as it does with -tabstat- and the by() option. Should I replace a2018 in the first line with Local_Hukou? I don't know, and the example in the help file of -fsum- confuses me:
char sex[tlabel] "Sex (% male)"
fsum age sex ethorig pcs,mcat(ethorig) cat(sex)
I'm not familiar with the char stuff. Could someone give me a clue? Thanks a lot.

Last edited by Zhang_Lu; 15 Dec 2014, 02:07.
Fred Wolfe

Join Date: Mar 2014

Posts: 10
#2

15 Dec 2014, 04:15

I wrote -fsum- many years ago, maybe 10 years ago. I think it is a great command for my needs, and I almost always use it as a substitute for Stata's -summ-.

-fsum- does support -by-. Somehow I left this option out of the help file, perhaps because there was a little bug in it.

. bys sex: fsum age edlevel hsgrad

-----------------------------------------------------------------------------------------------------------------------------
-> sex = Female

Variable                  |        N     Mean       SD      Min      Max
--------------------------+---------------------------------------------
Age (years)               |   271594    62.53    13.03     9.31   103.77
Education (years)         |   271594    13.67     2.32     0.00    17.00
High school graduate (%)  |   271594    92.01

-----------------------------------------------------------------------------------------------------------------------------
-> sex = Male

Variable                  |        N     Mean       SD      Min      Max
--------------------------+---------------------------------------------
Age (years)               |    62812    65.86    11.58     7.19   102.92
Education (years)         |    62812    13.78     2.50     0.00    17.00
High school graduate (%)  |    62812    90.44

-----------------------------------------------------------------------------------------------------------------------------
-> sex = .

Variable                  |        N     Mean       SD      Min      Max
--------------------------+---------------------------------------------
Age (years)               |        0
Education (years)         |        0
High school graduate (%)  |        0
.

I never was able to get rid of the missing (.) category. If someone wants to take up the work on this command and fix this problem, I'd welcome it. I'll fix the help file after a while.

-tlabel- came from the idea that there should be, optionally, several kinds of label: the regular Stata label and another (tlabel) that is formatted for other readers and/or publication. As an example for the variable sex:

sex: label: Gender tlabel: Gender (% male) clabel: sex value label: sexlabel

Notice that the tlabel provides additional information. If -fsum- sees a "%" in the label, it assumes the result should be displayed as a percentage and multiplies it by 100.
BTW, the cat() option tells -fsum- that the variable is a categorical variable and that the results should be displayed for each category.
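
To make the mechanics concrete, here is a minimal sketch of how a tlabel can be stored and read back as a variable characteristic, using the same -char- syntax as in post #1. The auto dataset, the variable foreign, the label text, and the "%" check with strpos() are illustrative assumptions only; this is not -fsum-'s actual source code.

```stata
* Sketch only: a "tlabel" is an ordinary variable characteristic.
* Dataset and variable are assumptions chosen for illustration.
sysuse auto, clear

* Attach a tlabel to the 0/1 variable foreign
char foreign[tlabel] "Foreign (% foreign)"

* List all characteristics attached to foreign
char list foreign[]

* Read the tlabel back into a local macro
local tl : char foreign[tlabel]
display "`tl'"

* Assumed logic: if the tlabel contains "%", report the mean times 100
if strpos("`tl'", "%") > 0 {
    quietly summarize foreign
    display "`tl': " %5.2f 100*r(mean)
}
```

Characteristics are saved with the dataset, so a tlabel defined this way persists after -save-.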

Making tlabels is kind of a pain, so I wrote a program to do this simply. I always use this program instead of Stata's label ... It is called nlabel. I guess I should post this to the archive soon.
-nlabel- is a perfect labeling command for implementing standard labels or tlabels without extra work. Here are some examples of -nlabel- at work.

gen ss= depev + fatigsev + muspain + cogsever + insomnia
nlabel ss, label(Symptom score) ct k
gen ppsd = jointsum + ss
nlabel ppsd, label(NHIS PSD tottal adhoc) ct k

gen notwork = inlist(wrklyr4,1,2,3) if wrklyr4 <7
nlabel notwork, l(Not working) a((%)) k

It is just about the same as -label ...-

Here is an example of how it works:

. des sex

              storage   display    value
variable name   type    format     label      variable label
-----------------------------------------------------------------------------------------------------------------------------
sex             byte    %8.0g      sex        Gender

. nlabel sex, l(Gender) a((% male))

. nlabel sex
variable: sex label: Gender tlabel: Gender (% male) clabel: value label: sex

. fsum sex,v

Variable | N Mean SD Min Max
-----------------+---------------------------------------------
Gender (% male) | 334406 18.78 sex

Until I get a chance to post -nlabel- to the SSC, I can send it to people via email.

Fred
Zhang_Lu

Join Date: Oct 2014

Posts: 155
#3

16 Dec 2014, 19:27

Hey Fred, thanks for your valuable clarification. However, my question may have been ambiguous. What I want is something like this:

You can see that the categorical variable is in the heading row and the other variables' labels are in the first column. Using -tabstat- I can basically mimic it, though I can only show the variable names rather than the labels, as I mentioned. I believe -fsum- is an enhancement of -tabstat-, so can I do this in -fsum- with some easy option like by() in -tabstat-? The by() I mean is not the prefix bys: I want a single panel, not the separate panels you illustrated. Thank you so much.
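
For reference, a minimal sketch of the kind of -tabstat- call with by() described here. It uses Stata's shipped auto dataset rather than the poster's data, so the variables price, mpg, weight and foreign are stand-ins, not the a2018/a3012 variables from post #1.

```stata
* Sketch only: grouped statistics in a single table with tabstat's by().
* Dataset and variables are stand-ins chosen for illustration.
sysuse auto, clear

* One table, with a block of rows (mean, sd) per category of foreign
tabstat price mpg weight, by(foreign) statistics(mean sd) nototal

* The variables do carry labels, but tabstat prints their names
describe price mpg weight
```

Unlike the bys prefix, by() here produces one table rather than a separate panel per group.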