  • How to use -fsum- by groups

    The title may be unclear, but the idea is simple. The built-in command -tabstat- cannot display variable labels. -fsum- can, but it seems to have no by() option, which in -tabstat- "specifies that the statistics be displayed separately for each unique value of varname". -fsum- does have an option named cat(), which confuses me. Is it the answer to my problem?
    I tried the following commands:
    char a2018[tlabel] "a2018 (% local_Hukou)"
    fsum a2018 a3012 a3020 a3022,stat(mean sd) cat(a2018) uselabel

    Here Local_Hukou is a previously defined value label for a2018. The result is not what I want: Local_Hukou appears in the first column like the other labels, rather than in the heading row as it does with -tabstat- and its by() option. Should I replace a2018 in the first line with Local_Hukou? I don't know, and the example in the help file of -fsum- confuses me:
    char sex[tlabel] "Sex (% male)"
    fsum age sex ethorig pcs,mcat(ethorig) cat(sex)

    I'm not familiar with the char stuff. Could someone give me a clue? Thanks a lot.
    Last edited by Zhang_Lu; 15 Dec 2014, 02:07.

  • #2
    I wrote -fsum- many years ago, maybe 10 years ago. I think it is a great command for my needs, and I almost always use it as a substitute for Stata's -summ-.

    -fsum- does support -by-. Somehow I left this option out of the help file, perhaps because there was a little bug in it.

    . bys sex: fsum age edlevel hsgrad

    -----------------------------------------------------------------------------------------------------------------------------
    -> sex = Female

    Variable                  |        N     Mean       SD      Min      Max
    --------------------------+---------------------------------------------
    Age (years)               |   271594    62.53    13.03     9.31   103.77
    Education (years)         |   271594    13.67     2.32     0.00    17.00
    High school graduate (%)  |   271594    92.01

    -----------------------------------------------------------------------------------------------------------------------------
    -> sex = Male

    Variable                  |        N     Mean       SD      Min      Max
    --------------------------+---------------------------------------------
    Age (years)               |    62812    65.86    11.58     7.19   102.92
    Education (years)         |    62812    13.78     2.50     0.00    17.00
    High school graduate (%)  |    62812    90.44

    -----------------------------------------------------------------------------------------------------------------------------
    -> sex = .

    Variable                  |        N     Mean       SD      Min      Max
    --------------------------+---------------------------------------------
    Age (years)               |        0
    Education (years)         |        0
    High school graduate (%)  |        0
    .


    I never was able to get rid of the missing (.) category. If someone wants to take up the work on this command and fix this problem, I'd welcome it. I'll fix the help file after a while.


    -tlabel- came from the idea that there should be, optionally, several kinds of label: the regular Stata label and another (tlabel) that is formatted for other readers and/or publication. As an example, for the variable sex:

    sex: label: Gender tlabel: Gender (% male) clabel: sex value label: sexlabel

    Notice that the tlabel provides additional information. If -fsum- sees a "%" in the label, it assumes the result should be displayed as a percentage and multiplies it by 100.
    BTW, the cat() option tells -fsum- that the variable is a categorical variable and that the results should be displayed for each category.
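
    For readers unfamiliar with characteristics, here is a minimal sketch of the mechanism using only built-in Stata commands. It assumes the auto dataset that ships with Stata, and the label text "Car origin (% foreign)" is made up for illustration. The last part only mimics the "%" rule by hand; it is not -fsum-'s actual code.

```stata
* Sketch only: uses the auto dataset shipped with Stata, not the data from this thread.
sysuse auto, clear

* Store a table label in a characteristic named tlabel
* (same pattern as: char sex[tlabel] "Sex (% male)").
* The label text here is a made-up example.
char foreign[tlabel] "Car origin (% foreign)"

* List the characteristics attached to the variable, then read one into a local macro.
char list foreign[]
local tl : char foreign[tlabel]

* Mimic the "%" rule by hand: if the label contains "%",
* show the mean of the 0/1 variable multiplied by 100.
quietly summarize foreign
if strpos(`"`tl'"', "%") > 0 {
    display as text `"`tl'"' ": " as result %6.2f 100*r(mean)
}
else {
    display as text `"`tl'"' ": " as result %6.2f r(mean)
}
```

    Because a characteristic is saved with the dataset, a tlabel set this way should still be there after the file is saved and reloaded.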

    Making tlabels is kind of a pain, so I wrote a program to do this simply. I always use this program instead of Stata's label ... It is called -nlabel-. I guess I should post this to the archive soon.
    -nlabel- is a labeling command that sets standard labels or tlabels without extra work. Here are some examples of -nlabel- at work.

    gen ss= depev + fatigsev + muspain + cogsever + insomnia
    nlabel ss, label(Symptom score) ct k
    gen ppsd = jointsum + ss
    nlabel ppsd, label(NHIS PSD tottal adhoc) ct k

    gen notwork = inlist(wrklyr4,1,2,3) if wrklyr4 <7
    nlabel notwork, l(Not working) a((%)) k

    It is just about the same as -label ...-

    Here is an example of how it works:



    . des sex

                  storage   display    value
    variable name   type    format     label      variable label
    ------------------------------------------------------------------------
    sex             byte    %8.0g      sex        Gender

    . nlabel sex, l(Gender) a((% male))

    . nlabel sex
    variable: sex label: Gender tlabel: Gender (% male) clabel: value label: sex

    . fsum sex,v

    Variable         |        N     Mean       SD      Min      Max
    -----------------+---------------------------------------------
    Gender (% male)  |   334406    18.78                             sex


    Until I get a chance to post -nlabel- to the SSC, I can send it to people via email.

    Fred
    [email protected]




    • #3
      Hey, Fred, thanks for your valuable clarification. However, my question may have been ambiguous. What I want is something like this:

      You see that the categorical variable is in the heading row and the other variables' labels are in the first column. Using -tabstat- I can basically mimic this, though, as I mentioned, it shows only variable names rather than labels. I believed -fsum- to be an enhanced -tabstat-, so can I do this in -fsum- with a simple option like by() in -tabstat-? The by() I mean is the option, not the bys prefix.
      I want a single panel, not the separate panels you illustrated. Thank you so much.
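
      For reference, a single-panel layout of this kind (variable labels down the rows, categories across the heading row) can be sketched with the built-in -table- command. This assumes Stata 17 or newer, which postdates this thread, and uses the auto dataset rather than the survey data discussed here. It does not use -fsum-.

```stata
* Sketch only: requires Stata 17 or newer; run from a do-file because of the /// line continuations.
* Uses the auto dataset shipped with Stata, not the data from this thread.
sysuse auto, clear

* Rows: one per variable, identified by its variable label.
* Columns: categories of foreign, with mean and SD nested under each category.
table (var) (foreign result), ///
    statistic(mean price mpg weight) ///
    statistic(sd price mpg weight) ///
    nformat(%9.2f)
```

      The row and column specifications are -table-'s own syntax, so -fsum-'s tlabel and "%" conventions do not carry over to it.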
