  • Histogram with bin freq and % on yaxis

    Dear Statalist,

    I want to draw a histogram in Stata with percent on the y axis and the bin counts (frequencies) as bar labels. Is there an option to do this? The following code puts percents on the bar labels instead of counts.

    histogram ALIVE_CHILD, bin(4) percent addlabel ytitle(%) xtitle("") title(Alive Children)

    [Image: hist_stata.JPG]


    I want to create the graph as below generated in SAS:
    [Image: hist_sas.JPG]


    Please advise whether there is any way to generate the same graph in Stata.

    Thanks,
    Rasool Bux

  • #2
    Your display from SAS includes counts of -2 and -1 children, so the leftmost bin is really narrower.

    As your variable is discrete, presumably, getting counts and percents directly allows you to avoid histogram code and use the flexibility of graph twoway.

    Here is an analogue. Leaving a gap between bars for a counted variable I take as psychological and aesthetic rather than logical or mathematical.

    Code:
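    sysuse auto, clear
    * frequency of each distinct value, ignoring missing values
    bysort rep78 : gen freq = _N if rep78 < .
    * percent of the nonmissing total
    count if rep78 < .
    gen percent = 100 * freq/r(N)
    set scheme s1color
    * bars show percents; a scatter with invisible markers carries the counts as labels
    twoway bar percent rep78, barw(0.95) bfcolor(none) || scatter percent rep78, ms(none) mla(freq) mlabpos(12) mlabsize(medium) legend(off) ysc(r(. 46)) yla(, ang(h))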
    [Image: histogram.png]


    If you really want wider bins, so not just the original distinct values, you would need to arrange that yourself before you calculate counts and percents, but it is easy.

    Suppose you want bins of width 3, in your case necessarily starting at 0, so (0 1 2) (3 4 5) ....


    Code:
    clear
    set obs 1000
    set seed 2803
    gen ALIVE = rbinomial(20, 0.3)
    gen ALIVE_bin = 3 * ceil((ALIVE + 1)/3) - 1
    tab ALIVE ALIVE_bin
    
    
               |                       ALIVE_bin
         ALIVE |         2          5          8         11         14 |     Total
    -----------+-------------------------------------------------------+----------
             1 |         8          0          0          0          0 |         8
             2 |        22          0          0          0          0 |        22
             3 |         0         67          0          0          0 |        67
             4 |         0        130          0          0          0 |       130
             5 |         0        181          0          0          0 |       181
             6 |         0          0        181          0          0 |       181
             7 |         0          0        188          0          0 |       188
             8 |         0          0        103          0          0 |       103
             9 |         0          0          0         66          0 |        66
            10 |         0          0          0         40          0 |        40
            11 |         0          0          0         12          0 |        12
            12 |         0          0          0          0          1 |         1
            13 |         0          0          0          0          1 |         1
    -----------+-------------------------------------------------------+----------
         Total |        30        378        472        118          2 |     1,000

    Each bin is labelled here by its upper limit (2, 5, 8, ...). The principle extends to any other integer bin width. You may have access to a wider discussion at https://www.stata-journal.com/article.html?article=dm0095
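    The binned variable can then feed the same count-and-percent calculation as in the first example. What follows is a minimal sketch rather than a definitive recipe: the variables mid and tag, the bar width of 2.9 and the axis label text are illustrative assumptions, and the labels assume the values 0 to 14 seen in the table.

    ```stata
    clear
    set obs 1000
    set seed 2803
    gen ALIVE = rbinomial(20, 0.3)

    * bins of width 3, labelled by their upper limit: 2, 5, 8, ...
    gen ALIVE_bin = 3 * ceil((ALIVE + 1)/3) - 1

    * count and percent per bin (ALIVE has no missing values here)
    bysort ALIVE_bin : gen freq = _N
    gen percent = 100 * freq/_N

    * one observation per bin, so each bar and label is drawn once
    egen tag = tag(ALIVE_bin)

    * centre each bar on the middle value of its bin: 1, 4, 7, ...
    gen mid = ALIVE_bin - 1

    set scheme s1color
    twoway bar percent mid if tag, barw(2.9) bfcolor(none) || scatter percent mid if tag, ms(none) mla(freq) mlabpos(12) mlabsize(medium) legend(off) yla(, ang(h)) xla(1 "0-2" 4 "3-5" 7 "6-8" 10 "9-11" 13 "12-14") ytitle(Percent) xtitle("")
    ```

    The bar width is set just under the bin width of 3 so that a small gap remains between bars, mirroring barw(0.95) for unit-width bars.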



    • #3
      Thanks Mr. Cox,

      I want to display the counts per bin, not per category. I used the following code to generate the bin counts, but the numbers do not match the frequencies of the variable. Please advise.

      . tab ALIVE_CHILD

        Number of |
           living |
         children |      Freq.     Percent        Cum.
      ------------+-----------------------------------
                0 |        210       10.29       10.29
                1 |        608       29.79       40.08
                2 |        495       24.25       64.33
                3 |        283       13.87       78.20
                4 |        175        8.57       86.77
                5 |        124        6.08       92.85
                6 |         59        2.89       95.74
                7 |         45        2.20       97.94
                8 |         18        0.88       98.82
                9 |         13        0.64       99.46
               10 |          6        0.29       99.76
               11 |          2        0.10       99.85
               12 |          1        0.05       99.90
               13 |          1        0.05       99.95
               14 |          1        0.05      100.00
      ------------+-----------------------------------
            Total |      2,041      100.00

      gen child_bin = 4 * floor(ALIVE_CHILD /4)
      . ta child_bin

        child_bin |      Freq.     Percent        Cum.
      ------------+-----------------------------------
                0 |      1,596       78.20       78.20
                4 |        403       19.75       97.94
                8 |         39        1.91       99.85
               12 |          3        0.15      100.00
      ------------+-----------------------------------
            Total |      2,041      100.00

      twoway bar percent child_bin, barw(0.95) || scatter percent child_bin, ms(none) mla(freq) mlabpos(12) mlabsize(medium) legend(off) yla(, ang(h))

      [Image: graph.JPG]



      • #4
        I don't understand #3 at all. My code shows counts and yours does too. The counts 1596 403 ... on the Figure are precisely those in the table.
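        The agreement can also be checked without the raw data. As a minimal sketch, the frequencies from the tabulation in #3 are typed in by hand and totalled within bins of width 4; the variable names n and grand are illustrative assumptions. The bin totals should reproduce the 1,596, 403, 39 and 3 reported by ta child_bin.

        ```stata
        clear
        * frequencies copied from the tabulation of ALIVE_CHILD in #3
        input ALIVE_CHILD n
         0 210
         1 608
         2 495
         3 283
         4 175
         5 124
         6  59
         7  45
         8  18
         9  13
        10   6
        11   2
        12   1
        13   1
        14   1
        end

        * same binning rule as in #3: bins 0-3, 4-7, 8-11, 12-15
        gen child_bin = 4 * floor(ALIVE_CHILD/4)

        * total the frequencies within each bin and convert to percents
        collapse (sum) freq = n, by(child_bin)
        egen grand = total(freq)
        gen percent = 100 * freq/grand
        list child_bin freq percent, noobs
        ```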

        Not the question, but I can't imagine any good reason for binning those data. The original detail is interesting, informative and easily displayed.

        Incidentally, calling me Nick is fine and what I ask for.



        • #5
          Dear Mr. Nick,

          Thanks for the reply. I also don't understand, but when I run the following code in SAS, it generates the graph above. I am binning because I am running a loop over continuous variables to generate all the graphs with the same pattern. This is one example of them.

          proc sgplot data=_adamnn;
          histogram ALIVE_CHILD / nbins=4 scale=percent datalabel=count showbins FILLATTRS=(color=orange transparency=.5);
          xaxis display=(nolabel);
          yaxis label="%";
          run;

          Thanks,
          Rasool Bux



          • #6
            I have never used SAS and can't comment on anything it does. I think I answered your Stata question already in #2.
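            For the loop over continuous variables mentioned in #5, the approach of #2 could be wrapped along the following lines. This is only a sketch under stated assumptions: the auto data and the variables price, mpg and weight stand in for the real ones, the local macro nbins is hypothetical, and the bins are simply four equal-width intervals over the observed range, which need not match the bin boundaries SAS chooses.

            ```stata
            sysuse auto, clear
            set scheme s1color

            * hypothetical number of bins, for illustration only
            local nbins 4

            foreach v of varlist price mpg weight {
                quietly summarize `v', meanonly
                local lo = r(min)
                local w  = (r(max) - r(min)) / `nbins'

                * bin index 0, ..., nbins - 1; the maximum goes in the last bin
                tempvar bin mid freq percent tag
                gen `bin' = min(floor((`v' - `lo') / `w'), `nbins' - 1) if `v' < .
                gen `mid' = `lo' + (`bin' + 0.5) * `w'

                * count and percent per bin among nonmissing values
                bysort `bin' : gen `freq' = _N if `v' < .
                quietly count if `v' < .
                gen `percent' = 100 * `freq' / r(N)

                * one observation per bin, so each bar and label is drawn once
                egen `tag' = tag(`bin')

                local bw = 0.95 * `w'
                twoway bar `percent' `mid' if `tag', barw(`bw') bfcolor(none) || scatter `percent' `mid' if `tag', ms(none) mla(`freq') mlabpos(12) mlabsize(medium) legend(off) yla(, ang(h)) ytitle(%) xtitle("") title("`: variable label `v''") name(`v', replace)
            }
            ```

            Values that fall exactly on a bin boundary depend on floating-point division here, so the boundaries are worth checking with tabulate before trusting the labels.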
