
  • Labelling stacked frequency bars with percentages

    I have a stacked bar graph (I know, I know — government request, what can you do?) showing the absolute numbers that have passed and failed a test, by district and source. I have added bar labels with the frequencies in each group, but need also to add a label, outside the end of each stacked bar, giving the percent failure in each group.

    The data look like this:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int district float(source N num_fail num_pass prev_fail) str4 prev_str float(graph_order bar_end) str5 pct_label
    1 4 110 14  96 12.727273 "12.7" 1 112 "12.7%"
    2 1 172 17 155  9.883721 "9.9"  1 174 "9.9%"
    2 2  26  4  22 15.384615 "15.4" 1  28 "15.4%"
    2 3  10  1   9        10 "10.0" 1  12 "10.0%"
    3 1  69 16  53 23.188406 "23.2" 1  71 "23.2%"
    3 2  19  4  15  21.05263 "21.1" 1  21 "21.1%"
    3 3  14  2  12 14.285714 "14.3" 1  16 "14.3%"
    4 1 118  6 112  5.084746 "5.1"  1 120 "5.1%"
    4 2  23  1  22  4.347826 "4.3"  1  25 "4.3%"
    4 3  23  1  22  4.347826 "4.3"  1  25 "4.3%"
    end
    label values district district_ex
    label def district_ex 1 "D1", modify
    label def district_ex 2 "D2", modify
    label def district_ex 3 "D3", modify
    label def district_ex 4 "D4", modify
    label values source source_graph_id
    label def source_graph_id 1 "Chain1", modify
    label def source_graph_id 2 "Chain2", modify
    label def source_graph_id 3 "Chain3", modify
    label def source_graph_id 4 "Chain4", modify



    There are several posts on this topic, but none have resolved my problem. I did not adequately understand the advice on undocumented commands posted here: https://www.statalist.org/forums/for...-to-bar-graph. I also note the suggestion of using twoway, in the cited and similar threads, but have not found a way to do that with the variable breakdown I want.

    I created text labels out of the percentage variable (pct_label), then stored them in a local of the same name.
    I made variables that would (I think) give the coordinates at which I would like the labels to be placed, just outside the end of each bar, and stored them in locals x and y.
    I tried to use the text() option in the graph command to insert the percent labels, using the locals for the coordinates and the label variable. (I tried with and without double quotes around the local names.) In both cases, I get the error message "invalid point, graph_order bar_end" (the latter being the names of the variables in the x and y locals).

    The code is as follows:
    Code:
    bysort district source: gen graph_order = _n // generates var to use for labelling coordinates
    gen bar_end = N+2 // generates var to use for labelling coordinates
    
    tostring prev_fail, gen(prev_str) format(%3.1f) force
    gen pct_label = prev_str+"%"
    local pct_label pct_label
    local x graph_order
    local y bar_end
    
    graph hbar num_pass num_fail,  over(district, label(labsize(small))) over(source, label(labsize(vsmall) angle(90))) stack nofill bar(1, fcolor(green*.8)) bar(2, fcolor(orange)) blabel(bar, position(center) format(%3.0f) color(white) size(vsmall)) ///
    blabel(bar, position(center) format(%3.1f)) ///
    text(`x' `y' "`pct_label'")

    I wondered if some sort of looping of the text would help, but my further experiments are too messy to share. Grateful for any advice you could give.
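
    On the error itself: the locals x and y hold the variable names graph_order and bar_end, so text() receives names where it expects numeric coordinates. A minimal sketch, assuming a two-row toy dataset rather than the real data, of the difference between a macro that holds a name and one that holds a value (the macro names y1, lab1, yi and li are made up for illustration):

    ```stata
    clear
    input float(graph_order bar_end) str5 pct_label
    1 112 "12.7%"
    1 174 "9.9%"
    end

    local y bar_end                  // macro holds the variable NAME
    display "`y'"                    // shows the name, not a number

    local y1 = bar_end[1]            // macro holds the VALUE in observation 1
    local lab1 = pct_label[1]        // same for the string label
    display "`y1' `lab1'"

    * Looping over observations pulls out one coordinate and one label at a time
    forvalues i = 1/`=_N' {
        local yi = bar_end[`i']
        local li = pct_label[`i']
        display "obs `i': coordinate `yi', label `li'"
    }
    ```

    A loop like this could in principle assemble one text() option per bar, but the position of each bar along the categorical axis would still have to be supplied separately.
    
    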

    Grateful, too, for advice on how I might successfully use tabplot to show data broken down by district and source; I have used it successfully with the uncollapsed version of these data for either one, but not for both simultaneously.

    And while we're at it, is there a way of including bar labels only in one bar series (eg, the second series on a stacked bar, but not the first)?

    Many thanks

  • #2
    This is not ideal with graph bar, but you could append the failure rates to the bar labels, as in the code below, and add a note "failure rates in parentheses". The approach using twoway is involved, but do see #10 in the following thread: https://www.statalist.org/forums/for...s-to-the-right. If it is a one-time thing, a manual approach as shown in the linked thread may rank highly.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int district float(source N num_fail num_pass prev_fail) str4 prev_str float(graph_order bar_end) str5 pct_label
    1 4 110 14  96 12.727273 "12.7" 1 112 "12.7%"
    2 1 172 17 155  9.883721 "9.9"  1 174 "9.9%"
    2 2  26  4  22 15.384615 "15.4" 1  28 "15.4%"
    2 3  10  1   9        10 "10.0" 1  12 "10.0%"
    3 1  69 16  53 23.188406 "23.2" 1  71 "23.2%"
    3 2  19  4  15  21.05263 "21.1" 1  21 "21.1%"
    3 3  14  2  12 14.285714 "14.3" 1  16 "14.3%"
    4 1 118  6 112  5.084746 "5.1"  1 120 "5.1%"
    4 2  23  1  22  4.347826 "4.3"  1  25 "4.3%"
    4 3  23  1  22  4.347826 "4.3"  1  25 "4.3%"
    end
    label values district district_ex
    label def district_ex 1 "D1", modify
    label def district_ex 2 "D2", modify
    label def district_ex 3 "D3", modify
    label def district_ex 4 "D4", modify
    label values source source_graph_id
    label def source_graph_id 1 "Chain1", modify
    label def source_graph_id 2 "Chain2", modify
    label def source_graph_id 3 "Chain3", modify
    label def source_graph_id 4 "Chain4", modify
    
    
    
    
    set scheme s1color
    graph hbar num_pass num_fail,  over(district, label(labsize(small))) ///
    over(source, label(labsize(vsmall) angle(90))) stack nofill ///
    bar(1, fcolor(green*.8)) bar(2, fcolor(orange)) blabel(bar, position(center) ///
    format(%3.0f) color(black) size(vsmall)) blabel(bar, position(base) format(%3.1f)) ///
    ylab(, nogrid) leg(off)
    
    * Undocumented: the bar labels of the current graph are held in an array
    local bars=`.Graph.plotregion1.barlabels.arrnels'
    * The loop treats each even-numbered label as num_fail and the one before it as num_pass
    forval i=2(2)`bars' {
        local j= `i'-1
        local fail `.Graph.plotregion1.barlabels[`i'].text[1]'
        local pass `.Graph.plotregion1.barlabels[`j'].text[1]'
        di "`fail'"
        * percent failing, appended in parentheses to the num_fail label
        local pct : di %3.2f 100*`fail'/(`fail'+`pass')
        .Graph.plotregion1.barlabels[`i'].text[1]="`fail'        (`pct' %)"
    }
    .Graph.drawgraph
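
    For comparison, here is a minimal sketch of what a twoway version might look like. It is not the code from the linked thread. The variable pos, the combined axis labels and the styling choices are assumptions made for illustration. The idea is to draw the total N as an orange bar, overlay num_pass in green so that the visible orange remainder is the failures, and place pct_label just past the end of each bar with a symbol-less scatter.

    ```stata
    clear
    input int district float(source N num_fail num_pass) str5 pct_label
    1 4 110 14  96 "12.7%"
    2 1 172 17 155 "9.9%"
    2 2  26  4  22 "15.4%"
    2 3  10  1   9 "10.0%"
    3 1  69 16  53 "23.2%"
    3 2  19  4  15 "21.1%"
    3 3  14  2  12 "14.3%"
    4 1 118  6 112 "5.1%"
    4 2  23  1  22 "4.3%"
    4 3  23  1  22 "4.3%"
    end
    label def district_ex 1 "D1" 2 "D2" 3 "D3" 4 "D4", modify
    label values district district_ex
    label def source_graph_id 1 "Chain1" 2 "Chain2" 3 "Chain3" 4 "Chain4", modify
    label values source source_graph_id

    * one row per bar; pos (hypothetical name) is its position on the vertical axis
    sort source district
    gen pos = _n
    forvalues i = 1/`=_N' {
        local d : label (district) `=district[`i']'
        local s : label (source) `=source[`i']'
        label def pos `i' "`s' `d'", modify
    }
    label values pos pos
    gen bar_end = N + 2              // label position just past the bar end

    twoway (bar N pos, horizontal barwidth(0.8) fcolor(orange) lcolor(none)) ///
           (bar num_pass pos, horizontal barwidth(0.8) fcolor(green*.8) lcolor(none)) ///
           (scatter pos bar_end, msymbol(none) mlabel(pct_label) mlabposition(3) mlabcolor(black)), ///
           ylabel(1/10, valuelabel angle(0) noticks nogrid) yscale(reverse) ///
           xscale(range(0 195)) ytitle("") xtitle("Number tested") ///
           legend(order(2 "Pass" 1 "Fail"))
    ```

    Because the labels are ordinary marker labels, their placement follows the data and needs no manual coordinates.
    
    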
    [Attached image: Graph.png]


    • #3
      As tabplot (from the Stata Journal) has been mentioned, I had a go at using it.

      Code:
      * Elizabeth's code with repetitions and errors commented out 
      clear
      input int district float(source N num_fail num_pass prev_fail) str4 prev_str float(graph_order bar_end) str5 pct_label
      1 4 110 14  96 12.727273 "12.7" 1 112 "12.7%"
      2 1 172 17 155  9.883721 "9.9"  1 174 "9.9%"
      2 2  26  4  22 15.384615 "15.4" 1  28 "15.4%"
      2 3  10  1   9        10 "10.0" 1  12 "10.0%"
      3 1  69 16  53 23.188406 "23.2" 1  71 "23.2%"
      3 2  19  4  15  21.05263 "21.1" 1  21 "21.1%"
      3 3  14  2  12 14.285714 "14.3" 1  16 "14.3%"
      4 1 118  6 112  5.084746 "5.1"  1 120 "5.1%"
      4 2  23  1  22  4.347826 "4.3"  1  25 "4.3%"
      4 3  23  1  22  4.347826 "4.3"  1  25 "4.3%"
      end
      label values district district_ex
      label def district_ex 1 "D1", modify
      label def district_ex 2 "D2", modify
      label def district_ex 3 "D3", modify
      label def district_ex 4 "D4", modify
      label values source source_graph_id
      label def source_graph_id 1 "Chain1", modify
      label def source_graph_id 2 "Chain2", modify
      label def source_graph_id 3 "Chain3", modify
      label def source_graph_id 4 "Chain4", modify
      
      * bysort district source: gen graph_order = _n // generates var to use for labelling coordinates
      * gen bar_end = N+2 // generates var to use for labelling coordinates
      
      * tostring prev_fail, gen(prev_str) format(%3.1f) force
      * gen pct_label = prev_str+"%"
      local pct_label pct_label
      local x graph_order
      local y bar_end
      
      graph hbar num_pass num_fail,  over(district, label(labsize(small))) over(source, label(labsize(vsmall) angle(90))) stack nofill bar(1, fcolor(green*.8)) bar(2, fcolor(orange)) blabel(bar, position(center) format(%3.0f) color(white) size(vsmall)) ///
      blabel(bar, position(center) format(%3.1f)) name(EP, replace)
      
      * text(`x' `y' "`pct_label'")
      
      * NJC code starts here 
      drop prev_* bar_end 
      expand 2, generate(which)
      label def which 0 fail 1 pass 
      label val which which 
      bysort source district (which) : gen num = cond(_n==1, num_fail, num_pass)
      
      gen axis = sum(district != district[_n-1]) + sum(source != source[_n-1])
      
      * install from Stata Journal 
      labmask axis, values(district) decode 
      
      levelsof axis, clean 
      
      gen where = 2.2 
      
      * install from Stata Journal 
      tabplot which axis [fw=num], separate(which) bar2(color(green*0.8)) bar1(color(orange)) ///
      xasis xla(`r(levels)', valuelabel noticks) showval(num, offset(0.05)) xtitle("") legend(off) /// 
      xmla(3 "Chain1"  7 "Chain2"  11 "Chain3" 14  "Chain4", tlength(*5) tlc(none) labsize(medsmall)) ///
      xsc(r(1 15)) ytitle(Number passed and failed) yla(, ang(h)) ymla(2.2 "% fail", noticks labsize(medsmall) labc(orange) ang(h)) subtitle("") ysc(r(. 2.25)) ///
      addplot(scatter where axis, ms(none) mla(pct_label) mlabpos(0) mlabc(orange)) name(NJC1, replace)
      
      tabplot which axis [fw=num], separate(which) percent(axis) bar2(color(green*0.8)) bar1(color(orange)) ///
      xasis xla(`r(levels)', valuelabel noticks) showval(offset(0.05)) xtitle("") legend(off) /// 
      xmla(3 "Chain1"  7 "Chain2"  11 "Chain3" 14  "Chain4", tlength(*5) tlc(none) labsize(medsmall)) ///
      xsc(r(1 15)) ytitle(Percent passed and failed) yla(, ang(h)) subtitle("") ///
      name(NJC2, replace)
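
      To see how the axis variable in the code above leaves a gap between chains, here is a minimal sketch on the district and source columns alone. It skips the expand step, which does not change the result because duplicated rows share the same values.

      ```stata
      clear
      input int district float source
      1 4
      2 1
      2 2
      2 3
      3 1
      3 2
      3 3
      4 1
      4 2
      4 3
      end

      sort source district
      * each sum() is a running count of changes, so a change of source
      * adds one extra step on top of the change of district
      gen axis = sum(district != district[_n-1]) + sum(source != source[_n-1])
      list source district axis, sepby(source) noobs
      ```

      The axis values come out as 2 3 4, 6 7 8, 10 11 12 and 14, which is why the xmla() labels for the chains sit at 3, 7, 11 and 14.
      
      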
      [Attached images: EP_NJC1.png, EPC_NJC2.png]