Plotting a stack bar

Nader Mehri

Join Date: Jun 2019
Posts: 189

27 May 2023, 18:11

Hi statalisters,
For the dataset below, the variables PUWLE, PHWLE, POWLE, and POBLE are percentages representing proportions of TLE_01. I would like to create two separate bar plots, one for females and one for males. For each sex, I would like a stacked bar plot of these percentages by sample_.
Thanks,
Nader

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str18 sample_ str6 sex float(PUWLE PHWLE POWLE POBLE TLE_01)
"Non-Hispanic White" "Female" 2.2562733 32.15409 32.447655 33.141983  31.18791
"Black"              "Female"  1.338273 16.99335  30.90398   50.7644   28.9989
"Hispanic"           "Female" 1.5642548 24.18444 35.418617  38.83269 32.706757
"Non-Hispanic White" "Male"   2.2562733 32.15409 32.447655 33.141983  31.18791
"Black"              "Male"    1.338273 16.99335  30.90398   50.7644   28.9989
"Hispanic"           "Male"   1.5642548 24.18444 35.418617  38.83269 32.706757
end

Last edited by Nader Mehri; 27 May 2023, 18:18.

Tags: graph, graphics

Nick Cox

Join Date: Mar 2014
Posts: 35782

28 May 2023, 03:35

What you are asking for is a standard example of graph bar, or so I guess. I throw in here using graph hbar as well, and indeed also tabplot from the Stata Journal.

The main idea of tabplot can be seen at https://www.statalist.org/forums/for...updated-on-ssc

but if you want to use it, it is best to download the latest (public) ado and help from the latest gr0066 shown by

Code:

. search gr0066, entry

Search of official help files, FAQs, Examples, and Stata Journals

SJ-22-2 gr0066_3  . . . . . . . . . . . . . . . .  Software update for tabplot
        (help tabplot if installed) . . . . . . . . . . . . . . . .  N. J. Cox
        Q2/22   SJ 22(2):467
        bug fixed; help file updated to include further references

SJ-20-3 gr0066_2  . . . . . . . . . . . . . . . .  Software update for tabplot
        (help tabplot if installed) . . . . . . . . . . . . . . . .  N. J. Cox
        Q3/20   SJ 20(3):757--758
        added new options frame() and frameopts() allowing framing
        of bars and so-called thermometer plots or charts

SJ-17-3 gr0066_1  . . . . . . . . . . . . . . . .  Software update for tabplot
        (help tabplot if installed) . . . . . . . . . . . . . . . .  N. J. Cox
        Q3/17   SJ 17(3):779
        added options for reversing axis scales; improved handling of
        axis labels containing quotation marks

SJ-16-2 gr0066  . . . . . .  Speaking Stata: Multiple bar charts in table form
        (help tabplot if installed) . . . . . . . . . . . . . . . .  N. J. Cox
        Q2/16   SJ 16(2):491--510
        provides multiple bar charts in table form representing
        contingency tables for one, two, or three categorical variables

so (at the time of writing) download from gr0066_3.
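
A minimal sketch of one way to do that download from within Stata, assuming an internet connection. The package name gr0066_3 and the issue SJ 22(2) are taken from the search output above; the replace option is my addition so that an older copy is overwritten.

```stata
* point Stata at Stata Journal volume 22, issue 2 and describe package gr0066_3
net sj 22-2 gr0066_3

* install the ado and help files, overwriting any older tabplot
net install gr0066_3, replace

* confirm which tabplot.ado is now found on the adopath
which tabplot
```

Clicking through the links shown by search gr0066, entry should have the same effect.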

The leading ideas of tabplot as compared with stacking bars that add to 100% are

* You can tell the reader once that the data are percents and add to 100 within certain groups, and that should be easy to grasp. If not, you need a smarter reader. You don't need to reinforce that notion graphically.

* You can lose the legend (kill the key) and cut down on mental back and forth.

* You can optionally show the percents themselves. You can do that with a stacked bar chart, but the percents are not so easy to read.

* There is no extra strain if some components are very small or even zero. That is easy to spot and think about.

Here is some code.

Code:

set scheme stcolor 

clear
input str18 sample_ str6 sex float(PUWLE PHWLE POWLE POBLE TLE_01)
"Non-Hispanic White" "Female" 2.2562733 32.15409 32.447655 33.141983  31.18791
"Black"              "Female"  1.338273 16.99335  30.90398   50.7644   28.9989
"Hispanic"           "Female" 1.5642548 24.18444 35.418617  38.83269 32.706757
"Non-Hispanic White" "Male"   2.2562733 32.15409 32.447655 33.141983  31.18791
"Black"              "Male"    1.338273 16.99335  30.90398   50.7644   28.9989
"Hispanic"           "Male"   1.5642548 24.18444 35.418617  38.83269 32.706757
end

graph bar (asis) *LE , stack over(sex) over(sample_) name(G1, replace)

graph bar (asis) *LE , stack over(sample_) over(sex) name(G2, replace)

graph hbar (asis) *LE , stack over(sample_) over(sex) name(G3, replace)


graph hbar (asis) *LE , stack over(sex) over(sample_) name(G4, replace)


capture frame drop work 

frame put *, into(work)

frame work { 
    reshape long @LE , i(sample_ sex) j(which) string 
    list 
    tabplot which sample_ [iw=LE], by(sex, note("")) showval(format(%3.1f)) xtitle("") ytitle("") name(G5, replace)
    tabplot which sex [iw=LE], by(sample_, note("") row(1)) showval(format(%3.1f)) xtitle("") ytitle("") name(G6, replace)
    local opts separate(which) 
    tabplot which sample_ [iw=LE], `opts' by(sex, note("")) showval(format(%3.1f)) xtitle("") ytitle("") name(G7, replace)
    tabplot which sex [iw=LE], `opts' by(sample_, row(1) note("")) showval(format(%3.1f)) xtitle("") ytitle("") name(G8, replace)
}

The second graph is clearly worthless given the overlap of text labels. That could be fixed with some twiddling but I have not bothered because I think better graphs are on offer. I wanted to make a standard point that graph hbar often is preferable to graph bar.
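
For anyone who does want the twiddling, here is a minimal sketch, together with one way to get a separate panel for each sex as asked in #1. The graph names G2b and G2c and the particular angle and label size are assumptions for illustration, not part of the code above.

```stata
clear
input str18 sample_ str6 sex float(PUWLE PHWLE POWLE POBLE TLE_01)
"Non-Hispanic White" "Female" 2.2562733 32.15409 32.447655 33.141983  31.18791
"Black"              "Female"  1.338273 16.99335  30.90398   50.7644   28.9989
"Hispanic"           "Female" 1.5642548 24.18444 35.418617  38.83269 32.706757
"Non-Hispanic White" "Male"   2.2562733 32.15409 32.447655 33.141983  31.18791
"Black"              "Male"    1.338273 16.99335  30.90398   50.7644   28.9989
"Hispanic"           "Male"   1.5642548 24.18444 35.418617  38.83269 32.706757
end

* G2 again, with angled and smaller category labels to reduce the overlap
graph bar (asis) *LE, stack over(sample_, label(angle(45) labsize(small))) ///
    over(sex) name(G2b, replace)

* one panel per sex, horizontal bars so that the labels have room
graph hbar (asis) *LE, stack over(sample_) by(sex, note("")) name(G2c, replace)
```

The wildcard *LE picks up the four percent variables only, because TLE_01 does not end in LE.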

[Attached images: mehri_GG1.png, mehri_GG2.png, mehri_GG3.png, mehri_GG4.png]

Nick Cox

#3

28 May 2023, 03:42

The other graphs follow because the forum software limits the number of attachments per post.

Nick Cox

#4

28 May 2023, 03:45

The graphs above in #2 are out of order. Oh well.

All that said, in your data example the values for males and females appear identical for the same variables!

It's manifestly your project, not mine, but:

* Horizontal stacked bars can work better than vertical.

* I think the tabplot results work better than either stacked flavour. I guess that people use stacked bar charts because they have seen so many, as many people use pie charts despite their having been shot down with rational arguments as poor designs over a century or more.

* What you choose depends on which comparisons are more important. Minute comparisons are easiest between bars side by side. Side by side can be better or worse depending on which comparisons are more interesting or important.

* Mixing colours is essential for stacked designs and optional otherwise.
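
As a rough illustration of the side-by-side point, here is a sketch using the data example from #1 without the stack option. The graph names G9 and G10 are hypothetical.

```stata
clear
input str18 sample_ str6 sex float(PUWLE PHWLE POWLE POBLE TLE_01)
"Non-Hispanic White" "Female" 2.2562733 32.15409 32.447655 33.141983  31.18791
"Black"              "Female"  1.338273 16.99335  30.90398   50.7644   28.9989
"Hispanic"           "Female" 1.5642548 24.18444 35.418617  38.83269 32.706757
"Non-Hispanic White" "Male"   2.2562733 32.15409 32.447655 33.141983  31.18791
"Black"              "Male"    1.338273 16.99335  30.90398   50.7644   28.9989
"Hispanic"           "Male"   1.5642548 24.18444 35.418617  38.83269 32.706757
end

* no stack option: the four percents in each group are drawn as adjacent bars
graph hbar (asis) *LE, over(sample_) over(sex) name(G9, replace)

* swapping the over() order changes which groups end up next to each other
graph hbar (asis) *LE, over(sex) over(sample_) name(G10, replace)
```

Which order is better depends, as said, on which comparison matters more.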

This thread is more or less a repeat of your earlier thread

https://www.statalist.org/forums/for...-xlabel-s-size

in which the main point was that stacked bars are awkward at best, lousy at worst, and better not used when there are superior choices.

Last edited by Nick Cox; 28 May 2023, 04:02.

Nader Mehri

28 May 2023, 07:11

Thank you so much for your helpful response. Based on your suggestions and my earlier thread, I have created the plot below using the code below. I wonder how my plot can be modified by 1) changing the angle and the font of the x-axis labels so that they do not overlap and 2) removing the background color of the bars so that the plot can be printed on a black-and-white printer.
Nader

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str1 gender byte(category_num which age) str1 sample_ float(percent TLE) str2 profile float(PUWLE PHWLE POWLE POBLE) str3 category
"F" 0 1 50 "0"  .7100313 31.218945 "F0" 2.3 32.3 32.4   33 "F01"
"F" 0 2 50 "0"  10.09634 31.218945 "F0" 2.3 32.3 32.4   33 "F02"
"F" 0 3 50 "0"  10.11646 31.218945 "F0" 2.3 32.3 32.4   33 "F03"
"F" 0 4 50 "0" 10.296116 31.218945 "F0" 2.3 32.3 32.4   33 "F04"
"F" 1 1 50 "1"   .394898 29.063833 "F1" 1.4 17.2 31.1 50.4 "F11"
"F" 1 2 50 "1"   4.99093 29.063833 "F1" 1.4 17.2 31.1 50.4 "F12"
"F" 1 3 50 "1"  9.044346 29.063833 "F1" 1.4 17.2 31.1 50.4 "F13"
"F" 1 4 50 "1" 14.633658 29.063833 "F1" 1.4 17.2 31.1 50.4 "F14"
"F" 2 1 50 "2"  .4906612 32.613693 "F2" 1.5 23.5 35.3 39.7 "F21"
"F" 2 2 50 "2"  7.664418 32.613693 "F2" 1.5 23.5 35.3 39.7 "F22"
"F" 2 3 50 "2" 11.499845 32.613693 "F2" 1.5 23.5 35.3 39.7 "F23"
"F" 2 4 50 "2"  12.95877 32.613693 "F2" 1.5 23.5 35.3 39.7 "F24"
"F" 3 1 50 "3"  .7716495  31.10333 "F3" 2.5 34.2 31.7 31.6 "F31"
"F" 3 2 50 "3" 10.639664  31.10333 "F3" 2.5 34.2 31.7 31.6 "F32"
"F" 3 3 50 "3"  9.847922  31.10333 "F3" 2.5 34.2 31.7 31.6 "F33"
"F" 3 4 50 "3"  9.844094  31.10333 "F3" 2.5 34.2 31.7 31.6 "F34"
end
label values category_num category_num
label def category_num 0 "non-Hispanic White", modify
label def category_num 1 "non-Hispanic Black", modify
label def category_num 2 "Hispanic", modify
label def category_num 3 "non-Hispanic other", modify
label values which which
label def which 1 "Underweight", modify
label def which 2 "Healthy Weight", modify
label def which 3 "Overweight", modify
label def which 4 "Obes", modify

[Attachment: jj.pdf]

Nick Cox

#6

28 May 2023, 08:34

Image attachments should please be shown as .png: see section 12.4 of the FAQ at https://www.statalist.org/forums/help#stata

Nader Mehri

#7

28 May 2023, 09:03

Sorry about that! Please see the plot in the .png format as follows:

Last edited by Nader Mehri; 28 May 2023, 09:11.

Nick Cox

#8

28 May 2023, 09:30

I would just fix the value labels as the main problem is the longer label for 2 and you have enough space to correct the spelling for 4.

Code:

label def which 1 "Underweight", modify
label def which 2 "Healthy", modify
label def which 3 "Overweight", modify
label def which 4 "Obese", modify

If this is destined for a black and white printer, you should work throughout with a scheme such as s1mono.

The showval() option has an offset() suboption to move the text. Here the text needs to move up a little.
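
A minimal sketch of both suggestions, using a cut-down version of the data example from #1 so that it runs on its own. The offset of 0.15 is only a starting guess to be adjusted by eye, and the graph name G_mono is hypothetical. tabplot must be installed.

```stata
clear
input str18 sample_ float(PUWLE PHWLE POWLE POBLE)
"Non-Hispanic White" 2.2562733 32.15409 32.447655 33.141983
"Black"               1.338273 16.99335  30.90398   50.7644
"Hispanic"           1.5642548 24.18444 35.418617  38.83269
end

* long layout: one row per sample_ and weight category, percents in LE
reshape long @LE, i(sample_) j(which) string

* monochrome scheme for everything drawn from here on
set scheme s1mono

* offset() shifts the displayed values; tune the number until the text sits clear of the bars
tabplot which sample_ [iw=LE], showval(offset(0.15) format(%3.1f)) ///
    xtitle("") ytitle("") name(G_mono, replace)
```

Adding scheme(s1mono) to an individual graph command is an alternative to set scheme if only some graphs need to be monochrome.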

Nader Mehri

#9

28 May 2023, 12:14

Thanks! I wonder how the bars for each race category could be changed to the same color; i.e., how the color for bars for non-Hispanic Whites can be changed all to blue, while the colors for bars for non-Hispanic Black can be changed to green, etc.

Nick Cox

#10

28 May 2023, 13:58

What happened to the black-and-white printer?

Nader Mehri

#11

28 May 2023, 17:09

Well, I need this for a PowerPoint presentation!

Nick Cox

#12

29 May 2023, 01:05

It is just a different separate() call: separate(category_num) with bar1() ... bar4() to override default colours if you wish.
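
A sketch of what that call might look like. The data are the rounded percents from the example in #5, retyped in long form; blue and green are the colours asked for in #9, while orange and red and the graph name G_race are assumptions for illustration. tabplot must be installed.

```stata
clear
input byte(category_num which) float percent
0 1  2.3
0 2 32.3
0 3 32.4
0 4 33
1 1  1.4
1 2 17.2
1 3 31.1
1 4 50.4
2 1  1.5
2 2 23.5
2 3 35.3
2 4 39.7
3 1  2.5
3 2 34.2
3 3 31.7
3 4 31.6
end
label def category_num 0 "non-Hispanic White" 1 "non-Hispanic Black" 2 "Hispanic" 3 "non-Hispanic other"
label values category_num category_num
label def which 1 "Underweight" 2 "Healthy" 3 "Overweight" 4 "Obese"
label values which which

* separate() by race category, then bar1() ... bar4() set one colour per category
tabplot which category_num [iw=percent], separate(category_num) ///
    bar1(bcolor(blue)) bar2(bcolor(green)) bar3(bcolor(orange)) bar4(bcolor(red)) ///
    showval(format(%3.1f)) xtitle("") ytitle("") name(G_race, replace)
```

Note that barall() applies the same options to every bar, so it is not the place to put colours that should differ between categories.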

Nader Mehri

#13

29 May 2023, 21:15

Thanks for your solution. I have tried the following code and got the plot below. Assigning a color to each bar is a little tedious, particularly when dealing with several plots with multiple bars. I wonder if there is any way to assign one color to bars 1 to 6, another color to bars 7 to 12, and so on.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str6 gender byte(category_num which age) float(Underweight Healthy Overweight Obese TLE) str3 profile byte(birth_place race) float percent str4 category str5 toshow
"Female"  1 1 50 .7  9.9   10 10.5   31 "F01" 1 0  2.2 "F11"  "2.2%" 
"Female"  1 2 50 .7  9.9   10 10.5   31 "F01" 1 0 31.7 "F12"  "31.7%"
"Female"  1 3 50 .7  9.9   10 10.5   31 "F01" 1 0 32.3 "F13"  "32.3%"
"Female"  1 4 50 .7  9.9   10 10.5   31 "F01" 1 0 33.8 "F14"  "33.8%"
"Female"  2 1 50 .9 12.3 10.9  9.2 33.3 "F02" 2 0  2.7 "F21"  "2.7%" 
"Female"  2 2 50 .9 12.3 10.9  9.2 33.3 "F02" 2 0   37 "F22"  "37.0%"
"Female"  2 3 50 .9 12.3 10.9  9.2 33.3 "F02" 2 0 32.7 "F23"  "32.7%"
"Female"  2 4 50 .9 12.3 10.9  9.2 33.3 "F02" 2 0 27.7 "F24"  "27.7%"
"Female" 11 1 50 .4  4.7  8.8 14.6 28.5 "F11" 1 1  1.4 "F111" "1.4%" 
"Female" 11 2 50 .4  4.7  8.8 14.6 28.5 "F11" 1 1 16.3 "F112" "16.3%"
"Female" 11 3 50 .4  4.7  8.8 14.6 28.5 "F11" 1 1   31 "F113" "31.0%"
"Female" 11 4 50 .4  4.7  8.8 14.6 28.5 "F11" 1 1 51.3 "F114" "51.3%"
"Female" 12 1 50 .6  6.3 10.2 13.7 30.8 "F12" 2 1  1.8 "F121" "1.8%" 
"Female" 12 2 50 .6  6.3 10.2 13.7 30.8 "F12" 2 1 20.5 "F122" "20.5%"
"Female" 12 3 50 .6  6.3 10.2 13.7 30.8 "F12" 2 1 33.1 "F123" "33.1%"
"Female" 12 4 50 .6  6.3 10.2 13.7 30.8 "F12" 2 1 44.6 "F124" "44.6%"
"Female" 21 1 50 .5  7.6 11.3 12.9 32.4 "F21" 1 2  1.5 "F211" "1.5%" 
"Female" 21 2 50 .5  7.6 11.3 12.9 32.4 "F21" 1 2 23.6 "F212" "23.6%"
"Female" 21 3 50 .5  7.6 11.3 12.9 32.4 "F21" 1 2   35 "F213" "35.0%"
"Female" 21 4 50 .5  7.6 11.3 12.9 32.4 "F21" 1 2 39.9 "F214" "39.9%"
"Female" 22 1 50 .7  9.7 12.5 11.7 34.5 "F22" 2 2  1.9 "F221" "1.9%" 
"Female" 22 2 50 .7  9.7 12.5 11.7 34.5 "F22" 2 2 28.1 "F222" "28.1%"
"Female" 22 3 50 .7  9.7 12.5 11.7 34.5 "F22" 2 2 36.1 "F223" "36.1%"
"Female" 22 4 50 .7  9.7 12.5 11.7 34.5 "F22" 2 2 33.9 "F224" "33.9%"
"Female" 31 1 50 .8 10.4  9.7  9.9 30.7 "F31" 1 3  2.5 "F311" "2.5%" 
"Female" 31 2 50 .8 10.4  9.7  9.9 30.7 "F31" 1 3 33.8 "F312" "33.8%"
"Female" 31 3 50 .8 10.4  9.7  9.9 30.7 "F31" 1 3 31.6 "F313" "31.6%"
"Female" 31 4 50 .8 10.4  9.7  9.9 30.7 "F31" 1 3 32.1 "F314" "32.1%"
"Female" 32 1 50  1   13 10.6  8.6 33.2 "F32" 2 3    3 "F321" "3.0%" 
"Female" 32 2 50  1   13 10.6  8.6 33.2 "F32" 2 3 39.3 "F322" "39.3%"
"Female" 32 3 50  1   13 10.6  8.6 33.2 "F32" 2 3 31.8 "F323" "31.8%"
"Female" 32 4 50  1   13 10.6  8.6 33.2 "F32" 2 3 25.9 "F324" "25.9%"
end
label values category_num sample_
label def sample_ 1 "US-born non-Hispanic White", modify
label def sample_ 2 "Foreign-born non-Hispanic White", modify
label def sample_ 11 "US-born non-Hispanic Black", modify
label def sample_ 12 "Foreign-born non-Hispanic Black", modify
label def sample_ 21 "US-born Hispanic", modify
label def sample_ 22 "Foreign-born Hispanic", modify
label def sample_ 31 "US-born non-Hispanic Other", modify
label def sample_ 32 "Foreign-born non-Hispanic Other", modify
label values which which
label def which 1 "Underweight", modify
label def which 2 "Healthy", modify
label def which 3 "Overweight", modify
label def which 4 "Obese", modify
label values birth_place birth_place
label def birth_place 1 "US-born", modify
label def birth_place 2 "Foreign-born", modify
label values race race
label def race 0 "non-Hispanic White", modify
label def race 1 "non-Hispanic Black", modify
label def race 2 "Hispanic", modify
label def race 3 "non-Hispanic Other", modify

tabplot category_num which [iw=percent], height(0.63) separate(category_num) ///
    by(gender, note("")) barall(bcolor(blue*0.5)) showval(toshow, offset(0.15)) ///
    horizontal ytitle("") xtitle("") xscale(r(0.7 1 3.4)) subtitle(, fcolor(none)) ///
    name(G1, replace) xlabel(, labsize(small))

[Attached image: G1.png]

Nick Cox

#14

30 May 2023, 01:23

It is the same question. Create a variable with different values for 1-6 and 7-12 and so on and then feed it to separate().
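
A minimal sketch of that idea, using the first sixteen observations of the data example in #13. The variable name colgroup, the colours, and the graph name G_grp are hypothetical; the rule used here relies on the coding of category_num in that example, where the tens digit identifies the race group. tabplot must be installed.

```stata
clear
input byte(category_num which) float percent
 1 1  2.2
 1 2 31.7
 1 3 32.3
 1 4 33.8
 2 1  2.7
 2 2 37
 2 3 32.7
 2 4 27.7
11 1  1.4
11 2 16.3
11 3 31
11 4 51.3
12 1  1.8
12 2 20.5
12 3 33.1
12 4 44.6
end

* codes 1 and 2 map to group 0, codes 11 and 12 map to group 1
generate byte colgroup = floor(category_num / 10)

* one colour per group rather than one per bar
tabplot category_num which [iw=percent], separate(colgroup) horizontal ///
    bar1(bcolor(blue)) bar2(bcolor(green)) ///
    showval(format(%3.1f)) xtitle("") ytitle("") name(G_grp, replace)
```

In the full example the existing variable race already carries this grouping, so separate(race) should do the same job without a new variable.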