Multiple bars in a single chart

Peter Suntan

Join Date: Nov 2020

Posts: 10
#1

Multiple bars in a single chart

25 Nov 2020, 07:49

Hi all,
I have a very basic question that I cannot find an answer to. I have tried all sorts of options, from stacked bar charts to catplot, but can't seem to figure out how to make a chart that's very simple in Excel! The chart I am trying to create is something like the one attached here. Basically, it's survey data: each of the bars is a different variable that rates an organization's performance on that item from highly unsatisfactory (-2) to highly satisfactory (+2). Each row holds the ratings given by one respondent. I can do one bar at a time, but I would like all bars to be in the same chart for compact presentation. Is there any way to do this in Stata, or should I export the data to Excel for this? Basically, each of the slices in a bar would be the frequency of each category of responses within a variable (e.g., 10 highly unsatisfactory, 5 unsatisfactory, ... on var1, etc.)

Any help will be most appreciated. Thanks.

The data is as follows:
ID var1 var2 var3 var4 var5 var6
1 -2 1 0 -1 2 1
2 2 1 0 1 2 1
3 2 1 0 -1 2- 1
4 -2 1 0 -1 2- -1

Nick Cox

Join Date: Mar 2014

Posts: 35585
#2

25 Nov 2020, 17:21

There is an ugly solution of producing one bar chart for each variable and then using graph combine.
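
A minimal sketch of that graph combine route, added here for illustration and not part of the original post. The three-variable toy data are invented, and graph hbar is only one way to draw each panel. Each panel gets its own legend, and colours shift whenever a category is absent from a variable, which is much of what makes this approach ugly.

Code:

* toy data, invented for illustration only
clear
input ID var1 var2 var3
1 -2  1  0
2  2  1 -1
3  1  0  2
4 -2 -1  2
end

* one stacked percent bar per variable, held in memory without drawing
forval j = 1/3 {
    graph hbar (count) ID, over(var`j') asyvars percentages stack ///
        title("var`j'") name(g`j', replace) nodraw
}

* put the separate graphs in one column
graph combine g1 g2 g3, cols(1)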

A better solution is just to reshape long so that you then have just two variables. The graphs here look a little silly based on just 4 observations, but they could be a start. You would need to do a little work copying your variable labels to value labels of an aggregated variable, as otherwise they disappear during the reshape long. The code shows some technique for that.

The catplot (SSC) code gives a stacked chart broadly similar to what you asked for. It seems to me that your colours should be ordered. There is no doubt similar code with graph hbar (percent) or some such but I am less fluent in that corner of graph hbar than with catplot.
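
For readers who want to stay with official commands, here is a hedged sketch of one graph hbar equivalent. It is an illustration, not the code from this post: the toy data are invented, and it uses (count) with the percentages option rather than (percent), so that each bar sums to 100% within its variable.

Code:

* toy data, invented for illustration only
clear
input ID var1 var2 var3
1 -2  1  0
2  2  1 -1
3  1  0  2
4 -2 -1  2
end

reshape long var, i(ID) j(which)

* ratings become the stacked segments; one bar per original variable
graph hbar (count) ID, over(var) over(which) asyvars percentages stack ///
    legend(rows(1)) ytitle("percent")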

My bias is that stacked plots are oversold. The fact that the percents add to 100% is emphasised strongly by the design, but we know that. The price of stacking is a legend and difficulty reading off very small quantities (including zero) when they occur. I usually prefer what might be called a twoway bar chart although many other names are used too. There is more at https://www.statalist.org/forums/for...updated-on-ssc and http://www.stata-journal.com/article...article=gr0066

The offset of 0.17 in showval() is just the result of trial and error to get better alignment. For other datasets different adjustments might be needed.

Code:

clear
input ID var1 var2 var3 var4 var5 var6
1 -2 1 0 -1 2 1
2 2 1 0 1 2 1
3 2 1 0 -1 2- 1
4 -2 1 0 -1 2- -1
end

* invent silly variable labels: none in example
tokenize "frogs toads newts cats dogs horses"
forval j = 1/6 {
    label var var`j' "``j''"
}

describe

* save variable labels
forval j = 1/6 {
    local lbl`j' : var label var`j'
}

preserve
reshape long var, i(ID) j(which)

* saved variable labels become value labels
forval j = 1/6 {
    label def which `j' "`lbl`j''", modify
}
label val which which

* install from SSC
catplot var which , asyvars stack percent(which) legend(row(1)) ///
    bar(1, color(red*0.6)) bar(2, color(red*0.2)) bar(3, color(blue*0.2)) ///
    bar(4, color(blue*0.6)) bar(5, color(blue)) name(G1, replace)

* install from SJ
tabplot which var, horizontal separate(var) percent(which) subtitle(% for each variable) ///
    bar1(lc(red) fcolor(red*0.6)) bar2(lc(red) fcolor(red*0.2)) bar3(lc(blue) fcolor(blue*0.2)) ///
    bar4(lc(blue) fcolor(blue*0.6)) bar5(color(blue)) showval(offset(0.17) format(%2.0f)) xtitle(rating) ytitle("") name(G2, replace)

Last edited by Nick Cox; 25 Nov 2020, 17:24.
Peter Suntan

Join Date: Nov 2020

Posts: 10
#3

25 Nov 2020, 19:11

Dear Nick,
Thanks a million for your ultrafast and perfect response! I had already installed catplot and tabplot, but didn't realize I need to reshape long. That's where I was stuck. Thank you!!! A very happy Thanksgiving to you and yours!
Best,
Pete
Mary Atieno

Join Date: Jul 2021

Posts: 38
#4

22 Sep 2022, 02:12

Originally posted by Nick Cox in #2 (quoted in full, followed by two attachments).

Last edited by Mary Atieno; 22 Sep 2022, 02:36.
Nick Cox

Join Date: Mar 2014

Posts: 35585
#5

22 Sep 2022, 02:30

Please note our longstanding request not to attach .dta files but to use dataex to give example data. https://www.statalist.org/forums/help#stata
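
For anyone unfamiliar with dataex, a minimal sketch using the auto data shipped with Stata; the choice of variables and observations is arbitrary. As far as I know, dataex is part of official Stata from 15.1 (and 14.2); on older versions it can be installed with ssc install dataex.

Code:

sysuse auto, clear

* prints a copy-and-paste ready input block for the first 5 observations
dataex make price rep78 foreign in 1/5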

Otherwise the same advice applies. reshape long first.

Then consider using tabplot as above.

For this kind of data you have the extra option of a floating or sliding bar chart as implemented by floatplot from SSC. See e.g. https://www.statalist.org/forums/for...kert-variables
Mary Atieno

Join Date: Jul 2021

Posts: 38
#6

22 Sep 2022, 02:40

Hello Nick, how do I rescale the label sizes so that they are neat and legible, without having to open the Graph Editor every time I re-run the code? See the attached figure.
Attached file: sample-graph.gph
Mary Atieno

Join Date: Jul 2021

Posts: 38
#7

22 Sep 2022, 02:50

Thank you, Nick, and my apologies for this. The code worked once I adjusted some variables, with the small exception of the question I posted about rescaling the label sizes for the "key".
Nick Cox

Join Date: Mar 2014

Posts: 35585
#8

22 Sep 2022, 04:29

Again, the first link in #5 explains our longstanding request not to add .gph attachments, but to post .png.

I can't reconcile your sample graph and your posted data. Your data show 5 possible outcomes. Your graph shows 6. Your data show 5 outcomes occurring for question a, but your graph shows only 3 so far as I can see. And so on. Some but not all of the differences may stem from whether Don't knows are included.

No matter. You just can't show a horizontal legend with 6 lengthy names readably unless you put the legend on 2 or 3 rows, and even then by doing that you lose space that you need for showing the data. Or you could put a legend vertically on the right of the graph. This is standard detail with a legend() option: see its help.
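
A hedged sketch of those legend() choices, using the auto data shipped with Stata rather than the data in this thread. The size(small) suboption is one way to shrink the legend text from code, so the Graph Editor is not needed on each run; the particular sizes and row counts are only starting points.

Code:

sysuse auto, clear

* legend on two rows with smaller text
graph hbar (count) price, over(rep78) over(foreign) asyvars percentages stack ///
    legend(rows(2) size(small)) name(L1, replace)

* legend as a single column to the right of the plot
graph hbar (count) price, over(rep78) over(foreign) asyvars percentages stack ///
    legend(position(3) cols(1) size(small)) name(L2, replace)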

A more radical solution is to use some scheme such as ? -- - 0 + ++ to indicate answers.

Equally, showing answers for 8 questions as in your sample graph is a squeeze.

As before, the standard stacked design is in my view a poor choice for this kind of data. Here is a token floatplot for your sample data. Adding text to the legend would increase the squeeze.

Code:

use sample-dataset.dta, clear
reshape long Ccami_, i(id2) j(Question) string
replace Question = substr(Question, 1, 1)
set scheme s1color
floatplot Ccami_ if Q <= "h", over(Q) centre(3) fcolors(red red*0.5 gs12 blue*0.5 blue) vertical ytitle(Answer) legend(symxsize(small))
Mary Atieno

Join Date: Jul 2021

Posts: 38
#9

22 Sep 2022, 14:31

I am still using STATA version 15, which does not support floatplot. Are there any alternatives?
Nick Cox

Join Date: Mar 2014

Posts: 35585
#10

22 Sep 2022, 14:40

That’s a misunderstanding. floatplot is community-contributed and so not bundled with any version of Stata on installation. You must install it using

Code:

ssc install floatplot

it should work fine with Stata 15.

Correction: The code specifies version 17. At this time, I can’t recall why I did that. So, edit the code to say version 15. If there really is a reason why it needs 17, you will find out quickly….

Last edited by Nick Cox; 22 Sep 2022, 14:48.
Mary Atieno

Join Date: Jul 2021

Posts: 38
#11

22 Sep 2022, 14:51

Originally posted by Nick Cox in #8 (quoted in full, followed by one attachment).

Last edited by Mary Atieno; 22 Sep 2022, 14:53.
Mary Atieno

Join Date: Jul 2021

Posts: 38
#12

22 Sep 2022, 14:59

I used this code

replace Question = substr(Question, 1, 1)
set scheme s1color
floatplot Ccami_ if Q <= "h", over(Q) centre(3) fcolors(red red*0.5 gs12 blue*0.5 blue) vertical ytitle(Answer) legend(symxsize(small))

and got the error message
this is version 15.1 of Stata; it cannot run version 17.0 programs
You can purchase the latest version of Stata by visiting http://www.stata.com.
Nick Cox

Join Date: Mar 2014

Posts: 35585
#13

22 Sep 2022, 17:58

I already answered #12 in the Correction you cite.

The code specifies version 17. At this time, I can’t recall why I did that. So, edit the code to say version 15. If there really is a reason why it needs 17, you will find out quickly

So, use a text editor to change an early statement in floatplot.ado from

Code:

version 17

to

Code:

version 15

and then save the changed file. Then type

Code:

discard

to flush its command code from memory. Now try again.
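
If it is not obvious where floatplot.ado lives, the following sketch (an addition, not part of the original reply) uses official commands to locate and open it. It assumes floatplot is already installed. Note that reinstalling or updating floatplot later would overwrite the edited file.

Code:

* show where the installed file is
which floatplot

* store the full path in r(fn), then open the file in the Do-file Editor
findfile floatplot.ado
doedit "`r(fn)'"

* after changing version 17 to version 15 and saving the file:
discard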