Diverging stacked bar chart

Uwe Russ

Join Date: Sep 2016

Posts: 2
#1

05 Sep 2016, 07:03

Hi everyone,

I want to display the frequencies of one categorical variable (attitude) over the categories of another (country) in a stacked bar chart. The attitude variable has 5 (Likert scale) categories. I have two specific aims, which I do not know how to accomplish using Stata:
1. The middle or neutral category of the attitude variable should be at the center of the graph over all the countries.
2. The stacked bars should be sorted according to the frequency of one (or more) categories of the attitude variable.

For an example of what I have in mind, see:
Figure 2 of this article http://www.amstat.org/sections/srms/...0784_64164.pdf by Naomi B. Robbins and Richard M. Heiberger (2011): Plotting Likert and Other Rating Scales

The first image under the section "Diverging Stacked Bar Charts" on this website: http://peltiertech.com/charting-survey-results/

Using Stata 12.1 I tried the following which did not exactly yield the desired results:

Code:

slideplot hbar attitude , by(country) percent neg(1 2 3) pos(4 5)

But slideplot cannot sort the bars. Also, you have to choose whether the middle category 3 goes to the left or to the right, or else leave it out completely.

Code:

tab attitude, gen(attitudeCat)
graph hbar attitudeCat1 attitudeCat2 attitudeCat3 attitudeCat4 attitudeCat5 ///
    , percent stack over(country, sort(5) descending)

With graph hbar I can at least sort the stacked bars according to one of the categories, but I still cannot center them around their middle category.

Does anyone have a suggestion of how to do this? Any help is much appreciated!

Kind regards,
Uwe

I produced a tiny example dataset in case you need it:

Code:

clear
input float attitude long country
4 2
4 2
5 2
4 2
5 2
2 2
5 2
4 2
5 2
4 2
5 2
1 2
1 2
1 2
.c 2
5 3
3 3
2 3
4 3
4 3
5 3
.c 3
5 3
5 3
5 3
4 3
1 3
1 3
1 3
1 3
.c 4
3 4
4 4
5 4
3 4
5 4
4 4
4 4
5 4
4 4
.c 4
5 5
5 5
3 5
5 5
4 5
3 5
4 5
2 5
4 5
5 5
4 5
5 5
2 5
4 5
4 5
5 5
4 5
5 5
3 5
5 5
4 5
.c 5
1 6
4 6
5 6
4 6
3 6
4 6
4 6
5 6
4 6
2 6
4 6
4 6
2 6
1 6
5 6
2 6
2 6
5 6
4 6
4 6
4 6
4 7
4 7
5 7
1 7
4 7
3 7
4 7
5 7
1 7
1 7
5 7
5 7
5 7
5 7
5 7
5 7
5 7
5 7
5 7
4 7
5 7
2 7
4 7
end
label values country country
label def country 2 "A", modify
label def country 3 "B", modify
label def country 4 "C", modify
label def country 5 "D", modify
label def country 6 "E", modify
label def country 7 "F", modify
Co Ar

Join Date: Sep 2016

Posts: 83
#2

05 Sep 2016, 07:38

I think this isn't the best, but it is in the right direction.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35693
#3

05 Sep 2016, 08:08

slideplot is from SSC, as you are asked to explain (FAQ Advice #12).

This is just a flag that writing a similar command but based on twoway bar not graph bar is on my agenda, but my time scale is at best months.
Wanting a middle category to straddle the axis is a strong reason for a rewrite.

When I get to it is more than usually subject to caprice, as it depends on my own work stumbling on an example where that kind of graph seems right.
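As a stopgap, here is a minimal sketch of how twoway rbar could draw bars that straddle zero at the neutral category. It is not the planned command. The toy frequencies and the names n, total, pct, left, right, shift, phigh and order are assumptions made for illustration. With the example data from #1 in memory, the input block could be replaced by contract country attitude if attitude < ., freq(n).

```stata
* Toy frequencies, made up for illustration: one row per observed cell
clear
input byte(country attitude) int n
1 1 3
1 2 1
1 4 5
1 5 5
2 1 4
2 2 1
2 3 1
2 4 3
2 5 5
3 3 2
3 4 4
3 5 3
end
label define country 1 "A" 2 "B" 3 "C"
label values country country

* Add the empty cells so that every country has rows for categories 1 to 5
fillin country attitude
replace n = 0 if _fillin

* Percent within country
bysort country: egen total = total(n)
gen pct = 100 * n / total

* Bar ends: cumulate from category 1 upwards, then shift each country
* so that the midpoint of category 3 sits at zero
bysort country (attitude): gen right = sum(pct)
gen left = right - pct
bysort country (attitude): gen shift = left[3] + pct[3] / 2
replace left = left - shift
replace right = right - shift

* Sort countries on the percent in categories 4 and 5; country breaks ties
egen phigh = total(pct * inlist(attitude, 4, 5)), by(country)
egen order = group(phigh country)

* Build axis labels for order from the value labels of country
local ylab
levelsof order, local(levels)
foreach l of local levels {
    summarize country if order == `l', meanonly
    local ylab `ylab' `l' "`: label (country) `r(min)''"
}

twoway (rbar left right order if attitude == 1, horizontal barwidth(0.8)) ///
       (rbar left right order if attitude == 2, horizontal barwidth(0.8)) ///
       (rbar left right order if attitude == 3, horizontal barwidth(0.8)) ///
       (rbar left right order if attitude == 4, horizontal barwidth(0.8)) ///
       (rbar left right order if attitude == 5, horizontal barwidth(0.8)), ///
       ylabel(`ylab', angle(horizontal) noticks) ytitle("") ///
       xline(0) xtitle("Percent, centered on the neutral category") ///
       legend(order(1 "1" 2 "2" 3 "3" 4 "4" 5 "5") rows(1))
```

The country with the highest percent of 4s and 5s ends up at the top, because twoway plots larger values of order higher up. The axis shows negative numbers to the left of zero; those would need relabelling by hand with xlabel() if they are to read as percents.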

The variables mentioned in your code are not those in your sample data. No matter, as the structure in the latter is easier to work with.

The remainder of this is irrelevant to you if you really want that specific graph, but may still be of interest to others.

With your data example, I tried tabplot (SJ)

Stata Journal subscribers can see http://www.stata-journal.com/article...article=gr0066

Others can pay USD 11.75 (no royalties go to me!) or see a fairly detailed write-up at http://www.statalist.org/forums/foru...updated-on-ssc but they can and should install the program from the Stata Journal files if interested.

Code:

tabplot attitude country, showval percent(country) yasis subtitle(country distributions)

Let's say that we want to sort on the proportion of 4s and 5s. This is easy with a trick or two:

Code:

gen high = inlist(attitude, 4, 5) if attitude < .
egen phigh = mean(high), by(country)
egen order = group(phigh country)
* search labmask to find files and then install
labmask order, values(country) decode
tabplot attitude order, showval percent(country) yasis subtitle(country distributions) xtitle("")

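We first create an indicator for being 4 or 5; the if attitude < . qualifier leaves the indicator missing wherever attitude is missing, so those observations do not count as zeros. Then we get the proportion of such values by averaging the indicator within each country. Then we order the countries on that measure. In case two countries tie on the proportion, including country in group() breaks the tie.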

The one devious trick is to get the values (in fact the value labels!) of country to be the value labels of the new ordering variable. For that labmask (SJ) is a lazy work-around.
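For anyone who would rather not install labmask, the same copying of value labels can be sketched with official commands only. This is an illustration under assumptions: the three-country toy data and the phigh values are made up, and the loop relies on each value of order corresponding to exactly one country, which group(phigh country) guarantees.

```stata
* Toy data, made up for illustration: one row per country
clear
input byte country float phigh
2 .71
3 .57
4 .78
end
label define country 2 "A" 3 "B" 4 "C"
label values country country

* Rank countries on phigh; country breaks ties
egen order = group(phigh country)

* For each value of order, look up the matching country
* and copy its value label across
levelsof order, local(levels)
foreach l of local levels {
    summarize country if order == `l', meanonly
    label define order `l' "`: label (country) `r(min)''", modify
}
label values order order

list country order, noobs
```

After this, order carries the labels B, A, C for values 1, 2, 3, so it can be used wherever the labmask version would be.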

Last edited by Nick Cox; 05 Sep 2016, 08:31.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35693
#4

06 Sep 2016, 02:54

"The variables mentioned in your code are not those in your sample data."

Incorrect. Sorry about that.
Comment
Uwe Russ

Join Date: Sep 2016

Posts: 2
#5

06 Sep 2016, 04:28

Thanks for the swift answer! I am looking forward to your new command. In the meantime I'll stick to "conventional" stacked bar graphs.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35693
#6

06 Sep 2016, 06:10

On the "small world" principle, I note that Naomi and Richard, in the paper you cite in #1, acknowledge my sending them references to earlier uses of the graph you want. There is also a much longer paper at https://www.jstatsoft.org/article/vi...i05/v57i05.pdf

The earliest reference I know for the graph you want is from 1939, and the earliest I know for the graph in #3 is from 1933. So which is "conventional" is up for grabs.

I will just note that rare categories are hard to spot on any stacked design, especially those with zero frequency.
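One way to see which cells are empty before choosing a design is contract with its zero option, which keeps combinations that never occur. A small sketch, with toy observations made up for illustration and n as an assumed name for the frequency variable:

```stata
* Toy observations, made up for illustration
clear
input byte(country attitude)
1 1
1 2
1 4
1 5
2 3
2 4
2 5
end

* One row per combination of country and attitude, zeros included
contract country attitude, zero freq(n)

* Show the combinations that never occur
list country attitude if n == 0, sepby(country) noobs
```

With real data it may be worth dropping missing values of attitude first, since contract otherwise treats them as a category of their own.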