Formatting Bar graph

Olubunmi Adebiyi

Join Date: Apr 2022
Posts: 12


09 May 2023, 10:23

Hi everyone,
Is there any way to make this bar graph more readable? 'A', 'B', and 'C' have small percentages while D has a very high percentage, making A, B, and C too small to read.
Is there any way to drop bar 'D' and expand the scale on the y-axis, since I am primarily interested in the trend of A, B, and C? Or is there another approach you would advise?
Thank you in advance for your response.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(state1 TrendData America year)
. 1 4 2
. 1 4 2
. 1 4 2
. 1 2 2
. 1 3 2
. 1 4 2
. 1 . 2
. 1 4 2
. 1 . 2
. 1 . 2
. 1 . 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 . 2
. 1 4 2
. 1 . 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 . 2
. 1 . 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 . 2
. 1 . 2
. 1 4 2
. 1 1 2
. 1 4 2
. 1 4 2
. 1 1 2
. 1 3 2
. 1 4 2
. 1 4 2
. 1 . 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 4 2
1 1 4 2
. 1 4 2
. 1 4 2
. 1 . 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 . 2
. 1 4 2
. 1 4 2
1 1 4 2
. 1 4 2
. 1 4 2
. 1 . 2
. 1 4 2
. 1 4 2
. 1 4 2
1 1 . 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 . 2
. 1 4 2
. 1 4 2
. 1 . 2
. 1 4 2
. 1 . 2
. 1 4 2
. 1 . 2
. 1 4 2
. 1 . 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 3 2
. 1 4 2
. 1 4 2
. 1 . 2
. 1 4 2
. 1 4 2
. 1 4 2
. 1 1 2
. 1 4 2
. 1 3 2
. 1 4 2
. 1 . 2
. 1 4 2
end
label values America DevelopmentalCategories
label def DevelopmentalCategories 1 "A", modify
label def DevelopmentalCategories 2 "B", modify
label def DevelopmentalCategories 3 "C", modify
label def DevelopmentalCategories 4 "D", modify

[Attached image: State1.png]

Last edited by Olubunmi Adebiyi; 09 May 2023, 10:30.


Olubunmi Adebiyi

Join Date: Apr 2022

Posts: 12
#2

09 May 2023, 10:34

This is the bar graph

Andrew Musau

Join Date: Oct 2014
Posts: 10287
#3

09 May 2023, 12:59

To exclude a category, use the -if- qualifier, e.g.,

Code:

gr bar (sum) TrendData if America!=4, over(America)

Quote:

Is there any way to make this bar graph more readable? 'A', 'B', and 'C' have small percentages while D has a very high percentage, making A, B, and C too small to read.

In such cases, you may consider a transformation that accommodates both 0 and positive values and then relabel the y-axis. Cube roots are one example. See https://journals.sagepub.com/doi/pdf...867X0800800113 for some discussion on plotting on transformed scales.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float America double TrendData
1  3
2  2
3  4
4 69
end
label values America DevelopmentalCategories
label def DevelopmentalCategories 1 "A", modify
label def DevelopmentalCategories 2 "B", modify
label def DevelopmentalCategories 3 "C", modify
label def DevelopmentalCategories 4 "D", modify

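* Cube-root transform: defined at 0 and compresses large values
gen T3= TrendData^(1/3)
set scheme s1mono
* First panel: original scale
gr bar TrendData, over(America)  ytitle("") saving(gr1, replace)
* Second panel: cube-root scale, with axis labels shown in original units
gr bar T3, over(America) ytitle("") saving(gr2, replace)  ///
ylab(0 "0"  `=1^(1/3)' "1" `=2^(1/3)' "2" `=5^(1/3)' "5" `=10^(1/3)' "10" ///
`=20^(1/3)' "20" `=40^(1/3)' "40" `=60^(1/3)' "60" `=80^(1/3)' "80")

* Put the two saved graphs in one figure
gr combine gr1.gph gr2.gph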

[Attached image: Graph.png]

Last edited by Andrew Musau; 09 May 2023, 13:10.


Nick Cox

Join Date: Mar 2014

Posts: 35783
#4

09 May 2023, 16:15

#1 and #2 don't show the code for the bar chart. But by eye the values appear to sum to about 100 across all categories A, B, C, D and all years. It seems more likely that percentages would be better calculated separately for each year. True or not, I take the heights of the bars here to be proportional to the amounts to be shown; if that's wrong, I don't think any rescaling of the values would imply a different graph from the one I am going to suggest.

Here I used tabplot from the Stata Journal

SJ-22-2 gr0066_3 . . . . . . . . . . . . . . . . Software update for tabplot
(help tabplot if installed) . . . . . . . . . . . . . . . . N. J. Cox
Q2/22 SJ 22(2):467
bug fixed; help file updated to include further references

SJ-20-3 gr0066_2 . . . . . . . . . . . . . . . . Software update for tabplot
(help tabplot if installed) . . . . . . . . . . . . . . . . N. J. Cox
Q3/20 SJ 20(3):757--758
added new options frame() and frameopts() allowing framing
of bars and so-called thermometer plots or charts

SJ-17-3 gr0066_1 . . . . . . . . . . . . . . . . Software update for tabplot
(help tabplot if installed) . . . . . . . . . . . . . . . . N. J. Cox
Q3/17 SJ 17(3):779
added options for reversing axis scales; improved handling of
axis labels containing quotation marks

SJ-16-2 gr0066 . . . . . . Speaking Stata: Multiple bar charts in table form
(help tabplot if installed) . . . . . . . . . . . . . . . . N. J. Cox
Q2/16 SJ 16(2):491--510
provides multiple bar charts in table form representing
contingency tables for one, two, or three categorical variables

and suggest that although the amounts for D are clearly much larger than other categories, they don't dominate the graph unduly. Naturally I agree with Andrew Musau that you have a choice to omit D from the graph. On this occasion cube roots are likely to seem puzzling to most potential readers, useful though they can be.

A detail with tabplot is that you can easily show the values themselves, producing a kind of hybrid table and plot, hence the name.

Naturally you can specify different colours, and so forth.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte year str1 which float outcome
1 "A" 1.5
2 "A" 3
3 "A" 1.5
4 "A" 0
1 "B" .5
2 "B" 1.5
3 "B" 1.5
4 "B" .5
1 "C" 1.5
2 "C" 3
3 "C" 4
4 "C" 1.5
1 "D" 22
2 "D" 28
3 "D" 15
4 "D" 15
end

tabplot which year [iw=outcome] , showval xla(, nogrid)

Olubunmi Adebiyi

Join Date: Apr 2022

Posts: 12
#5

09 May 2023, 16:51

Thank you so much, Professors Nick Cox and Andrew Musau. While both tabplot and a transformed scale are great ways to present this data, I think omitting 'D' will be much easier for my readers to follow. My only concern is whether omitting 'D' affects the individual percentages of A, B, and C.

I am sorry I didn't include the code for the graph initially:

graph bar (percent) TrendData, over(America) asyvars over(year) bar(1, color(red)) bar(2, color(blue)) bar(3, color(black)) legend(rows(1))
Nick Cox

Join Date: Mar 2014

Posts: 35783
#6

09 May 2023, 17:08

As guessed, with your code, percents are calculated from the total of all values across both over() arguments.

If some other calculation makes more sense, you need different syntax.
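
A minimal sketch of one such calculation, offered as an illustration and not as the only possible syntax: compute the percentages within each year while D is still in the data, then leave D out only when drawing. Note that adding if America!=4 directly to graph bar (percent) would drop D before the percentages are computed, so A, B, and C would be rescaled to sum to 100. The counts below are made up for the example, and the variable names n, total, and pct are assumptions; year and America follow #1.

```stata
* Hypothetical counts, one row per year-by-category cell (not the real data)
clear
input byte(year America) int n
1 1  3
1 2  2
1 3  5
1 4 90
2 1  4
2 2  1
2 3  5
2 4 40
end
label define DevelopmentalCategories 1 "A" 2 "B" 3 "C" 4 "D"
label values America DevelopmentalCategories

* Percent of each category within its own year, with D still counted
bysort year: egen total = total(n)
generate pct = 100 * n / total

* Plot only A, B, and C; their percentages are unchanged by leaving D out here
graph bar (asis) pct if America != 4, over(America) asyvars over(year) ///
    ytitle("Percent within year") legend(rows(1))
```

With one row per observation, as in #1, something like contract year America if !missing(America), freq(n) should first reduce the data to this one-row-per-cell layout.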