How can I produce multiple bar graphs? A matrix of bar graphs? Can it be done with catplot? or tabplot?

Nick Cox

Join Date: Mar 2014

Posts: 35646
#16

25 Nov 2015, 07:32

I'd also look at http://www.jstatsoft.org/article/view/v057i05

The nearest Stata equivalent is slideplot (SSC), but it needs to be thrown away so I can start again.

However, you have categories "Could not assess" "Missing" and they don't fit well into such a framework.
Comment
wbuchanan

Join Date: Mar 2014

Posts: 1361
#17

25 Nov 2015, 10:40

Joanna Davies This is a shameless bit of self-promotion, but the current development version of brewscheme - in combination with brewtheme - makes it easier to build a scheme file that contains all of the aesthetic parameters you may want to modify. So, you could build a scheme file that handles the size of the text elements, colors, etc., and then use the same scheme file for other graphs, so you would have a consistent appearance without having to specify all of the aesthetic parameters for each graph (that was my initial motivation for working on the program). Here's an example of using brewtheme/brewscheme to create a ggplot2-inspired scheme: https://wbuchanan.github.io/brewscheme/brewtheme/
Comment
Joanna Davies

Join Date: Nov 2015

Posts: 57
#18

27 Nov 2015, 03:30

Hi Nick (and others),

I'm taking a different approach to the data. I think I was trying to display too much, so I've decided to present less and use the proportion of patients in each setting reporting 'moderate', 'severe' or 'overwhelming' symptoms, shown in a dot graph. Code:

set scheme s1color
graph dot (asis) IPU Homecare Ambulatory Other_BCC, over(question) ///
legend(row(1)) ///
linetype(line) lines(lcolor(gs12) lw(vvthin)) ytitle("Percent")

The aim is to compare the 'complexity/symptom severity' of patients in different settings.

What do you think? Interested in any thoughts people have about how clear this is or suggested improvements.

Also a coding query - I would like to change the markers to other symbols, or perhaps to letters representing the settings (I, H, A, O)? I have played around with 'msymbol' and 'marker' but nothing seems to work - any advice on this?

....a further thought - is it possible to put confidence intervals around the dots?

Thank you - all input very much appreciated.
Joanna
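
On the marker question above, a minimal sketch, assuming the four setting variables from the graph dot call in this post. The toy values are made up purely so the example runs on its own. graph dot takes one marker() option per plotted variable, numbered in the order the variables are listed:

```stata
* toy data, invented for illustration only
clear
input str12 question IPU Homecare Ambulatory Other_BCC
"Pain"     40 20 30 25
"Nausea"   10 30 15 20
"Weakness" 60 50 55 45
end

set scheme s1color
* marker(#, ...) applies to the #th variable: circle, diamond, triangle, square
graph dot (asis) IPU Homecare Ambulatory Other_BCC, over(question) ///
    marker(1, msymbol(O)) marker(2, msymbol(D)) ///
    marker(3, msymbol(T)) marker(4, msymbol(S)) ///
    legend(row(1)) ///
    linetype(line) lines(lcolor(gs12) lw(vvthin)) ytitle("Percent")
```

As far as I know, marker() in graph dot accepts marker options only, not marker labels, so letters such as I, H, A, O would probably need a different command (for example twoway scatter with mlabel() and msymbol(none)).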
Comment
Joanna Davies

Join Date: Nov 2015

Posts: 57
#19

27 Nov 2015, 03:46

Apologies Nick Cox and wbuchanan - I missed your last posts. Thank you for the article, very useful. I have not worked in R before but may make an attempt. And it's interesting to hear about brewscheme - I'll take a closer look.
Thank you.
Comment
wbuchanan

Join Date: Mar 2014

Posts: 1361
#20

27 Nov 2015, 05:34

Joanna Davies No need to move to R if you don't want to learn a new language. I'm working on some projects that hopefully (and this could be extremely wishful thinking) will provide an API similar to the ggplot2 package in R, but using syntax that is more familiar to Stata users. The brewscheme package is the start of it (and I have most of the things I want in place for a real v 1.0 type release worked out), but with any luck it should be much easier to do more with graphics in the not too distant future.

In terms of the updated visualization, it isn't completely clear what the percentage is (e.g., percent responding with a specific value, above/below a threshold, etc.). However, I tend to think dot plots are fairly underutilized. I'd need to think through/test some code before providing any type of example, but if you collapsed the response set into fewer categories (e.g., 3) you may be able to encode the locations with symbol shapes and use color for the collapsed categories; I'd suggest using some type of sequential/gradient color palette so the color reinforces the ordinal meaning of the response set if you go this way. If there aren't many extreme cases, you may be able to reduce the x-axis range to avoid potential problems with overplotting/overlapping points. In either case, I think in general it is much cleaner and easier to view.

Another possibility would be to create a graph for each location, with the points showing the percentage of responses for each response value, and then use gr combine to put each of the four locations into a single image.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35646
#21

27 Nov 2015, 07:38

Replying to #17:

You have 17 x 4 numbers there.

Could you please post those summaries to be copied and pasted, or minimally the names of the 17 categories to be copied and pasted (which is no more than you have disclosed already)?

There are many possibilities but experiment and discussion would be immensely easier with a sandbox to play in.
Comment
Joanna Davies

Join Date: Nov 2015

Posts: 57
#22

27 Nov 2015, 08:22

1 "Pain" 2 "SoB" 3 "Weakness" 4 "Nausea" 5 "Vomiting" 6 "Poor Appetite" 7 "Constipation" 8 "Sore Mouth" ///
9 "Drowsiness" 10 "Poor Mobility" 11 "Anxious" 12 "Family worried" 13 "Depressed" 14 "At Peace" 15 "Share feelings" 16 "Information" 17 "Practical matters"

Is this what you want, Nick? Not sure what you mean by summaries. Sorry!
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35646
#23

27 Nov 2015, 08:28

Ideally I would like the 68 numbers being plotted: they are summaries, i.e. % satisfying joint criteria, and not the raw data.
Comment
Joanna Davies

Join Date: Nov 2015

Posts: 57
#24

27 Nov 2015, 08:35

I see. summaries.dta

thank you
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35646
#25

27 Nov 2015, 09:16

Excellent. Here is one take:

Code:

use summaries.dta, clear
set scheme s1color

rename (I-O) y=
reshape long y, i(question) j(setting) string
replace setting = "Other BCC" if setting == "Other_BCC"

egen mean = mean(-y), by(question)
egen group = group(mean question)
* download labmask from SJ
labmask group, values(question) decode

* download tabplot from SSC
tabplot group setting [iw=y] , horizontal ///
showval(offset(0.5) mlabcolor(black) format(%2.0f)) barw(0.9) bfcolor(none) ///
yla(, noticks) ytitle("") subtitle("percent") xtitle("")

I'd recommend:

Sorting rows on some magnitude measure (I used row means). It might make sense to do this for columns too.

Changing underscores to blanks.

More use of lower case unless upper case is essential. SoB, IPU, BCC look like standard terms of art, but otherwise lower case is easier to read and takes up less space. (There are also small inconsistencies in what you have anyway.)

Rounding to integer %s is as much as policy people may need. You can give more detail in tables.

(I guess that you may want to compare settings as much as questions, but exchanging rows and columns and making the bars vertical both have downsides.)
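
The row-sorting step can be checked on a toy dataset. This is only a sketch: the three questions and two settings below carry made-up values, not the numbers in summaries.dta. It shows how negating y inside egen mean() makes the question with the largest row mean come first in group:

```stata
* toy data, invented for illustration only
clear
input str12 question IPU Homecare
"Pain"     40 20
"Nausea"   10 30
"Weakness" 60 50
end

* wide to long: one observation per question and setting
rename (IPU Homecare) y=
reshape long y, i(question) j(setting) string

* negated row mean, so the largest mean gets the smallest group number
egen mean = mean(-y), by(question)
egen group = group(mean question)

list question setting y mean group, sepby(group)
```

With these values the row means are 55 (Weakness), 30 (Pain) and 20 (Nausea), so group is 1, 2 and 3 respectively.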

Last edited by Nick Cox; 27 Nov 2015, 09:34.
Comment
Joanna Davies

Join Date: Nov 2015

Posts: 57
#26

27 Nov 2015, 10:03

I like this one, much easier to read, and I think it works for comparing symptoms within and between settings. This exchange has been enlightening - thank you!
Comment

Nick Cox

Join Date: Mar 2014
Posts: 35646

#27

27 Nov 2015, 10:19

Having made the plot about as simple as possible, it's possible to think of adding detail. Here, for example, we highlight the setting in which the highest value is observed for each question.

Code:

bysort question (y) : gen highest = _n == _N 
tabplot group setting [iw=y] , horizontal ///
showval(offset(0.5) mlabcolor(black) format(%2.0f)) barw(0.9) ///
yla(, noticks) ytitle("") subtitle("percent") xtitle("") separate(highest) /// 
bar2(bcolor(red*0.8)) bar1(blcolor(blue) bfcolor(none))

[Figure: joanna4.png]
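
The highest indicator from the code above can be checked in isolation. A minimal sketch on made-up long-format data (the values are invented for illustration): sorting by y within question puts the maximum last, so _n == _N is 1 for exactly that observation.

```stata
* toy data, invented for illustration only
clear
input str12 question str12 setting y
"Pain"   "IPU"      40
"Pain"   "Homecare" 20
"Nausea" "IPU"      10
"Nausea" "Homecare" 30
end

* within each question, sort on y; the last observation holds the maximum
bysort question (y) : gen highest = _n == _N

list, sepby(question)
```

If two settings tie for the maximum within a question, only one of them is flagged, which may or may not be what is wanted.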

Comment

Joanna Davies

Join Date: Nov 2015

Posts: 57
#28

27 Nov 2015, 10:47

Thanks Nick. I'm going to have another play with it next week. This style definitely works well for the data. I'll post what I come up with. Have a lovely weekend.
Comment

Nick Cox

Join Date: Mar 2014
Posts: 35646

#29

30 Nov 2015, 06:10

Many people want displays of twoway tables, so even small tricks and tweaks may be of wider interest.

Here's one. Follow table rather than graph convention and put the horizontal axis labels at the top.
(More discussion at http://www.stata-journal.com/sjpdf.h...iclenum=gr0053 if you wish.)

Code:

use summaries.dta, clear
set scheme s1color

rename (I-O) y=
reshape long y, i(question) j(setting) string
replace setting = "Other BCC" if setting == "Other_BCC"

egen mean = mean(-y), by(question)
egen group = group(mean question)
* download labmask from SJ
labmask group, values(question) decode

* download tabplot from SSC
tabplot group setting [iw=y] , horizontal ///
showval(offset(0.5) mlabcolor(black) format(%2.0f)) barw(0.9) bfcolor(none) ///
yla(, noticks) ytitle("") subtitle("percent") xtitle("") xsc(alt) xla(, ang(.001))

[Figure: joanna5.png]

What's different in the code? xsc(alt) is the obvious difference. That undoes a little trick embedded in the tabplot code, but xla(, ang(.001)) is an adequate fix.
