How can I produce multiple bar graphs? A matrix of bar graphs? Can it be done with catplot? or tabplot?

Joanna Davies

Join Date: Nov 2015

Posts: 57
#1

24 Nov 2015, 10:39

Using Stata SE12

Data: Patient-level health data; patient characteristics and responses to a quality of life questionnaire

I want to produce multiple bar charts displaying the distribution of the categorical responses (the proportion of patients in each category) for each item on my questionnaire, shown separately for patients in 4 different settings. The aim is to provide a visual summary of patient responses to the questions, comparing settings of care. I want to show a lot of information on one page. NB: this is not for an academic paper - I'm reporting to health care teams on the data they have collected.

I'm attaching the graph I have produced using catplot (SSC), code: catplot iposq2pain2_3, by(setting) percent(setting) blabel(bar, format(%4.1f) pos(top))

This is how I want each bar graph to look - but I want multiple items/questions included, by setting (4 bar charts per question) - all within the same graph. Is this possible?

I'm also attaching a crude mock-up of the graph I ideally want - a cut and paste of the graphs I have produced with catplot (attachment: example of ideal chart.docx).

One final point - there is an example of a 'matrix of bar graphs' here: http://blog.stata.com/tag/sem/ Does anyone know how this was produced?

Thank you

Nick Cox

Join Date: Mar 2014

Posts: 35791
#2

24 Nov 2015, 10:57

Extending previous private email with Joanna (who approached me as author of catplot (SSC)):

You can get a two-way array of bar charts with a combination of by() and over() or just something like this.

I suspect that the blog graph you report was produced using a graph hbar equivalent of catplot. Here is code you can run:

Code:

clear
set obs 1000
egen question = seq(), to(20) block(50)
set seed 2803
mata
st_addvar("int", "answer")
st_store(., "answer", rdiscrete(1000, 1, (0.1, 0.2, 0.4, 0.2, 0.1)))
end
catplot answer, by(question)

If your questions are separate variables, you need a reshape long first.
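
A minimal sketch of that reshape, on fake data. The wide variable names (ipos1 to ipos3), the number of questions and the five answer categories are assumptions for illustration only, not Joanna's actual layout; catplot must be installed from SSC first.

```stata
* Hypothetical wide layout: one row per patient, one variable per question
clear
set obs 200
set seed 2803
generate id = _n
generate setting = 1 + mod(_n, 4)
forvalues j = 1/3 {
    generate ipos`j' = ceil(5 * runiform())
}

* Stack ipos1-ipos3 into one variable; question records which item it was
reshape long ipos, i(id) j(question)
rename ipos answer

* One panel per question
catplot answer, by(question)
```

After the reshape there is one row per patient and question, which is the structure that by(question) needs.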
Comment
Joanna Davies

Join Date: Nov 2015

Posts: 57
#3

25 Nov 2015, 04:29

Thanks Nick. I have made some progress after reshaping to long. I produced the attached graph (catplot by question long_reshape.docx) using the code: catplot ipos_answer, by(question) percent(question) ylabel(none) blabel(bar, format(%4.1f) pos(top))

I am running it separately for each setting, but this still doesn't get me quite what I want. Do you know a way to successfully add setting using catplot or another command? Everything I have tried is eligible. The ideal for me is to have four graphs for each question (one for each setting), with each question aligned on a separate row - as in my attached 'example of ideal chart' in the post above.

I can't post my actual data, but the structure after reshaping is 4 variables: id; setting; question; answer

Thank you for your help.

Best,
Joanna
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35791
#4

25 Nov 2015, 04:36

Please note that Word documents are deprecated here: many people can't (or won't) open them. For example, on the machine I am currently using I can't open that document at all. The point is covered at http://www.statalist.org/forums/help#stata

Does "eligible" mean "illegible"?

I'll add more later when I can read that file. But posting graphs as .png (use attachment icon, not photo icon) would help.
Comment
Joanna Davies

Join Date: Nov 2015

Posts: 57
#5

25 Nov 2015, 05:06

Sorry! I will repost as .png......and yes, I mean illegible. In a meeting for the next hour.
Thank you.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35791
#6

25 Nov 2015, 05:56

As I understand it, you have 10 questions X 4 settings X 7 possible answers. That's 40 bar charts with 7 bars each. In principle, it can be done; in practice it is hard work to make this at all legible without doing it on a poster that few might want to read. This demo understates the problem if anything, as extra text is needed in the real case.

To document what is elsewhere stated: catplot and tabplot are from SSC and must be installed first.

I haven't much explored the scope here for stacked (divided, segmented) bar charts which some might prefer here.

My first attempt is a disaster.

Code:

clear
set obs 4000
egen question = seq(), to(10) block(100)
egen setting = seq(), to(4) block(1000)
set seed 2803
mata
st_addvar("int", "answer")
st_store(., "answer", rdiscrete(4000, 1, (0.05, 0.1, 0.2, 0.3, 0.2, 0.1, 0.05)))
end
catplot answer question, by(setting)

My second attempt is not so bad.

Code:

tabplot question answer, by(setting, compact note("")) showval(mlabsize(*.5)) ysize(7)

Last edited by Nick Cox; 25 Nov 2015, 05:58.
Comment
wbuchanan

Join Date: Mar 2014

Posts: 1362
#7

25 Nov 2015, 05:58

Joanna Davies, it may be easier to collapse the data first (or use the table command to do the equivalent) to create frequencies for the aggregations of interest, and then visualize the summarized data. There's almost definitely a way to do it from the raw data as well, but if you need to turn something around quickly, aggregating first may be easier. The earlier advice about reshaping/structuring the data before graphing still applies.
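
A rough sketch of the aggregate-first idea, on fake data with the same structure as described in the thread (setting, question, answer). It uses contract rather than collapse to get the frequencies, which is one possible choice among several; the variable names n, total and pct are made up for illustration.

```stata
* Fake long data: 10 questions x 4 settings, 7 possible answers
clear
set obs 4000
set seed 2803
egen question = seq(), to(10) block(100)
egen setting = seq(), to(4) block(1000)
generate answer = ceil(7 * runiform())

* One row per setting x question x answer, holding its frequency
contract setting question answer, freq(n)

* Percent of responses within each setting x question cell
bysort setting question: egen total = total(n)
generate pct = 100 * n / total

* Panels ordered by question, then setting: 4 columns gives one row per question
graph bar (asis) pct, over(answer) by(question setting, cols(4) compact note(""))
```

With 40 panels the result is likely to be cramped, so this is probably more useful for a subset of questions or for a large page size.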
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35791
#8

25 Nov 2015, 06:00

Billy: I think the data structure Joanna has is fine. The problems lie elsewhere in the amount of detail that is of interest.
Comment
wbuchanan

Join Date: Mar 2014

Posts: 1362
#9

25 Nov 2015, 06:00

Nick Cox the tabplot example is really well put together. I can't see a larger version of the image, but the amount of additional strategically placed white space appears to make it much easier to view.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35791
#10

25 Nov 2015, 06:10

The code is self-contained in producing a fake dataset with the same structure, so you can reproduce the graph with your Stata.

The problem of adding informative text remains for the real data.
Comment
Joanna Davies

Join Date: Nov 2015

Posts: 57
#11

25 Nov 2015, 06:43

Thanks for the tabplot code example - I've produced something similar to your second attempt

.....I think the earlier catplot is actually easier to read 'catplot ipos_answer, by(question) percent(question) ylabel(none) blabel(bar, format(%4.1f) pos(top))'

Do you think any other commands might be useful? Or should I give up and look outside of Stata? Maybe Excel or Tableau?

Thank you again for your help.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35791
#12

25 Nov 2015, 07:09

Naturally the catplot for one setting is easier to read than the tabplot for all four. That's not a fair comparison.

In the tabplot the ytitle and xtitle could both be deleted without loss.

In the catplot you don't need all the question labels to be repeated so much. That's probably for the Graph Editor.

I have never used Tableau. I once read two awful books on it, but they are no doubt not to be taken as indicting the program. I don't advise using Excel for graphics. Clearly I can't and won't rule out the possibility of something better in other software, but the root difficulty here is trying to squeeze a lot of information on to a single display and remain legible and intelligible. There's no magic bullet for that.

If this were my problem, I would fall back on one graph for each setting and expect to tell people in text what is interesting or surprising. At some point you have to change viewpoint and focus on what the reader will find an easy and effective display.
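
A sketch of how one graph per setting could be automated, again on fake data with the thread's structure. The settings are assumed to be coded 1 to 4 and the output file names are hypothetical; the /// line continuation means this should be run from a do-file, and catplot must be installed from SSC.

```stata
* Fake long data: 10 questions x 4 settings, 7 possible answers
clear
set obs 4000
set seed 2803
egen question = seq(), to(10) block(100)
egen setting = seq(), to(4) block(1000)
generate answer = ceil(7 * runiform())

* One catplot per setting, each exported under its own file name
forvalues s = 1/4 {
    catplot answer if setting == `s', by(question) percent(question) ///
        blabel(bar, format(%4.1f))
    graph export "setting`s'.png", replace
}
```

Because the loop needs no manual steps, it can be rerun unchanged each time new data arrive.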

Unless all your graphics are produced that way, the default blue background for s2color will be a distraction. I switched to s1color some time ago. It doesn't match all my preferences, but whenever I write my own graph scheme I forget what it's called.
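
For illustration, the scheme can be switched for the current session like this; the auto dataset and the bar chart are just a stand-in example, not part of the question.

```stata
* Switch from the default scheme to s1color for subsequent graphs
sysuse auto, clear
set scheme s1color
graph bar (mean) mpg, over(foreign)
```

Adding the permanently option to set scheme makes the choice persist across sessions.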
Comment
Joanna Davies

Join Date: Nov 2015

Posts: 57
#13

25 Nov 2015, 07:11

I should also say that the reason I'm pursuing this is to develop a template for national reporting, to be repeated across services a number of times per year - which is why I'm keen that it is automated (no cut and paste). I'd like the cleanest, simplest solution, and would like to use Stata if I can.
Comment
Joanna Davies

Join Date: Nov 2015

Posts: 57
#14

25 Nov 2015, 07:18

Thanks Nick. Good point re reducing axis titles and changing colour. I appreciate your help. I'll endeavour to post the final graph on here at some point.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35791
#15

25 Nov 2015, 07:18

We agree: this is Statalist!
Comment
