Clustered Bar chart with N values below x-axis in each cluster

Kim Vaarts

Join Date: May 2025

Posts: 21
#16

18 May 2025, 19:37

Originally posted by Clyde Schechter

The problem is that with the number of lines and drugs you have, the number of bars is large enough that you can't really accommodate all those percentages at the bar ends without them overlapping. All this requires is that you specify a smaller size. You want to find one that fits gracefully in the available space, but is still large enough to read. Try this:

Code:

catplot , over(drug) over(line) percent(line) blabel(bar, size(vsmall) format(%2.1f)) asyvars name(D1, replace)

catplot , over(drug) over(line) percent(line) blabel(bar, size(vsmall) format(%2.1f)) asyvars recast(bar) legend(row(1) pos(12)) name(D2, replace)

If that's not a good size, you can go larger or smaller by choosing from among the sizes you will find by running -help textsizestyle-.

Thank you very much! I appreciate all your help.
Comment
Kim Vaarts

Join Date: May 2025

Posts: 21
#17

18 May 2025, 20:12

Originally posted by Clyde Schechter

So, I think I understand what your data looks like. Run this and take a look in the data browser to see if this resembles your data set in the relevant respects:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float id str1(line1 line2 line3)
 1 ""  "D" "C"
 2 "C" "B" ""
 3 "C" "B" "D"
 4 "D" ""  "C"
 5 "A" ""  "C"
 6 ""  ""  "C"
 7 ""  "D" "B"
 8 "B" "A" "D"
 9 ""  "A" "D"
10 "B" "A" ""
11 "D" "A" "B"
12 "A" "C" "D"
13 "A" "D" ""
14 "A" ""  "C"
15 "C" ""  "D"
16 "C" ""  "B"
17 "C" "B" ""
18 "A" "C" "B"
19 "C" ""  "D"
20 "C" ""  "B"
21 ""  "C" "D"
22 ""  "B" "A"
23 "C" "D" "B"
24 "C" "B" "A"
25 ""  "C" "D"
26 "B" "C" "D"
27 "D" "A" ""
28 "A" "D" "C"
29 "A" "C" ""
30 "A" ""  "D"
31 "D" "C" ""
32 ""  "D" "B"
33 "D" "C" "A"
34 "D" "C" "B"
35 "C" "B" "A"
36 "C" ""  "A"
37 "A" "D" "C"
38 "D" "C" "A"
39 ""  "C" "A"
40 "A" ""  "B"
41 "D" ""  ""
42 "C" "A" "B"
43 "C" "A" "D"
44 "D" "C" "B"
45 "D" "B" ""
46 "D" "B" ""
47 "C" "B" "D"
48 ""  ""  "A"
49 "D" "C" "A"
50 "A" "D" "C"
end

Note that I am assuming that your data set includes some kind of id variable, perhaps a patient MRN, and that that variable uniquely identifies observations in your data set. If you do not have such a variable, and have only the line1, line2, and line3 variables, then you need to create one, which you can easily do just with:

Code:

gen `c(obs_t)' id = _n
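Whether an id variable really does uniquely identify observations can be checked with -isid-, which stops with an error if it does not. The following is only a sketch on a tiny made-up data set; the three observations are illustrative and are not taken from this thread:

```stata
* Toy data (made up for illustration): three patients, no id variable yet.
clear
input str1(line1 line2 line3)
"A" "B" ""
"C" ""  "D"
""  "A" "B"
end

* Create an id only if one does not already exist.
capture confirm variable id
if _rc != 0 {
    gen `c(obs_t)' id = _n
}

* -isid- exits with an error unless id uniquely identifies observations.
isid id
```

If -isid- runs silently, the variable is safe to use in the i() option of -reshape-.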

Assuming that we are now on the same page about what your data looks like, the best solution is to transform your data set so that it looks like what Nick created in #6. Then we can apply Nick's original solution to that:

Code:

capture set scheme stcolor

// Nick already suggested -reshape-; I'm just giving explicit code here.
rename line* _drug*
reshape long _drug, i(id) j(line)
encode _drug, gen(drug)
drop _drug

// From here down it's Nick's original code with just one tiny tweak.
quietly forval j = 1/3 {
    count if drug < . & line == `j'
    local which = word("first second third", `j')
    label def line `j' `" "`which' line" "{it:n} = `r(N)'" "', add
}
label val line line

catplot , over(drug) over(line) percent(line) blabel(bar, format(%2.1f)) asyvars name(D1, replace)

catplot , over(drug) over(line) percent(line) blabel(bar, format(%2.1f)) asyvars recast(bar) legend(row(1) pos(12)) name(D2, replace)

I have eliminated the -ssc install catplot- command because you have clearly already done that, and there is no reason to do it again.

I will add that the original organization of your data, with three line variables that, I suppose, are drug names, is not conducive to analysis in Stata. It is not just a matter of this particular graphing problem. It is a matter of Stata working better with data in long layout rather than wide for almost everything. It is likely that whatever other analysis of this data you plan, it will be facilitated by using this revised data organization. To avoid having to re-create it each time, I suggest you actually -save- it as a new data set after you have used it for this purpose.

Dear Clyde, I cannot change local which = word("first second third fourth fifth sixth seventh eight ninth", `j') to:
local which = word("L1 L2 L3 L4 L5 L6 L7 L8 L9", `j'). The labels do not change in Stata. Do I need to change "word" to something else? I tried "number", but Stata gave an error.
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30089
#18

18 May 2025, 21:10

I have difficulty imagining why you would be unable to make that change. You say Stata "gave an error." Do you mean an error message? After which command did the error message arise? And what exactly did it say? And was that only after you tried changing -word- to -number-? If that was it, there's no surprise, because Stata doesn't have a function called -number()-.

Here's the one thing I can think of that you might have done wrong. You changed word("first second third", `j') to word("L1 L2 L3 L4 L5 L6 L7 L8 L9", `j'). That's fine. But "first second third" has only 3 words, whereas "L1 L2 L3 L4 L5 L6 L7 L8 L9" has 9. So you also have to change that -quietly forval j = 1/3 {- line to -quietly forval j = 1/9 {-. If you didn't do that, then the first three groups of bars will be labeled L1, L2, and L3, but the rest will be just labeled with numbers 4 through 9.
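To make the two matching changes concrete, here is a minimal sketch on fabricated long-layout data. The data set is made up for illustration (9 lines with 10 observations each, and drugs coded 1 to 6); only the loop itself corresponds to the code discussed above:

```stata
* Fabricated long-layout data: 9 lines, 6 drugs coded 1 to 6 (illustrative only).
clear
set obs 90
set seed 2025
gen line = mod(_n - 1, 9) + 1
gen drug = runiformint(1, 6)

* Start clean in case a value label called line already exists.
capture label drop line

* Both the loop range (1/9) and the word list (9 words) must cover all lines.
quietly forval j = 1/9 {
    count if drug < . & line == `j'
    local which = word("L1 L2 L3 L4 L5 L6 L7 L8 L9", `j')
    label def line `j' `" "`which'" "{it:n} = `r(N)'" "', add
}
label val line line
label list line
```

If the loop stopped at 3, values 4 through 9 would have no label defined, which is why those groups would show bare numbers.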

By the way, just be forewarned that with 9 lines, and, it seems, 6 drugs, you are putting 54 bars on the graph. That's a very crowded graph. I fear the -size(vsmall)- may be too large to fit over those bars. So you will need to experiment with still smaller sizes until you find one that works. Remember, you can find complete information about specifying values of -size()- by running -help textsizestyle-.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35683
#19

19 May 2025, 02:47

Returning to this, I have a bundle of comments on various levels.

Thanks naturally to Clyde Schechter for taking over while I was asleep!

In Statalist it is rarely necessary to copy the entirety of a previous post in replying to it. Everyone who can see the thread can see the previous post.

Confidentiality of data is something we readily accept and respect, but vague or incorrect signals about the real problem don't help at all. Here it may seem that whether the problem dimensions are 5 x 5 as first implied or 3 x 4 or 9 x 6 is not important but as signalled by Clyde, you're in real danger of crossing a threshold where the graph design breaks down through crowding and sheer inability to show information intelligibly in a friendly manner.

So, here now is another fabricated dataset with 9 lines and 6 drugs. Naturally there is no structure to see in the graphs as the underlying probabilities are identical. But the main point is that you need to move to another design.

tabplot is from the Stata Journal and must be installed before you can use it. A search here using tabplot as key word will find examples of its use.

Code:

clear
set obs 100
set seed 314159
gen id = _n

forval j = 1/9 {
    gen line`j' = word("A B C D E F", runiformint(1,6)) if runiform() < 0.8
}

capture set scheme stcolor

rename line* _drug*
reshape long _drug, i(id) j(line)
encode _drug, gen(drug)

forval j = 1/9 {
    count if line == `j' & drug < .
    label def line `j' `" "L`j'" "{it:n =} `r(N)'" "', add
}

label val line line

catplot , over(drug) over(line) percent(line) blabel(bar, format(%2.1f)) recast(bar) asyvars legend(row(1) pos(12)) name(bad)

tabplot drug line, percent(line) showval separate(drug) name(better1)

tabplot line drug, percent(line) showval separate(drug) name(better2)

Last, and for once least, there was a hint earlier, underlining our longstanding request, to read please the FAQ Advice at https://www.statalist.org/forums/help, all the way down to #18!
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35683
#20

19 May 2025, 08:05

All that said, sometimes this kind of data works quite well recast as a line chart. Lines 1 to 9 after all look like a serial order. Presumably drugs can be ordered some way other than alphabetical name order, e.g. by kind or strength of effect.
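One possible way to sketch the line-chart idea, using fabricated data of the same kind as in #19. This is an illustration under assumptions rather than a recommended design: the variable names n, ntotal and pct are invented here, and -twoway connected- after -separate- is just one of several routes to such a graph.

```stata
* Fabricated data as in #19: 9 lines, 6 drugs, no real structure.
clear
set obs 100
set seed 314159
gen id = _n
forval j = 1/9 {
    gen line`j' = word("A B C D E F", runiformint(1,6)) if runiform() < 0.8
}
rename line* _drug*
reshape long _drug, i(id) j(line)
encode _drug, gen(drug)
drop if missing(drug)

* One observation per line-drug pair, with its percent within line.
contract line drug, freq(n)
bysort line : egen ntotal = total(n)
gen pct = 100 * n / ntotal

* One y variable per drug (pct1 to pct6), then one connected line per drug.
separate pct, by(drug) veryshortlabel
twoway connected pct? line, sort xlabel(1/9) xtitle("Line") ///
    ytitle("Percent within line") legend(row(1) pos(12)) name(lines, replace)
```

With real data, the order of the drugs in the legend follows the coding of drug, so a more meaningful order would need to be set up in the value labels first.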

Last edited by Nick Cox; 19 May 2025, 08:57.
Comment
Kim Vaarts

Join Date: May 2025

Posts: 21
#21

19 May 2025, 15:32

Thank you for all your help. I will take all your advice and tips. Sorry for all the messages and the inconvenience. This is my second time asking a question on this forum, and I did not read the rules. I will keep them in mind.
Comment