Wrong bar labels displayed when using 'Label with y variable name' option under 'Bars' in the bar graph menu

Marcus Brodeur

Join Date: Mar 2015

Posts: 3
#1


30 Mar 2015, 11:32

Hello, everyone. I have searched both here and elsewhere on the web for an answer and have not come up with anything.

Basically, does anyone have any idea why the following happens? When I create a set of grouped, clustered horizontal bar graphs (that is, where under 'Categories' there is a variable specified for 'Group 1' and a second variable specified for 'Group 2'), the first set of bars to render has all of its bar labels displayed correctly, but every remaining set of bars has the wrong labels next to its bars. (Which of the 'Group 1' names gets assigned to each bar in the subsequent clusters seems to be completely random.)

It's a shame, as displaying the category name right next to each bar in a cluster would be very useful: I have each cluster arranged in order of descending bar length, so otherwise it's a matter of looking back and forth between the bars and the chart legend to match the colour of each bar (the colours are correct) against the colour in the legend.

Thanks in advance for any suggestions on this!
Nick Cox

Join Date: Mar 2014

Posts: 35722
#2

30 Mar 2015, 12:35

There is no syntax or graph here to discuss. In short, I don't think anyone can guess whether you are talking about (1) a bug in Stata (2) a misfeature that can be worked around (3) some consequence of misunderstanding Stata's logic. As bar charts are quite frequently produced by many users (1) would have to be esoteric, or so I guess.

You need a reproducible example, please, either using one of Stata's standard datasets or using data that you give in your post.

Please re-read the FAQ Advice to see comments on asking questions.

Marcus Brodeur

Join Date: Mar 2015
Posts: 3

01 Apr 2015, 10:12

My apologies, Nick; you're quite right.

Okay, let's try this again. I'm using Stata 13.0 for Mac on the following (vastly abridged) data set, containing 3 variables (name, score and question, the first and last being string type and the middle float). There are four missing values under score, but my understanding is that this shouldn't break the labelling.

name	score	question
Mary	.	Question 1
Nerys	3.16	Question 1
Phyllis	3.64	Question 1
Piotr	.	Question 1
Pleni	3.24	Question 1
Raymond	2.98	Question 1
Sid	3.63	Question 1
Velma	3.93	Question 1
Walter	3.39	Question 1
Mary	3.62	Question 2
Nerys	3.24	Question 2
Phyllis	3.48	Question 2
Piotr	.	Question 2
Pleni	3.22	Question 2
Raymond	2.9	Question 2
Sid	3.46	Question 2
Velma	.	Question 2
Walter	3.29	Question 2

When I use the following syntax to generate a grouped clustered bar graph with individual labels...

. graph hbar (asis) score, over(name, sort(score) descending axis(off)) over(question) ascategory asyvars blabel(name) ylabel(.5(.5)4.5) scale(.4)

...I get the attached image as output.

As you can see, in the lower (presumably first-rendered) clustered bar graph, the bar labels are correctly displayed next to the score corresponding to each participant. However, in the upper clustered bar graph, whilst the bar is of the correct length for the data and the bars themselves are coloured correctly according to the legend at the bottom, the names being displayed next to each bar are mostly incorrect (e.g., the highest score for Question 1 belonged to Velma, but is being labeled 'Mary').

I don't doubt that this is user error on my part; I'm just trying to work out what I'm doing wrong here as the above syntax is as close as I've been able to get to what I would like to achieve with these graphs. It would be perfect if the correct labels accompanied the clustered bar graphs above the lowest one.

Again, thanks in advance to anyone who might be able to shed some light on this issue!


Nick Cox

Join Date: Mar 2014
Posts: 35722

01 Apr 2015, 11:18

You do indeed have puzzling results. My wild guess is that Stata is getting confused because

Code:

 
ascategory asyvars

are sending conflicting signals, but it's Stata's fault that the graph is mixed up.

That said, the graph seems to me very confusing in design, nor will it improve at all with more data. The code below reproduces the main issue and I add a graph that I suggest instead. Note the use of CODE mark-up to show code and the use not of photo attachments but of attached files. Both points are covered in the FAQ Advice.

Code:

clear 

input str7 name    score question     
Mary    .    1
Nerys    3.16     1
Phyllis    3.64     1
Piotr    .     1
Pleni    3.24     1
Raymond    2.98     1
Sid    3.63     1
Velma    3.93     1
Walter    3.39     1
Mary    3.62     2
Nerys    3.24     2
Phyllis    3.48     2
Piotr    .     2
Pleni    3.22     2
Raymond    2.9     2
Sid    3.46     2
Velma    .     2
Walter    3.29     2
end 

* Marcus original 
graph hbar (asis) score, ///
over(name, sort(score) descending axis(off)) ///
over(question) ascategory asyvars ///
blabel(name) ylabel(.5(.5)4.5) scale(.4)

* NJC suggestion 
graph dot (asis) score, over(question) ///
over(name, sort(score) descending) asyvars ///
exclude0 yla(2.5(0.5)4) marker(1, ms(Oh) msize(*1.5)) ///
marker(2, ms(plus) msize(*1.5)) linetype(line) lines(lc(gs12) lw(vthin))

[Attached images: brodeur0.png, brodeur1.png]


Marcus Brodeur

Join Date: Mar 2015

Posts: 3
#5

05 Apr 2015, 01:25

Hello again, Nick.

I just wanted to say thanks again for taking the time to have a look at this, even though we haven't quite sussed out how to get Stata to label those bar graphs correctly.

I appreciate your suggestion for an alternative presentation, but unfortunately the original (unabridged) data set aims to compare responses to 8 (or more) related questions per combined graph. (I cut it back to 2 questions just to quickly demonstrate the mislabelling issue.) I fear that using 8 different plot markers in each horizontal strip would be visually confusing, and moreover once a fixed ordering of students is applied on the y-axis, it won't be possible to sort responses by descending score for each question.

Sorry for missing out on the CODE mark-up earlier. I did try to attach files to my post and the website kept refusing to do this for some reason. In the end I assumed it was because I was a new user and perhaps there was some safety system in place that limited what new users could do until they had made a sufficient number of posts to establish themselves as genuine users and not spammers, so I used the attach photo option to let you see the problem I was experiencing with Stata.

If anyone else out there can come up with a syntax that convinces Stata to appropriately label the clustered bar graphs using the small set of sample data above, I'll gladly give it a try!
Nick Cox

Join Date: Mar 2014

Posts: 35722
#6

05 Apr 2015, 02:56

Code:

graph hbar (asis) score, ///
over(name, sort(score) descending axis(off)) ///
by(question) asyvars nofill ///
blabel(name) ylabel(.5(.5)4.5) scale(.4)

may be closer to what you want. Keeping this legible and intelligible if you want 8 bars for each person, not 2, looks like a tough call to me.
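
A different route, sketched here only as a possibility and not as something anyone in the thread has confirmed, is to sidestep blabel(name) altogether: draw one sorted bar chart per question, let the category axis carry the names, and stack the charts with graph combine. The sketch assumes the numeric question variable from the example data in #4; the graph names q1, q2 and the decision to drop missing scores are illustrative choices, not part of the original code.

Code:

```stata
* Sketch only: example data from #4, then one labelled chart per question
clear
input str7 name score question
Mary    .    1
Nerys   3.16 1
Phyllis 3.64 1
Piotr   .    1
Pleni   3.24 1
Raymond 2.98 1
Sid     3.63 1
Velma   3.93 1
Walter  3.39 1
Mary    3.62 2
Nerys   3.24 2
Phyllis 3.48 2
Piotr   .    2
Pleni   3.22 2
Raymond 2.9  2
Sid     3.46 2
Velma   .    2
Walter  3.29 2
end

levelsof question, local(qs)
local graphs
foreach q of local qs {
    * no asyvars here, so each name sits on the category axis beside its bar
    * assumption: people with no score for this question are left out
    graph hbar (asis) score if question == `q' & !missing(score), ///
        over(name, sort(score) descending) ///
        ylabel(.5(.5)4.5) title("Question `q'") ///
        name(q`q', replace) nodraw
    local graphs `graphs' q`q'
}

* stack the per-question charts in a single column
graph combine `graphs', cols(1)
```

Because each chart is drawn separately, the ordering within one question cannot disturb the labels of another. The cost is that the bars lose their per-person colours and legend, and with 8 or more questions the combined graph will be tall.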