Adjust plot region size to 1 bar; and displaying Umlauts in Graph labels

Adam Reiner

Join Date: Sep 2017
Posts: 12

07 Aug 2018, 11:23

Dear all,

I couldn't find advice on whether to split my two questions into two posts or put them into one.
I am using Stata 15.0.

Here is example data produced by dataex:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double(BA3 sex BA8a_alle)
2 1 10
2 2  1
1 2 12
2 2  2
2 1  2
2 1 12
1 1  7
2 1  5
2 2  4
1 2 12
2 1  3
2 2 .a
2 1  5
2 1  3
2 1  7
2 1 11
3 1 .d
4 2 11
1 2  4
4 2 .d
end
label values BA3 BA3
label def BA3 1 "sehr unwahrscheinlich", modify
label def BA3 2 "eher unwahrscheinlich", modify
label def BA3 3 "ungefähr 50 zu 50", modify
label def BA3 4 "eher wahrscheinlich", modify
label values sex sex
label def sex 1 "männlich", modify
label def sex 2 "weiblich", modify
label values BA8a_alle BA8a_alle
label def BA8a_alle 1 "Philosophie/Geschichts-/Kulturwissenschaft", modify
label def BA8a_alle 2 "Sprachwissenschaft", modify
label def BA8a_alle 3 "Politik-/Verwaltungs-/Sozialwissenschaft", modify
label def BA8a_alle 4 "Rechtswissenschaft", modify
label def BA8a_alle 5 "Wirtschaftswissenschaft (BWL/VWL)", modify
label def BA8a_alle 7 "Erziehung/Pädagogik (nicht Lehramt)", modify
label def BA8a_alle 10 "Chemie/Biologie/Ernährung", modify
label def BA8a_alle 11 "Ingenieurswissenschaft/Informatik", modify
label def BA8a_alle 12 "eine andere als die oben gelisteten", modify

So my first problem is that when I coefplot the variable BA3, the single bar uses only a small part of the plot region (I hope I am using the proper vocabulary here), so there is a lot of white space at the top and bottom. How can I resize the graph to fit that bar? I do want to keep using this command, however, since I have several catplot commands that generate 5 bars or more, which fill the space in a satisfying way, and the graphs should look uniform since they are made for a report.
My catplot code is:

Code:

catplot BA3, l1title("") percent asyvar stack ///
    blabel(bar, color(black) format(%9.2g) pos(center)) ///
    title("Bestehen der Abiturprüfung") ytitle("Prozent") ///
    note("Die Frage lautete im Wortlaut:" ///
         "Schätzen Sie ein: Wie wahrscheinlich ist es, dass Sie die Abiturprüfung bestehen werden?", ///
         size(3) margin(medlarge)) ///
    bar(1, fcolor("37 104 131")) bar(2, fcolor("97 152 48")) bar(3, fcolor("227 1 126")) ///
    bar(4, fcolor("106 184 213")) bar(5, fcolor("157 209 109"))

My second problem is with special letters like 'ä' and 'ß' in the labels of a graph produced by catplot (see code below). When I tabulate BA8a_alle, Stata displays them properly. When I run Code No.1 below, it also displays them properly. However, when I run Code No.2 (which just adds the graph play and graph export commands), they are displayed as question marks in both the graph window and the saved .png file.
Code No.1

Code:

catplot sex BA8a_alle, percent(sex) asyvars ytitle("Percent") legend(label(1 "man") label(2 "women")) title("Idealistische Studienfachwahl") subtitle("") ysize(10) blabel(bar, format(%9.1f)) xsize(15)

Code No.2

Code:

catplot sex BA8a_alle, percent(sex) asyvars ytitle("Percent") legend(label(1 "man") label(2 "women")) title("Idealistische Studienfachwahl") subtitle("") ysize(10) blabel(bar, format(%9.1f)) xsize(15)
graph play catplotrelabelstud
graph export graph11.png, replace

I hope I made no mistakes creating this post (two issues in one, and this [CODE] stuff). Thanks in advance for your help!

Best regards
Adam Reiner


David Radwin

Join Date: Mar 2014

Posts: 368
#2

07 Aug 2018, 13:40

First, as a friendly reminder to you and others, although it is easy to forget that such a useful and widely-used command is user-written and not part of official Stata, catplot was written by Nick Cox and is available from SSC. FAQ 12.1 asks posts to mention the source of all user-written programs. (I have made this same mistake myself on Statalist, so this is not intended as a criticism.)

Re question 1, I am not sure I understand your question, so I will make a guess. If you want to make the bar take up more or less space (be wider or narrower), you might try using the outergap option, such as outergap(*.25) or outergap(*3).
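As a minimal sketch of what that option does, here is the same single stacked bar drawn three times with different outer gaps. It assumes catplot is installed from SSC and uses the auto dataset that ships with Stata rather than your data; the graph names are made up for illustration.

```stata
* Assumes catplot is installed: ssc install catplot
sysuse auto, clear

* One stacked bar with the default gaps
catplot rep78, percent asyvars stack name(gap_default, replace)

* Outer gap tripled: the bar becomes narrower
catplot rep78, percent asyvars stack outergap(*3) name(gap_wide, replace)

* Outer gap cut to a quarter: the bar becomes wider
catplot rep78, percent asyvars stack outergap(*.25) name(gap_narrow, replace)
```

Because the overall graph size stays fixed, changing the gap changes the bar width rather than the size of the plot region.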

Re question 2, it is not clear what the graph recording catplotrelabelstud is doing to your graph, and I cannot reproduce it without the recording file. But when I created and exported the example graph without playing the file, the PNG file did show the umlaut in the word "Ernährung" as expected.

What happens if you do not play the graph recording? Or can you use a text editor to delete whatever part of the graph recording is affecting the variable labels?

David Radwin
Senior Researcher, California Competes
californiacompetes.org
Pronouns: He/Him
Adam Reiner

Join Date: Sep 2017

Posts: 12
#3

08 Aug 2018, 01:12

Dear David

Thanks for your quick reply and hints.

About question 1: Sorry that I couldn't describe what I want more precisely; your outergap suggestion helps me do so. Outergap resizes the bar itself in order to adjust those gaps. I want to resize the gaps themselves, since the bars should be the same size in every graph for the report to look uniform.
While trying the outergap option, I realized that I actually need an option to resize those gaps for the coefplot command (written by Ben Jann, also available from SSC), not for the catplot command. I got confused about my own code because I tried both commands for some graphs. Coefplot doesn't allow the outergap option anyway. My coefplot code is:

Code:

local vars BA3
local lblname BA3
local levels 1 2 3 4 5
local nvars: list sizeof vars
local nlevels: list sizeof levels
matrix p = J(`nlevels', `nvars', .)
matrix colnames p = `vars'
matrix rownames p = `levels'
local i 0
foreach v of local vars {
    local ++i
    quietly proportion `v'
    matrix p[1,`i'] = e(b)' * 100
}
matrix r = p
mata: st_replacematrix("r", mm_colrunsum(st_matrix("p")))
mata: st_matrix("l", (J(1,`nvars',0) \ st_matrix("r")[1::`nlevels'-1,]))
matrix m = r
mata: st_replacematrix("m", (st_matrix("l") :+ st_matrix("r"))/2)
local plots
local i 0
foreach l of local levels {
    local ++i
    local lbl: lab `lblname' `l'
    local plots `plots' (matrix(m[`i']), ci((l[`i'] r[`i'])) aux(p[`i']) key(ci) label(`lbl'))
}
coefplot `plots', title("Bestehen der Abiturprüfung") ///
    plotregion(margin(0 0 0 0)) nooffset ms(i) mlabel(@aux) mlabpos(0) ///
    format(%9.0f) coeflabels(, wrap(30)) ///
    ciopts(recast(rbar) barwidth(0.08)) legend(col(2) span stack)

About question 2: You are completely right, I had the feeling I forgot something essential. When I run catplot Code No.1 for BA8a_alle, Stata draws a graph with some of the labels cut off. I reckon it is the 32-character rule. So I made the graph recording catplotrelabelstud to relabel those. When I don't play the recording, the labels display all letters properly but are cut off; when I play it, the labels aren't cut off but display e.g. 'ä' as '?'. So I guess the problem is that the graph recording (.grec file) is not UTF-8 encoded? I searched for 'utf' in the help viewer for graph play but couldn't find such an option.
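One way to test that guess is to read the recording line by line and count byte sequences that are not valid UTF-8 with ustrinvalidcnt(). This is only a sketch and assumes the recording is saved as catplotrelabelstud.grec in the current working directory, which may not be where it actually lives.

```stata
* Assumption: the recording is in the current working directory
local grecfile "catplotrelabelstud.grec"

tempname fh
local bad 0
file open `fh' using `"`grecfile'"', read text
file read `fh' line
while r(eof) == 0 {
    * ustrinvalidcnt() returns the number of invalid UTF-8 sequences in a string
    local bad = `bad' + ustrinvalidcnt(`"`macval(line)'"')
    file read `fh' line
}
file close `fh'

display as text "Invalid UTF-8 sequences found: `bad'"
```

A count of zero would suggest that the file's encoding is not what turns 'ä' into '?'.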

Hope I could describe both issues more thoroughly this time around.

Adam Reiner

Join Date: Sep 2017
Posts: 12

08 Aug 2018, 07:20

By the way, this is the content of the .grec file:

Code:

StataFileTM:00001:01100:GREC:                          :
00003:00003:00001:
*! classname: hbargraph_g
*! family: bar
*! date: 26 Jul 2018
*! time: 14:11:21
*! graph_scheme: zubab
*! naturallywhite: 1
*! end

// File created by Graph Editor Recorder.
// Edit only if you know what you are doing.

.grpaxis.major.num_rule_ticks = 0
.grpaxis.edit_tick 12 96.2229 `"Philosophie/Geschichts-/Kulturwissenschaft"', tickset(major)
// grpaxis edits

.grpaxis.major.num_rule_ticks = 0
.grpaxis.edit_tick 10 79.4145 `"Politik-/Verwaltungs-/Sozialwissenschaft"', tickset(major)
// grpaxis edits

.grpaxis.major.num_rule_ticks = 0
.grpaxis.edit_tick 8 62.6062 `"Wirtschaftswissenschaft (BWL/VWL)"', tickset(major)
// grpaxis edits

.grpaxis.major.num_rule_ticks = 0
.grpaxis.edit_tick 6 45.7979 `"Erziehung/Pädagogik (nicht Lehramt)"', tickset(major)
// grpaxis edits

.grpaxis.major.num_rule_ticks = 0
.grpaxis.edit_tick 2 12.1813 `"Ingenieurswissenschaft/Informatik"', tickset(major)
// grpaxis edits

.grpaxis.major.num_rule_ticks = 0
.grpaxis.edit_tick 1 3.77715 `"eine andere als die oben gelisteten"', tickset(major)
// grpaxis edits


// <end>


David Radwin

Join Date: Mar 2014

Posts: 368
#5

08 Aug 2018, 12:42

I do not know why this is the case, but the line ".grpaxis.major.num_rule_ticks = 0" in your graph recording seems to be causing the problem. When I opened catplotrelabelstud.grec in a text editor, deleted every instance of that line with search-and-replace, and then played the graph recording again, it labeled the bars with the text longer than 32 characters and with the umlauts intact.
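The same deletion can be scripted instead of done by hand. The following is only a sketch: it assumes the recording sits in the current working directory, and the output name catplotrelabelstud_clean.grec is made up for illustration.

```stata
* Assumptions: input file is in the current directory; output name is hypothetical
local infile  "catplotrelabelstud.grec"
local outfile "catplotrelabelstud_clean.grec"

tempname in out
file open `in'  using `"`infile'"',  read text
file open `out' using `"`outfile'"', write text replace

local dropped 0
file read `in' line
while r(eof) == 0 {
    * Skip the suspect line; copy every other line unchanged
    if strtrim(`"`macval(line)'"') == ".grpaxis.major.num_rule_ticks = 0" {
        local ++dropped
    }
    else {
        file write `out' `"`macval(line)'"' _n
    }
    file read `in' line
}
file close `in'
file close `out'

display as text "Removed `dropped' line(s)"
```

The cleaned recording could then be played in place of the original, provided it is saved where graph play looks for recordings.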

An equivalent approach is to use the obscure and undocumented command gr_edit to change the labels in your do-file, as in the following. I just copied the lines from the graph recording. I don't know of a way to skip that step and determine the numeric values directly. So this approach may not save you any time, but it does keep all the changes to the graph in the do-file.

Code:

gr_edit .grpaxis.edit_tick 2 16.2879 `"Ingenieurswissenschaft/Informatik"', tickset(major)
gr_edit .grpaxis.edit_tick 1 5.05051 `"eine andere als die oben gelisteten"', tickset(major)
gr_edit .grpaxis.edit_tick 4 38.7626 `"Erziehung/Pädagogik (nicht Lehramt)"', tickset(major)
gr_edit .grpaxis.edit_tick 5 50 `"Wirtschaftswissenschaft (BWL/VWL)"', tickset(major)
gr_edit .grpaxis.edit_tick 7 72.4747 `"Politik-/Verwaltungs-/Sozialwissenschaft"', tickset(major)
gr_edit .grpaxis.edit_tick 9 94.9495 `"Philosophie/Geschichts-/Kulturwissenschaft"', tickset(major)

I regret that I am not sufficiently familiar with coefplot to help further with question 1 about spacing.

David Radwin
Senior Researcher, California Competes
californiacompetes.org
Pronouns: He/Him
Adam Reiner

Join Date: Sep 2017

Posts: 12
#6

09 Aug 2018, 01:40

Thank you very much, especially for the gr_edit solution. It is much more elegant. Since I am sharing, and will keep sharing, the do-file with colleagues who won't have the .grec files in their directory, it will provide a much smoother workflow.

(In the unlikely case that anyone reproduces this by copying your gr_edit code and, like me, gets those labels doubled at different vertical positions: the position values seem to have changed somewhere along the way. Compare, for example, tick 1 with 3.77715 vs. 5.05051. Just switch them back and the labels look as they're supposed to.)

Originally posted by David Radwin

I regret that I am not sufficiently familiar with coefplot to help further with question 1 about spacing.

No worries! I'd say you fixed the more crucial issue. Maybe I'll be lucky and a second hero will come along to help with that one.
