Multiple line labels on by graph issues with special characters

Simon Turner

Join Date: Oct 2018
Posts: 29

28 Sep 2022, 20:47

Hi Stata experts,
I've been working on a graph in Stata and, following the recent Stata tip, I've been using by() rather than graph combine.
It's certainly much easier to organise the axis labels the way I'd like so thanks Nick Cox for the tips using by graphs!
One issue I currently have probably has more to do with string manipulation rather than the graphing tool and I'd highly value any assistance!

I have a "double" by graph and would like the labels to look a bit like:

[Image: two_line_by_label.png]

But with the Beta being the Greek symbol and the 2 being subscripted.
When I make that change I get:

[Image: two_line_by_label_symbol_error.png]

Notice how the beta is on the next line but the ₂= 0 on the original first line?
I've tried a few different ideas but can't quite get it to work!
In the "real" code most of the variables are programmatically entered in loops etc. and the data is far more interesting, but these show what I'm trying to attempt.

Any help would be great (ideally I wanted the time values to be just given once, which I could do with the graph combine, but then the axis labels were harder to arrange!)

Cheers,
Simon.

The code below generates this random number sample for testing...

Code:

////////////////////////////////////////////////////////////////////////////////
// Having difficulty getting a by graph with multiple lines in the label 
// along with "special" characters such as Greek letter beta and a subscript

// clear things
clear
graph drop _all
set obs 1000

// generate some random data
gen id = _n
gen rmse = rnormal(1,1)
gen beta_2_true = mod(id,3)
gen beta_3_true = mod(int((id-1)/3),3)
gen aggregate_to = mod(id,5)

// get the order of the daily, weekly,...
label define aggregate_label 0 "daily" 1 "weekly" 2 "monthly" 3 "quarterly" 4 "yearly"
label values aggregate_to aggregate_label

// collapse the data ready to graph
collapse (mean) graph_parameter = rmse , by(aggregate_to beta_2_true beta_3_true)

//////////////
// playing with different options for trying to get the time label followed by the beta label on separate lines
*gen beta_2_lab = "{&beta}" + "{sub:2}=" + string(beta_2_true) // works all on same line
*gen beta_2_lab = "`=char(13)'`=char(10)'Beta2=" + string(beta_2_true) // works on separate lines without fancy symbols/subscripts
gen beta_2_lab = "`=char(13)'`=char(10)'" + " {&beta}" + " {sub:2}=" + string(beta_2_true) // sub works but Beta symbol is on a separate line?
*gen beta_2_lab = "`=char(13)'`=char(10)'Beta{sub:2}=" + string(beta_2_true) // sub works but Beta text is on a separate line?

// combine the time and beta to a single group
egen group_lab = group( aggregate_to beta_2_lab), label

twoway scatter graph_parameter beta_3_true, ///
    xlab(0(1)2) xtitle("slope change ({&beta}{sub:3})") ///
    ytitle("Root Mean Square Error (RMSE)", size(small)) ///
    by(group_lab , row(1) note("")) ///
    xsize(10)


Andrew Musau

Join Date: Oct 2014
Posts: 10294

29 Sep 2022, 05:22

The use of char(13) + char(10) as a linebreak in a label is nonstandard, although I do suggest it as a workaround in https://www.statalist.org/forums/for...ariable-labels following a post by Masaru Nagashima. Therefore, you cannot blame Stata for behaving differently when you combine different SMCL characters as nowhere in the documentation do StataCorp show this to be a valid method for introducing linebreaks. That said, using Unicode characters is one way to go.

Code:

////////////////////////////////////////////////////////////////////////////////
// Having difficulty getting a by graph with multiple lines in the label
// along with "special" characters such as Greek letter beta and a subscript

// clear things
clear
graph drop _all
set obs 1000

// generate some random data
gen id = _n
gen rmse = rnormal(1,1)
gen beta_2_true = mod(id,3)
gen beta_3_true = mod(int((id-1)/3),3)
gen aggregate_to = mod(id,5)

// get the order of the daily, weekly,...
label define aggregate_label 0 "daily" 1 "weekly" 2 "monthly" 3 "quarterly" 4 "yearly"
label values aggregate_to aggregate_label

// collapse the data ready to graph
collapse (mean) graph_parameter = rmse , by(aggregate_to beta_2_true beta_3_true)

//////////////
// playing with different options for trying to get the time label followed by the beta label on separate lines
*gen beta_2_lab = "{&beta}" + "{sub:2}=" + string(beta_2_true) // works all on same line
*gen beta_2_lab = "`=char(13)'`=char(10)'Beta2=" + string(beta_2_true) // works on separate lines without fancy symbols/subscripts
gen beta_2_lab = "`=char(13)'`=char(10)'" + " {&beta}`=ustrunescape("\u2082")'=" + string(beta_2_true) // Unicode subscript two (U+2082) replaces {sub:2}
*gen beta_2_lab = "`=char(13)'`=char(10)'Beta{sub:2}=" + string(beta_2_true) // sub works but Beta text is on a separate line?

// combine the time and beta to a single group
egen group_lab = group( aggregate_to beta_2_lab), label

set scheme s1mono
twoway scatter graph_parameter beta_3_true, ///
    xlab(0(1)2) xtitle("slope change ({&beta}{sub:3})") ///
    ytitle("Root Mean Square Error (RMSE)", size(small)) ///
    by(group_lab , row(1) note("")) ///
    xsize(10)

[Image: Graph.png]
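
The same idea can be taken one step further by writing both the beta and the subscript as Unicode characters, so the label contains no SMCL tags at all. The following is only a minimal sketch, assuming Stata 14 or later (where strings are Unicode and uchar() is available); demo_lab and demo_lab_2line are hypothetical variable names used for illustration, not part of the code above.

```stata
// Sketch only: assumes Stata 14+ (Unicode strings, uchar() available).
// demo_lab and demo_lab_2line are hypothetical names for illustration.
clear
set obs 3
gen beta_2_true = _n - 1

// uchar(946)  = U+03B2, Greek small letter beta
// uchar(8322) = U+2082, subscript two
gen demo_lab = uchar(946) + uchar(8322) + "=" + string(beta_2_true)
list beta_2_true demo_lab

// Same text with the nonstandard CR+LF prefix used in this thread,
// intended for the second line of a by() label
gen demo_lab_2line = char(13) + char(10) + demo_lab
```

As with the code above, the char(13) + char(10) prefix is an undocumented workaround, so whether the line break renders as intended may depend on the Stata version and platform.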

Last edited by Andrew Musau; 29 Sep 2022, 05:26.


Simon Turner

Join Date: Oct 2018
Posts: 29

29 Sep 2022, 18:44

That's great! Thanks Andrew!
I'd have been happy using any standard technique (but couldn't find one) so had to try going non-standard =)
Crossing fingers for more graphics options/overhauls in the next version of Stata, but very grateful for all the folks here who often know sneaky workarounds!
Cheers,
Simon.