Bland altman plot with shaded confidence interval range using twoway

Annabelle Coleman

Join Date: May 2023
Posts: 44

Bland altman plot with shaded confidence interval range using twoway

23 Aug 2023, 03:27

Hello Statalist users,

I am trying to draw a bland Altman plot with a shaded region between two defined upper and lower confidence interval lines. I am having trouble adding the shaded region into the graph. the code I am currently using is below (without any shading region code):

Code:

            gen diff=NfLCap_P_0 - NfLCap_S_0
                sum diff, d
                    local diffmean : display %5.3f r(mean)
                    local diffsd : display %5.3f r(sd)
                    local upper = r(mean) + 1.96 * r(sd)
                    local lower = r(mean) - 1.96 * r(sd)
            
            gen mean1=(NfLCap_P_0 + NfLCap_S_0)/2
            
        
        
        twoway (scatter diff mean1 if disease_grp==0 , `scattercircle' mcolor("`INF_grey'")) ///
        (scatter diff mean1 if disease_grp==1 , `scattercircle' mcolor("`INF_Red_Light'")) ///
        (scatter diff mean1 if disease_grp==2 , `scattercircle' mcolor("`INF_Red'")), ///
        yline(0, lc("`INF_Blue'") lpattern(dot) lwidth(medium)) ///
        yline(`diffmean', lc("`INF_Blue_Light'") lwidth(medium)) ///
        yline(`upper', lc("`INF_Blue_Light'") lwidth(medium)) ///
        yline(`lower', lc("`INF_Blue_Light'") lwidth(medium)) ///
        legend(off) aspectratio(1) ///
        xlabel(0(20)80) xscale(range(0 80)) ///
        ylabel(-20(5)20) yscale(range(-20 20)) ///
        text(20 0 "Mean = `diffmean'", place(e)) text(18 0 "SD = `diffsd'", place(e)) ///
        ytitle("Difference, NfLCap_P_0 - NfLCap_S_0") xtitle("Mean, NfLCap_P_0 NfLCap_S_0")
        
        gr_edit .plotregion1.style.editstyle boxstyle(linestyle(color(none))) editcopy

I have attached below what the outputted graph looks like. If anyone could help me find the code to add the shaded region into the defined horizontal confidence interval area that would be very helpful.

I look forward to your responses.

Annabelle

Click image for larger version

Name: Screenshot 2023-08-23 at 10.25.21.png
Views: 1
Size: 299.6 KB
ID: 1724758

Attached Files

Screenshot 2023-08-23 at 10.25.21.png (0, 0 views)

Last edited by Annabelle Coleman; 23 Aug 2023, 03:30.

Tags: None

Nick Cox

Join Date: Mar 2014

Posts: 35504
#2

23 Aug 2023, 03:44

Data example please.
Comment

Annabelle Coleman

Join Date: May 2023
Posts: 44

23 Aug 2023, 04:00

Hello Nick,

I have used the Auto data as an example

Code:

use http://www.stata-press.com/data/r13/auto,clear

//price and wight

gen diff = price - weight
sum diff, d
    local diffmean : display %5.3f r(mean)
    local diffsd : display %5.3f r(sd)
    local upper = r(mean) + 1.96 * r(sd)
    local lower = r(mean) - 1.96 * r(sd)
    
    gen mean1=(price + weight)/2
    
    twoway (scatter diff mean1, mcolor(blue)), ///
        yline(0, lc(blue) lpattern(dash) lwidth(medium)) ///
        yline(`diffmean', lc(black) lwidth(medium)) ///
        yline(`upper', lc(black) lwidth(medium)) ///
        yline(`lower', lc(black) lwidth(medium)) ///
        legend(off) aspectratio(1) ///
        xlabel(2000(2000)10000) xscale(range(2000 10000)) ///
        ylabel(-12000(4000)12000) yscale(range(-12000 12000)) ///
        text(12000 2000 "Mean = `diffmean'", place(e)) text(10000 2000 "SD = `diffsd'", place(e)) ///
        ytitle("Difference, Price - Weight") xtitle("Mean, price weight")
        
        gr_edit .plotregion1.style.editstyle boxstyle(linestyle(color(none))) editcopy

This generates this graph below.

I would like a shaded region between the upper and lower confidence interval lines.

Click image for larger version

Name: Screenshot 2023-08-23 at 10.58.39.png
Views: 1
Size: 278.7 KB
ID: 1724762

Comment

Nick Cox

Join Date: Mar 2014

Posts: 35504
#4

23 Aug 2023, 04:55

Thanks for that example.

To get a shaded area, I think you need two new variables to feed to twoway rarea. And you must lay that down first, as otherwise it occludes many data points. But then that occludes added horizontal lines. A way around that in turn is to plot added horizontal lines using twoway function.

Code:

sysuse auto, clear //price and weight gen diff = price - weight sum diff, d local diffmean : display %5.3f r(mean) local diffsd : display %5.3f r(sd) gen upper = r(mean) + 1.96 * r(sd) gen lower = r(mean) - 1.96 * r(sd) gen mean1=(price + weight)/2 gen X= cond(_n == 1, 1800, 10500) twoway rarea lower upper X, sort color(gs12*0.2) plotregion(margin(l=0 r=0)) /// || function `diffmean' + 0*x, lc(black) lwidth(medium) range(1800 10500) /// || function 0 + 0*x, lc(blue) lpattern(dash) lwidth(medium) range(1800 10500) /// || scatter diff mean1, mcolor(blue) /// legend(off) aspectratio(1) /// xlabel(2000(2000)10000) xscale(range(2000 10000)) /// ylabel(-12000(4000)12000) yscale(range(-12000 12000)) /// text(12000 2000 "Mean = `diffmean'", place(e)) text(10000 2000 "SD = `diffsd'", place(e)) /// ytitle("Difference, Price - Weight") xtitle("Mean, price weight")

The terminology Bland-Altman plot seems to have become a standard in medical statistics. As usual, I presume others introduced that term, and Bland and Altman certainly deserved some credit because they pushed a particular approach clearly and vigorously in several papers.

As usual in such cases, they didn't invent plotting difference versus mean, and didn't claim to. I had a conversation with Doug Altman about this some years back.

It's fun (to some) to try to trace the method back. It's a special case of residual versus fitted, used by Neyman and co-authors in the 1950s and made standard by Anscombe and Tukey by 1963. I know of uses by Bliss in 1967, Oldham in 1968 and Hills in 1974. I have read that Thiele used residual plots in the 19th century, but I have never seen quite what that meant.
Comment
Annabelle Coleman

Join Date: May 2023

Posts: 44
#5

23 Aug 2023, 05:10

Hello Nick,

Thank you for your help with the code - it works well which is amazing. I will try adding horizontal plotted lines to see if they will go over the shaded area.

For me to be able to apply this to data that I have been using, could you explain what the code below is for and where the numbers are from:

Code:

gen X= cond(_n == 1, 1800, 10500)

I tried using the rarea code before and I think I was going wrong with using locals for upper and lower rather than a new generated variable!

Very interesting about the name of the plot, it seems to have stuck as bland-altman in medical research. That must've been a very interesting conversation!
Comment

Annabelle Coleman

Join Date: May 2023
Posts: 44

23 Aug 2023, 07:35

Hello Nick,

I have been working on the intensity of the shaded area for this code and I would like to remove the dark blue lines on the outside outlining the shaded area so the outline is the same as the colour of the shaded area.

The code I am using is

Code:

    
gen diff = price - weight
        sum diff, d
        local diffmean : display %5.3f r(mean)
        local diffsd : display %5.3f r(sd)

        gen upper = r(mean) + 1.96 * r(sd)
        gen lower = r(mean) - 1.96 * r(sd)
   
        gen mean1=(price + weight)/2
        gen X= cond(_n == 1, 1800, 10500)
   
    twoway rarea lower upper X, sort color(blue) fintensity(inten20) plotregion(margin(l=0 r=0)) ///
    || function `diffmean' + 0*x, lc(blue) lwidth(medium) range(1800 10500)     ///
    || function 0 + 0*x, lc(blue) lpattern(dash) lwidth(medium) range(1800 10500) ///
       || scatter diff mean1, mcolor(blue) ///
       legend(off) aspectratio(1) ///
        xlabel(2000(2000)10000) xscale(range(2000 10000)) ///
        ylabel(-12000(4000)12000) yscale(range(-12000 12000)) ///
        text(12000 2000 "Mean = `diffmean'", place(e)) text(10000 2000 "SD = `diffsd'", place(e)) ///
        ytitle("Difference, Price - Weight") xtitle("Mean, price weight")
        
        
    gr_edit .plotregion1.style.editstyle boxstyle(linestyle(color(none))) editcopy

Example of the plot is below:

Click image for larger version

Name: Screenshot 2023-08-23 at 14.13.36.png
Views: 1
Size: 290.7 KB
ID: 1724784

I have been trying to use the lstyle(none) option but it removes the shaded area completely.

Comment

Nick Cox

Join Date: Mar 2014

Posts: 35504
#7

23 Aug 2023, 08:56

#5

For me to be able to apply this to data that I have been using, could you explain what the code below is for and where the numbers are from:
Code:
gen X= cond(_n == 1, 1800, 10500)

You need a horizontal axis variable with twoway rarea for the shaded section. It is necessary (and sufficient) for it to include two distinct values that cover the range of the data, and some more. I would automate this in terms of the minimum and maximum of the variable, and a bit extra on each side.

I tried using the rarea code before and I think I was going wrong with using locals for upper and lower rather than a new generated variable!

twoway rarea needs variable names. An alternative would be to use twoway scatteri with the coordinates of the four corners and recast(area).

Very interesting about the name of the plot, it seems to have stuck as bland-altman in medical research. That must've been a very interesting conversation!

Doug Altman https://en.wikipedia.org/wiki/Doug_Altman was a nice guy. I met him at a Stata meeting in London.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35504
#8

23 Aug 2023, 09:03

#6 Try lwidth(none) to remove the line.
Comment

Announcement

Bland altman plot with shaded confidence interval range using twoway

Comment

Comment

Comment

Comment

Comment

Comment

Comment