Plotting categorical variable against mean absolute difference score

Abdullah Algarni

Join Date: Jul 2022

Posts: 66
#1

22 Jun 2023, 17:22

Dear Statalists,

I'm working on a cross-sectional study investigating the accuracy of three different equations for calculating low-density lipoprotein (Friedewald, Martin, and Sampson). As the attached figure shows, I want to plot their mean absolute difference (MAD) score against triglyceride intervals.

My data structure is summarized below.

Any help would be much appreciated.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float tg_cat
1
1
1
1
1
1
1
1
1
1
end

Sincerely regards,
Abdullah Algarni
[email protected]
Nick Cox

Join Date: Mar 2014

Posts: 35780
#2

22 Jun 2023, 23:24

Your data example is a small step in the right direction, but needs to include both variables concerned over a wider range.

Abdullah Algarni

Join Date: Jul 2022
Posts: 66
#3

23 Jun 2023, 03:30

Hi Nick Cox, here is an example of my data:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input double tg100 float fmad
1  .00004161522
1 .000011597684
1  .00001023325
1   .0000122799
1  .00005184847
1 .000036157482
1             0
1   .0000163732
1   .0000654928
1   .0001801052
2 .000036157482
1 .000025924233
2  .00007572605
1 .000035475266
1    4.0933e-06
1  .00007845492
1   .0000327464
1  .00007095053
1 .000023195367
1 .000015008766
1   .0000245598
1  .00003001753
1   .0000163732
1   .0000613995
1   .0000204665
1 .000019102066
1  .00008664151
1  .00005867063
1  5.457733e-06
2  4.775517e-06
1   .0000327464
2  .00003206418
1 .000013644333
2  8.868817e-06
1   .0000163732
1  .00004161522
2  .00010028585
1  .00004161522
1  .00002183093
1   .0000204665
1   .0000286531
3  .00007709048
3  .00005253068
1 .000035475266
1 .000023195367
1    .000040933
1  .00013712555
1 .000019102066
1  .00003001753
2  .00005525955
1  .00006003506
1   .0000163732
1  .00007981935
2  .00006412836
1  .00006685723
1  .00010096806
1    4.0933e-06
1 .000027288666
2  .00015418096
1  6.822167e-06
1  .00007640827
1 .000015008766
1  .00019374953
1 1.3644333e-06
3  .00012075235
1 .000025924233
1 .000072314964
1   .0000204665
1  .00010779023
2  .00010915467
1  6.822167e-06
1  .00005457733
2 .000031381966
1 .000025924233
4   .0003533882
1  9.551033e-06
1   .0000122799
1  6.822167e-06
1   .0000204665
1   .0000204665
1  .00005048403
1  5.457733e-06
1  .00006276393
1  .00003820413
3  .00006753945
1 1.3644333e-06
2  .00006822166
1   .0000613995
1 .000025924233
1 .000013644333
1 .000010915466
1   .0000122799
1 2.7288665e-06
1   .0000122799
2  .00005048403
1 .000010915466
1 .000013644333
1  .00003411083
1 1.3644333e-06
1 .000031381966
end
label values tg100 Hundreds
label def Hundreds 1 "≤100", modify
label def Hundreds 2 "101–200", modify
label def Hundreds 3 "201–300", modify
label def Hundreds 4 "301–400", modify


Nick Cox

Join Date: Mar 2014

Posts: 35780
#4

23 Jun 2023, 06:46

The data are already binned, so what you're asking for is some variant on

Code:

egen mean_fmad = mean(fmad), by(tg100)
twoway connected mean_fmad tg100, xla(1/4, valuelabel)

Statistically, I would say that it would be greatly preferable not to bin at all but to fit some smooth summary of each variable as a function of original measurements of TG.
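
A minimal sketch of the unbinned alternative, assuming the original continuous triglyceride measurement is held in a variable named tg (a hypothetical name; only the binned tg100 appears in this thread). It overlays a local polynomial smooth on the raw points; twoway lowess would be another possible choice of smoother.

```stata
* tg is a hypothetical name for the unbinned triglyceride variable
twoway scatter fmad tg, ms(oh) mc(gs10) ///
    || lpoly fmad tg, ///
    legend(off) ytitle("Absolute difference") xtitle("Triglycerides")
```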
Abdullah Algarni

Join Date: Jul 2022

Posts: 66
#5

24 Jun 2023, 23:13

I agree 100%; we should not categorize continuous variables.

Abdullah Algarni

Join Date: Jul 2022

Posts: 66
#6

28 Jun 2023, 17:40

Thank you, Nick Cox. With your help, I got this beautiful figure.
God bless you.

Code:

twoway (connected mean_fldl tg_cat, sort) ///
       (connected mean_mldl tg_cat, sort) ///
       (connected mean_sldl tg_cat, sort) ///
       (connected mean_dldl tg_cat, sort), ///
       xlabel(#13, valuelabel) legend(rows(1) position(6))

Last edited by Abdullah Algarni; 28 Jun 2023, 17:52.

Nick Cox

Join Date: Mar 2014

Posts: 35780
#7

29 Jun 2023, 00:28

Post the data as a data example, and I will make some suggestions which I think you'll find an improvement.

Abdullah Algarni

Join Date: Jul 2022
Posts: 66
#8

01 Jul 2023, 17:48

Hi Nick Cox, I have just seen your comment. Here is my data example:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(tg_cat mean_fldl mean_mldl mean_sldl mean_dldl)
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
2 111.14561 113.91785 114.14278  118.4116
1 101.61765  99.77162 102.18396 104.37002
2 111.14561 113.91785 114.14278  118.4116
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
2 111.14561 113.91785 114.14278  118.4116
1 101.61765  99.77162 102.18396 104.37002
2 111.14561 113.91785 114.14278  118.4116
1 101.61765  99.77162 102.18396 104.37002
2 111.14561 113.91785 114.14278  118.4116
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
2 111.14561 113.91785 114.14278  118.4116
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
3  113.0848 123.84882 118.88454 124.88075
3  113.0848 123.84882 118.88454 124.88075
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
2 111.14561 113.91785 114.14278  118.4116
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
2 111.14561 113.91785 114.14278  118.4116
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
2 111.14561 113.91785 114.14278  118.4116
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
3  113.0848 123.84882 118.88454 124.88075
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
2 111.14561 113.91785 114.14278  118.4116
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
2 111.14561 113.91785 114.14278  118.4116
1 101.61765  99.77162 102.18396 104.37002
4 105.50047 125.20518 114.42454 121.68421
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
3  113.0848 123.84882 118.88454 124.88075
1 101.61765  99.77162 102.18396 104.37002
2 111.14561 113.91785 114.14278  118.4116
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
2 111.14561 113.91785 114.14278  118.4116
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
1 101.61765  99.77162 102.18396 104.37002
end


Nick Cox

Join Date: Mar 2014

Posts: 35780
#9

02 Jul 2023, 02:31

Thanks for the example so far, but I should have been more explicit. To vary the graph in #6 it is necessary, but also sufficient, to have access to the results of

Code:

collapse mean_fldl mean_mldl mean_sldl mean_dldl, by(tg_cat)
dataex

together with value labels or other explanations of the 13 categories of tg_cat (there are only 12 in #1).
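
Note that collapse replaces the dataset in memory with the collapsed one. A minimal sketch, assuming you want to carry on working with the full data afterwards, wraps the two commands in preserve and restore:

```stata
preserve
* the default statistic is the mean; each mean_* variable is constant
* within tg_cat, so this leaves one row per category
collapse mean_fldl mean_mldl mean_sldl mean_dldl, by(tg_cat)
dataex
restore
```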

Abdullah Algarni

Join Date: Jul 2022
Posts: 66

#10

03 Jul 2023, 01:13

Here is the new example

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(tg_cat mean_fldl mean_mldl mean_sldl mean_dldl)
 1  101.61765  99.77162 102.18396 104.37002
 2  111.14561 113.91785 114.14278  118.4116
 3   113.0848 123.84882 118.88454 124.88075
 4  105.50047 125.20518 114.42454 121.68421
 5   94.55951 127.86067 107.12328  114.4185
 6   82.48265   121.794   99.4325 108.08202
 7  65.323204 111.26353  88.37569  99.20166
 8  64.136086 114.67422  89.32371  99.17525
 9   45.34655  101.2638  78.81207  86.56896
10  36.767742  98.23871  74.96129  88.58064
11   31.92174  97.80869  73.41304  76.56522
12  17.233334  81.04166  67.14167  85.33334
13  22.111765  92.73235   70.3353  90.70588
14  -40.35455  46.41818  45.73636 69.818184
15  -7.178571  75.00714  60.44286  79.64286
16    -45.925   37.9625    45.875    66.125
17      -79.4      19.4      39.1        51
18  -90.86667 13.466666  39.06667        49
19 -35.314285      62.1  51.24286  90.57143
20    -142.56    -32.96     33.54        53
21  -190.3111 -35.38889  76.42223  61.77778
end
label values tg_cat TG
label def TG 1 "<100", modify
label def TG 2 "100–199", modify
label def TG 3 "200–299", modify
label def TG 4 "300–399", modify
label def TG 5 "400–499", modify
label def TG 6 "500–599", modify
label def TG 7 "600–699", modify
label def TG 8 "700–799", modify
label def TG 9 "800–899", modify
label def TG 10 "900–999", modify
label def TG 11 "1000–1099", modify
label def TG 12 "1100–1199", modify
label def TG 13 "1200–1299", modify
label def TG 14 "1300–1399", modify
label def TG 15 "1400–1499", modify
label def TG 16 "1500–1599", modify
label def TG 17 "1600–1699", modify
label def TG 18 "1700–1799", modify
label def TG 19 "1800–1899", modify
label def TG 20 "1900–1999", modify
label def TG 21 "≥2000", modify


Nick Cox

Join Date: Mar 2014
Posts: 35780

#11

03 Jul 2023, 02:22

Thanks for that. Here are my suggestions:

Use direct labelling (Amer:labeling) rather than a legend. Getting the text labels the same colour as each connected line is worth the effort.

21 axis labels on the x axis would be over the top. As you're using a bin width of 100 (in whatever units) a grid of lines is one way to emphasize the binning. It too could be thought over the top.

Better axis titles. Use informative text rather than letting Stata use variable names.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(tg_cat mean_fldl mean_mldl mean_sldl mean_dldl)
 1  101.61765  99.77162 102.18396 104.37002
 2  111.14561 113.91785 114.14278  118.4116
 3   113.0848 123.84882 118.88454 124.88075
 4  105.50047 125.20518 114.42454 121.68421
 5   94.55951 127.86067 107.12328  114.4185
 6   82.48265   121.794   99.4325 108.08202
 7  65.323204 111.26353  88.37569  99.20166
 8  64.136086 114.67422  89.32371  99.17525
 9   45.34655  101.2638  78.81207  86.56896
10  36.767742  98.23871  74.96129  88.58064
11   31.92174  97.80869  73.41304  76.56522
12  17.233334  81.04166  67.14167  85.33334
13  22.111765  92.73235   70.3353  90.70588
14  -40.35455  46.41818  45.73636 69.818184
15  -7.178571  75.00714  60.44286  79.64286
16    -45.925   37.9625    45.875    66.125
17      -79.4      19.4      39.1        51
18  -90.86667 13.466666  39.06667        49
19 -35.314285      62.1  51.24286  90.57143
20    -142.56    -32.96     33.54        53
21  -190.3111 -35.38889  76.42223  61.77778
end
label values tg_cat TG
label def TG 1 "<100", modify
label def TG 2 "100–199", modify
label def TG 3 "200–299", modify
label def TG 4 "300–399", modify
label def TG 5 "400–499", modify
label def TG 6 "500–599", modify
label def TG 7 "600–699", modify
label def TG 8 "700–799", modify
label def TG 9 "800–899", modify
label def TG 10 "900–999", modify
label def TG 11 "1000–1099", modify
label def TG 12 "1100–1199", modify
label def TG 13 "1200–1299", modify
label def TG 14 "1300–1399", modify
label def TG 15 "1400–1499", modify
label def TG 16 "1500–1599", modify
label def TG 17 "1600–1699", modify
label def TG 18 "1700–1799", modify
label def TG 19 "1800–1899", modify
label def TG 20 "1900–1999", modify
label def TG 21 "≥2000", modify

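* store each series' value in the last category (observation 21),
* to position the direct labels at the right-hand end of each line
foreach v in fldl mldl sldl dldl {
    local `v'_last = mean_`v'[21]
}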

twoway connected mean_fldl mean_mldl mean_sldl mean_dldl tg_cat ///
|| scatteri `fldl_last' 21 "fldl", ms(none) mlabsize(medlarge) mlabcolor(stc1) ///
|| scatteri `mldl_last' 21 "mldl", ms(none) mlabsize(medlarge) mlabcolor(stc2) ///
|| scatteri `sldl_last' 21 "sldl", ms(none) mlabsize(medlarge) mlabcolor(stc3) ///
|| scatteri `dldl_last' 21 "dldl", ms(none) mlabsize(medlarge) mlabcolor(stc4) legend(off) xsc(r(1 22)) ///
xla(0.5 "0" 5.5 "500" 10.5 "1000" 15.5 "1500" 20.5 "2000") xli(0.5(1)21.5, lc(gs12) lw(thin) lp(solid)) /// 
ytitle(Mean <add text>) xtitle(TG <add text>) yla(-200(50)150, nogrid)
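
The code above hard-codes observation 21 as the last category. A hedged variation, assuming the collapsed data (one row per category) are in memory, picks up the last observation with _N instead, so the label positions follow the data if the number of categories changes. The local name xlast is introduced here for illustration only.

```stata
* assumes one observation per tg_cat, as in the data example above
sort tg_cat
local xlast = tg_cat[_N]              // code of the last category (21 here)
foreach v in fldl mldl sldl dldl {
    local `v'_last = mean_`v'[_N]     // value in the last observation
}
display "label for fldl goes at x = `xlast', y = `fldl_last'"
```

The scatteri calls could then use `xlast' in place of the literal 21.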

[Figure: algarni.png]


Abdullah Algarni

Join Date: Jul 2022

Posts: 66
#12

03 Jul 2023, 12:42

Amazing 😍, thank you so much Nick Cox
