  • Plotting categorical variable against mean absolute difference score

    Dear Statalists,

    I'm working on a cross-sectional study investigating the accuracy of three different equations for calculating low-density lipoprotein (Friedewald, Martin, and Sampson). As the attached figure shows, I want to plot their mean absolute difference (MAD) scores against triglyceride intervals.

    My data structure is summarized below.

    Any help would be much appreciated.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float tg_cat
    1
    1
    1
    1
    1
    1
    1
    1
    1
    1
    end
    [Attached figure: Screenshot 2023-06-23 at 02.20.45.png]
    Sincerely regards,
    Abdullah Algarni

  • #2
    Your data example is a small step in the right direction, but needs to include both variables concerned over a wider range.


    • #3
      Hi Nick Cox, here is an example of my data:

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input double tg100 float fmad
      1  .00004161522
      1 .000011597684
      1  .00001023325
      1   .0000122799
      1  .00005184847
      1 .000036157482
      1             0
      1   .0000163732
      1   .0000654928
      1   .0001801052
      2 .000036157482
      1 .000025924233
      2  .00007572605
      1 .000035475266
      1    4.0933e-06
      1  .00007845492
      1   .0000327464
      1  .00007095053
      1 .000023195367
      1 .000015008766
      1   .0000245598
      1  .00003001753
      1   .0000163732
      1   .0000613995
      1   .0000204665
      1 .000019102066
      1  .00008664151
      1  .00005867063
      1  5.457733e-06
      2  4.775517e-06
      1   .0000327464
      2  .00003206418
      1 .000013644333
      2  8.868817e-06
      1   .0000163732
      1  .00004161522
      2  .00010028585
      1  .00004161522
      1  .00002183093
      1   .0000204665
      1   .0000286531
      3  .00007709048
      3  .00005253068
      1 .000035475266
      1 .000023195367
      1    .000040933
      1  .00013712555
      1 .000019102066
      1  .00003001753
      2  .00005525955
      1  .00006003506
      1   .0000163732
      1  .00007981935
      2  .00006412836
      1  .00006685723
      1  .00010096806
      1    4.0933e-06
      1 .000027288666
      2  .00015418096
      1  6.822167e-06
      1  .00007640827
      1 .000015008766
      1  .00019374953
      1 1.3644333e-06
      3  .00012075235
      1 .000025924233
      1 .000072314964
      1   .0000204665
      1  .00010779023
      2  .00010915467
      1  6.822167e-06
      1  .00005457733
      2 .000031381966
      1 .000025924233
      4   .0003533882
      1  9.551033e-06
      1   .0000122799
      1  6.822167e-06
      1   .0000204665
      1   .0000204665
      1  .00005048403
      1  5.457733e-06
      1  .00006276393
      1  .00003820413
      3  .00006753945
      1 1.3644333e-06
      2  .00006822166
      1   .0000613995
      1 .000025924233
      1 .000013644333
      1 .000010915466
      1   .0000122799
      1 2.7288665e-06
      1   .0000122799
      2  .00005048403
      1 .000010915466
      1 .000013644333
      1  .00003411083
      1 1.3644333e-06
      1 .000031381966
      end
      label values tg100 Hundreds
      label def Hundreds 1 "≤100", modify
      label def Hundreds 2 "101–200", modify
      label def Hundreds 3 "201–300", modify
      label def Hundreds 4 "301–400", modify
      Sincerely regards,
      Abdullah Algarni


      • #4
        The data are already binned, so what you're asking for is some variant on

        Code:
        egen mean_fmad = mean(fmad), by(tg100)
        twoway connected mean_fmad tg100, xla(1/4, valuelabel)
        Statistically, I would say that it would be greatly preferable not to bin at all but to fit some smooth summary of each variable as a function of original measurements of TG.
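
        A minimal sketch of that unbinned alternative, assuming a variable tg (a hypothetical name; it does not appear in this thread) holds the original triglyceride measurements alongside fmad. A lowess curve is only one possible choice of smoother, and the axis titles are placeholders:

        ```stata
        * Assumption: tg is the original, unbinned TG measurement (hypothetical name)
        * Raw points in grey, with a lowess smooth of fmad on tg drawn on top
        twoway (scatter fmad tg, ms(oh) mc(gs10)) ///
            (lowess fmad tg, lw(medthick)), ///
            ytitle("Absolute difference") xtitle("TG") legend(off)
        ```

        The same two layers could be repeated for each equation's difference variable to compare the methods on one graph.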


        • #5
          I agree 100%: we should not categorize continuous variables.
          Sincerely regards,
          Abdullah Algarni


          • #6
            Thank you, Nick Cox. With your help, I got this beautiful figure.
            God bless you.

            Code:
            twoway (connected mean_fldl tg_cat, sort) ///
                (connected mean_mldl tg_cat, sort) ///
                (connected mean_sldl tg_cat, sort) ///
                (connected mean_dldl tg_cat, sort), ///
                xlabel(#13, valuelabel) legend(rows(1) position(6))
            [Attached figure: Screenshot 2023-06-29 at 02.36.27.png]

            Last edited by Abdullah Algarni; 28 Jun 2023, 17:52.
            Sincerely regards,
            Abdullah Algarni


            • #7
              Post the data as a data example, and I will make some suggestions which I think you'll find an improvement.


              • #8
                Hi Nick Cox, I just saw your comment. Here is my data example:

                Code:
                * Example generated by -dataex-. For more info, type help dataex
                clear
                input float(tg_cat mean_fldl mean_mldl mean_sldl mean_dldl)
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                2 111.14561 113.91785 114.14278  118.4116
                1 101.61765  99.77162 102.18396 104.37002
                2 111.14561 113.91785 114.14278  118.4116
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                2 111.14561 113.91785 114.14278  118.4116
                1 101.61765  99.77162 102.18396 104.37002
                2 111.14561 113.91785 114.14278  118.4116
                1 101.61765  99.77162 102.18396 104.37002
                2 111.14561 113.91785 114.14278  118.4116
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                2 111.14561 113.91785 114.14278  118.4116
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                3  113.0848 123.84882 118.88454 124.88075
                3  113.0848 123.84882 118.88454 124.88075
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                2 111.14561 113.91785 114.14278  118.4116
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                2 111.14561 113.91785 114.14278  118.4116
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                2 111.14561 113.91785 114.14278  118.4116
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                3  113.0848 123.84882 118.88454 124.88075
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                2 111.14561 113.91785 114.14278  118.4116
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                2 111.14561 113.91785 114.14278  118.4116
                1 101.61765  99.77162 102.18396 104.37002
                4 105.50047 125.20518 114.42454 121.68421
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                3  113.0848 123.84882 118.88454 124.88075
                1 101.61765  99.77162 102.18396 104.37002
                2 111.14561 113.91785 114.14278  118.4116
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                2 111.14561 113.91785 114.14278  118.4116
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                1 101.61765  99.77162 102.18396 104.37002
                end
                Sincerely regards,
                Abdullah Algarni


                • #9
                  Thanks for the example so far, but I should have been more explicit. To vary the graph in #6 it is necessary, but also sufficient, to have access to the results of


                  Code:
                  collapse mean_fldl mean_mldl mean_sldl mean_dldl, by(tg_cat) 
                  dataex
                  together with value labels or other explanations of the 13 categories of tg_cat (there are only 12 in #1).
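
                  Because collapse replaces the dataset in memory, one way to produce that example without losing the full data is to wrap it in preserve and restore. This is only a sketch, assuming the variables shown in #8 are in memory; the wrapping is an addition and not part of the request above:

                  ```stata
                  * Keep a copy of the full dataset so it can be brought back afterwards
                  preserve

                  * One row per TG category; collapse takes means by default, and each
                  * mean_* variable is already constant within tg_cat
                  collapse mean_fldl mean_mldl mean_sldl mean_dldl, by(tg_cat)

                  * Print the collapsed data in a form that can be pasted into a post
                  dataex

                  * Return to the original, uncollapsed data
                  restore
                  ```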


                  • #10
                    Here is the new example
                    Code:
                    * Example generated by -dataex-. For more info, type help dataex
                    clear
                    input float(tg_cat mean_fldl mean_mldl mean_sldl mean_dldl)
                     1  101.61765  99.77162 102.18396 104.37002
                     2  111.14561 113.91785 114.14278  118.4116
                     3   113.0848 123.84882 118.88454 124.88075
                     4  105.50047 125.20518 114.42454 121.68421
                     5   94.55951 127.86067 107.12328  114.4185
                     6   82.48265   121.794   99.4325 108.08202
                     7  65.323204 111.26353  88.37569  99.20166
                     8  64.136086 114.67422  89.32371  99.17525
                     9   45.34655  101.2638  78.81207  86.56896
                    10  36.767742  98.23871  74.96129  88.58064
                    11   31.92174  97.80869  73.41304  76.56522
                    12  17.233334  81.04166  67.14167  85.33334
                    13  22.111765  92.73235   70.3353  90.70588
                    14  -40.35455  46.41818  45.73636 69.818184
                    15  -7.178571  75.00714  60.44286  79.64286
                    16    -45.925   37.9625    45.875    66.125
                    17      -79.4      19.4      39.1        51
                    18  -90.86667 13.466666  39.06667        49
                    19 -35.314285      62.1  51.24286  90.57143
                    20    -142.56    -32.96     33.54        53
                    21  -190.3111 -35.38889  76.42223  61.77778
                    end
                    label values tg_cat TG
                    label def TG 1 "<100", modify
                    label def TG 2 "100–199", modify
                    label def TG 3 "200–299", modify
                    label def TG 4 "300–399", modify
                    label def TG 5 "400–499", modify
                    label def TG 6 "500–599", modify
                    label def TG 7 "600–699", modify
                    label def TG 8 "700–799", modify
                    label def TG 9 "800–899", modify
                    label def TG 10 "900–999", modify
                    label def TG 11 "1000–1099", modify
                    label def TG 12 "1100–1199", modify
                    label def TG 13 "1200–1299", modify
                    label def TG 14 "1300–1399", modify
                    label def TG 15 "1400–1499", modify
                    label def TG 16 "1500–1599", modify
                    label def TG 17 "1600–1699", modify
                    label def TG 18 "1700–1799", modify
                    label def TG 19 "1800–1899", modify
                    label def TG 20 "1900–1999", modify
                    label def TG 21 "≥2000", modify
                    Sincerely regards,
                    Abdullah Algarni


                    • #11
                      Thanks for that. Here are my suggestions:

                      Use direct labelling (Amer:labeling) rather than a legend. Getting the text labels the same colour as each connected line is worth the effort.

                      21 axis labels on the x axis would be over the top. As you're using a bin width of 100 (in whatever units) a grid of lines is one way to emphasize the binning. It too could be thought over the top.

                      Better axis titles. Use informative text rather than letting Stata use variable names.

                      Code:
                      * Example generated by -dataex-. For more info, type help dataex
                      clear
                      input float(tg_cat mean_fldl mean_mldl mean_sldl mean_dldl)
                       1  101.61765  99.77162 102.18396 104.37002
                       2  111.14561 113.91785 114.14278  118.4116
                       3   113.0848 123.84882 118.88454 124.88075
                       4  105.50047 125.20518 114.42454 121.68421
                       5   94.55951 127.86067 107.12328  114.4185
                       6   82.48265   121.794   99.4325 108.08202
                       7  65.323204 111.26353  88.37569  99.20166
                       8  64.136086 114.67422  89.32371  99.17525
                       9   45.34655  101.2638  78.81207  86.56896
                      10  36.767742  98.23871  74.96129  88.58064
                      11   31.92174  97.80869  73.41304  76.56522
                      12  17.233334  81.04166  67.14167  85.33334
                      13  22.111765  92.73235   70.3353  90.70588
                      14  -40.35455  46.41818  45.73636 69.818184
                      15  -7.178571  75.00714  60.44286  79.64286
                      16    -45.925   37.9625    45.875    66.125
                      17      -79.4      19.4      39.1        51
                      18  -90.86667 13.466666  39.06667        49
                      19 -35.314285      62.1  51.24286  90.57143
                      20    -142.56    -32.96     33.54        53
                      21  -190.3111 -35.38889  76.42223  61.77778
                      end
                      label values tg_cat TG
                      label def TG 1 "<100", modify
                      label def TG 2 "100–199", modify
                      label def TG 3 "200–299", modify
                      label def TG 4 "300–399", modify
                      label def TG 5 "400–499", modify
                      label def TG 6 "500–599", modify
                      label def TG 7 "600–699", modify
                      label def TG 8 "700–799", modify
                      label def TG 9 "800–899", modify
                      label def TG 10 "900–999", modify
                      label def TG 11 "1000–1099", modify
                      label def TG 12 "1100–1199", modify
                      label def TG 13 "1200–1299", modify
                      label def TG 14 "1300–1399", modify
                      label def TG 15 "1400–1499", modify
                      label def TG 16 "1500–1599", modify
                      label def TG 17 "1600–1699", modify
                      label def TG 18 "1700–1799", modify
                      label def TG 19 "1800–1899", modify
                      label def TG 20 "1900–1999", modify
                      label def TG 21 "≥2000", modify
                      
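                      * Store each series' value in observation 21 (the last category),
                      * which gives the y position of its text label
                      foreach v in fldl mldl sldl dldl {
                          local `v'_last = mean_`v'[21]
                      }

                      * Connected lines, each with a direct label in the matching colour
                      * placed at the right-hand end, instead of a legend
                      twoway connected mean_fldl mean_mldl mean_sldl mean_dldl tg_cat ///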
                      || scatteri `fldl_last' 21 "fldl", ms(none) mlabsize(medlarge) mlabcolor(stc1) ///
                      || scatteri `mldl_last' 21 "mldl", ms(none) mlabsize(medlarge) mlabcolor(stc2) ///
                      || scatteri `sldl_last' 21 "sldl", ms(none) mlabsize(medlarge) mlabcolor(stc3) ///
                      || scatteri `dldl_last' 21 "dldl", ms(none) mlabsize(medlarge) mlabcolor(stc4) legend(off) xsc(r(1 22)) ///
                      xla(0.5 "0" 5.5 "500" 10.5 "1000" 15.5 "1500" 20.5 "2000") xli(0.5(1)21.5, lc(gs12) lw(thin) lp(solid)) /// 
                      ytitle(Mean <add text>) xtitle(TG <add text>) yla(-200(50)150, nogrid)
                      [Attached figure: algarni.png]
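
                      If more series are added later, the four scatteri calls could be built in a loop rather than typed out. The following is a sketch of that variation, not the code that produced the figure: it assumes the collapsed data from #10 are in memory and sorted by tg_cat, and that the colours stc1 to stc4 match the order of the connected lines, as they do in the code above. The local macro names labels, i, x and y are introduced here for illustration.

                      ```stata
                      * Sketch only: assemble one scatteri label call per series
                      local labels
                      local i = 0
                      * x position of the labels: the last category of tg_cat
                      local x = tg_cat[_N]
                      foreach v in fldl mldl sldl dldl {
                          local ++i
                          * y position: that series' value in the last observation
                          local y = mean_`v'[_N]
                          local labels `labels' || scatteri `y' `x' "`v'", ms(none) mlabsize(medlarge) mlabcolor(stc`i')
                      }

                      * Options after the macro attach to the last scatteri call, as in the code above
                      twoway connected mean_fldl mean_mldl mean_sldl mean_dldl tg_cat `labels' ///
                          legend(off) xsc(r(1 22)) ytitle(Mean <add text>) xtitle(TG <add text>)
                      ```

                      The axis label and grid line options from the code above can be appended to the last line unchanged.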


                      • #12
                        Amazing 😍, thank you so much Nick Cox
                        Sincerely regards,
                        Abdullah Algarni