
  • COVID-19 symptoms graphic


    I was asked off-forum by someone in the medical field whether I knew how to create the graphic below.


    [Image: gr1.png]





    The following is a naive approach using twoway. With some effort, much of the code can be generalized if the intention is to produce many such graphs. It is not difficult to read the data off the graph. The number of patients is a continuous variable, and one will often use an indicator for the presence of a symptom. Here are the reconstructed data and the code.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(patid patients myalgia fatigue sob cough fever)
     1 60 0 1 1 1 1
     2 52 1 1 1 1 1
     3 39 0 0 1 1 1
     4 21 0 0 0 1 1
     5 14 0 1 0 1 1
     6 14 1 0 1 1 1
     7 13 0 0 1 0 1
     8 12 0 1 1 0 1
     9 10 1 1 0 1 1
    10  8 0 0 1 0 0
    11  8 0 1 1 1 0
    12  8 0 0 0 0 1
    13  7 0 0 1 1 0
    14  7 0 1 0 0 1
    15  7 1 1 1 0 1
    16  6 0 1 0 0 0
    17  6 0 1 1 0 0
    18  6 1 1 1 1 0
    19  5 1 0 1 1 0
    20  4 1 0 0 1 1
    21  3 1 1 0 0 0
    22  3 0 1 0 1 0
    23  3 1 0 1 0 1
    24  2 1 0 1 0 0
    25  2 0 0 0 1 0
    26  1 1 1 1 0 0
    27  1 1 0 0 1 0
    28  1 1 0 0 0 1
    29  1 1 1 0 0 1
    end
    
    
    local i 1
    foreach var of varlist myalgia - fever{
        gen N`var'= sum(`var'*patients)
        replace N`var'= (N`var'[_N])
        replace `var'= cond(`var'>0, -`i'+0.5, .)
        local ++i
    }
    forval var= 1/`=`i'-1'{
        gen m`var'=-`var'+0.5
    }
    egen min= rowmin(myalgia-fever)
    egen max= rowmax(myalgia-fever)
    tab min, gen(min)
    gen patients2= patients/5
    
    
    tw (bar patients2 patid, barw(0.7) bcolor(navy))  ///
    (scatter patients2 patid, mc(none) mlab(patients) mlabc(navy) mlabs(vsmall) mlabp(12) ) ///
    (scatter m1-m5 patid, mcolor(gray%20 gray%20 gray%20 gray%20 gray%20) ///
    msize(medium medium medium medium medium) graphregion(color(white))) ///
    (scatter myalgia-fever patid, mcolor(black black black black black) leg(off) ///
    msize(medium medium medium medium medium)   xlab("")  ///
    xtitle("") yscale(lstyle(none)) xscale(lstyle(none)) ylab("" noticks)) ///
    (dropline max patid if min1, base(-4.5) mcolor(none) lcolor(black) lwidth(thick)) ///
    (dropline max patid if min2, base(-3.5) mcolor(none) lcolor(black) lwidth(thick)) ///
    (dropline max patid if min3, base(-2.5) mcolor(none) lcolor(black) lwidth(thick)) ///
    (dropline max patid if min4, base(-1.5) mcolor(none) lcolor(black) lwidth(thick) ///
    text(-0.5 -3.5 "Myalgia", size(vsmall)) text(-1.5 -3.5 "Fatigue/ Malaise", size(vsmall)) ///
    text(-2.5 -3.5 "Shortness of breath", size(vsmall)) ///
    text(-3.5 -3.5 "Cough", size(vsmall)) text(-4.5 -3.5 "Fever", size(vsmall))) ///
    (scatteri -0.5 `=(-Nmyalgia[1]/30)-8' -0.5 -8 , recast(line) lw(vvthick) lc(navy)) ///
    (scatteri -1.5 `=(-Nfatigue[1]/30)-8' -1.5 -8 , recast(line) lw(vvthick) lc(navy)) ///
    (scatteri -2.5 `=(-Nsob[1]/30)-8' -2.5 -8 , recast(line) lw(vvthick) lc(navy)) ///
    (scatteri -3.5 `=(-Ncough[1]/30)-8' -3.5 -8 , recast(line) lw(vvthick) lc(navy)) ///
    (scatteri -4.5  `=(-Nfever[1]/30)-8' -4.5 -8 , recast(line) lw(vvthick) lc(navy)) ///
    (scatteri -0.15 0 14 0, recast(line) lw(medium) lc(black)) ///
    (scatteri -0.05 0 -0.05 30, recast(line) lw(medium) lc(black)) ///
    (scatteri -5 `=(-Nfever[1]/30)-8.5' -5 -7.5, recast(line) lw(medium) lc(black) ///
    text(0 -0.5 "0-", size(small))  text(4 -0.8 "20-", size(small)) ///
    text(8 -0.8 "40-", size(small))  text(12 -0.8 "60-", size(small)) ///
    text(7 -7 "Number of Patients", orient(vert) size(small)) ///
    text(-5.45 -7.9 "0", size(vsmall)) text(-5.1 -8 "-", orient(vert) size(vsmall)) ///
    text(-5.1 `=(((-Nmyalgia[1]/30)/Nmyalgia[1])*100)-8' "-", orient(vert) size(vsmall)) ///
    text(-5.45 `=(((-Nmyalgia[1]/30)/Nmyalgia[1])*100)-7.9' "100", size(vsmall)) ///
    text(-5.1 `=(((-Nmyalgia[1]/30)/Nmyalgia[1])*200)-8' "-", orient(vert) size(vsmall)) ///
    text(-5.45 `=(((-Nmyalgia[1]/30)/Nmyalgia[1])*200)-7.9' "200", size(vsmall)) ///
    text(-5.9 `=(((-Nmyalgia[1]/30)/Nmyalgia[1])*125)-7.9' "Number of Patients", size(vsmall)))
    Res.:
    [Image: Graph.png]
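
    For anyone starting from patient-level data rather than reading counts off a graph, a dataset in the layout used above (one row per symptom combination, plus a frequency) can probably be built with contract. This is only a sketch under assumptions: the six patients below are made up for illustration, and the variable names simply mirror those in the code above.

    ```stata
    * Hypothetical patient-level data: one row per patient, 0/1 symptom indicators
    clear
    input byte(myalgia fatigue sob cough fever)
    0 1 1 1 1
    0 1 1 1 1
    1 1 1 1 1
    0 0 0 1 1
    0 0 0 1 1
    0 0 0 1 1
    end

    * One row per distinct symptom combination, with its frequency
    contract myalgia fatigue sob cough fever, freq(patients)

    * Order combinations from most to least frequent and number them
    gsort -patients
    gen patid = _n
    order patid patients
    list, noobs
    ```

    The result has the same structure as the dataex listing, so the graph code should then apply without further reshaping.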





  • #2
    Wow, I am again surprised to see how flexible Stata graphs can be. Great work and much to learn from it!
    Best wishes

    (Stata 16.1 MP)



    • #3
      Just to congratulate Andrew Musau on his work and to note that the literature refers to this as an UpSet plot; here are a couple of citations:
      Lex, A, Gehlenborg, N, et al. (2014), "UpSet: visualization of intersecting sets", IEEE Trans Vis Comput Graph, 20(12): 1983-1992
      Ballarini, NM, et al. (2020), "A critical review of graphics for subgroup analyses in clinical trials," Pharmaceutical Statistics, 19: 541-560



      • #4
        Thanks Rich for the interesting references.
        Kind regards,
        Carlo
        (StataNow 18.5)



        • #5
          Dear Andrew Musau,
          Thanks for sharing the code to create this chart.
          I would like to know if you have the code to plot the bars and dots on the y-axis and the prevalence on the x-axis. Kind regards, Fabíola



          • #6
            It's mostly just reversing the order of the variables in the scatterplots and flipping the bars using the -horiz- option of twoway bar. That introduces issues with vertical text, but here is a start.


            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input float(patid patients myalgia fatigue sob cough fever)
             1 60 0 1 1 1 1
             2 52 1 1 1 1 1
             3 39 0 0 1 1 1
             4 21 0 0 0 1 1
             5 14 0 1 0 1 1
             6 14 1 0 1 1 1
             7 13 0 0 1 0 1
             8 12 0 1 1 0 1
             9 10 1 1 0 1 1
            10  8 0 0 1 0 0
            11  8 0 1 1 1 0
            12  8 0 0 0 0 1
            13  7 0 0 1 1 0
            14  7 0 1 0 0 1
            15  7 1 1 1 0 1
            16  6 0 1 0 0 0
            17  6 0 1 1 0 0
            18  6 1 1 1 1 0
            19  5 1 0 1 1 0
            20  4 1 0 0 1 1
            21  3 1 1 0 0 0
            22  3 0 1 0 1 0
            23  3 1 0 1 0 1
            24  2 1 0 1 0 0
            25  2 0 0 0 1 0
            26  1 1 1 1 0 0
            27  1 1 0 0 1 0
            28  1 1 0 0 0 1
            29  1 1 1 0 0 1
            end
            
            local i 1
            foreach var of varlist myalgia - fever{
                gen N`var'= sum(`var'*patients)
                replace N`var'= (N`var'[_N])
                replace `var'= cond(`var'>0, -`i'+0.5, .)
                local ++i
            }
            forval var= 1/`=`i'-1'{
                gen m`var'=-`var'+0.5
            }
            egen min= rowmin(myalgia-fever)
            egen max= rowmax(myalgia-fever)
            tab min, gen(min)
            gen patients2= patients/5
            
            qui sum patients
            local xend= 1.15*`=r(max)/5'
            local yend= `=_N'+1
            
            tw (bar patients2 patid, barw(0.7) bcolor(navy) horiz) ///
            (scatter patid patients2, mc(none) mlab(patients) ///
            mlabc(navy) mlabs(vsmall) mlabp(3) ysc(r(. `=`yend'+10')off) xsc(off) ///
            msize(medium medium medium medium medium) graphregion(color(white))) ///
            (scatter patid myalgia , mcolor(black) leg(off)) ///
            (scatter patid fatigue , mcolor(black) leg(off)) ///
            (scatter patid sob , mcolor(black) leg(off)) ///
            (scatter patid cough , mcolor(black) leg(off)) ///
            (scatter patid fever , mcolor(black) leg(off)) ///
            (dropline max patid if min1, horiz base(-4.5) mcolor(none) lcolor(black) lwidth(thick)) ///
            (dropline max patid if min2, horiz base(-3.5) mcolor(none) lcolor(black) lwidth(thick)) ///
            (dropline max patid if min3, horiz base(-2.5) mcolor(none) lcolor(black) lwidth(thick)) ///
            (dropline max patid if min4, horiz base(-1.5) mcolor(none) lcolor(black) lwidth(thick)) ///
            (scatteri 0 -0.1 0 `xend', recast(line) lw(medium) lc(black)) ///
            (scatteri 0 0 `yend' 0, recast(line) lw(medium) lc(black) ///
             text(-1.25 `=`xend'/2' "Number of Patients", orient(horiz) size(small)) ///
             text(`=`yend'+5.5' -0.5 "Myalgia", orient(vert) size(small)) ///
             text(`=`yend'+5.5' -1.5 "Fatigue/ Malaise", orient(vert) size(small)) ///
             text(`=`yend'+5.5' -2.5 "Shortness of breath", orient(vert) size(small)) ///
             text(`=`yend'+5.5' -3.5 "Cough", orient(vert) size(small)) ///
             text(`=`yend'+5.5' -4.5 "Fever", orient(vert) size(small)))
            Res.:
            [Image: Graph.png]
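
            The core of the flip can be seen in isolation in a stripped-down sketch. The three-row dataset and the variable names id and count are made up for illustration. The point is only that with -horizontal-, twoway bar draws its first variable along the x axis, so an overlaid scatter needs its variables in the opposite order.

            ```stata
            * Hypothetical toy data: three combinations and their counts
            clear
            input byte id float count
            1 60
            2 52
            3 39
            end

            * bar: count runs along x, id gives the vertical position
            * scatter: y variable first (id), then x variable (count)
            twoway (bar count id, horizontal barwidth(0.7) bcolor(navy))            ///
                   (scatter id count, msymbol(none) mlabel(count) mlabposition(3)), ///
                   yscale(reverse) ylabel(1/3, angle(horizontal) noticks)           ///
                   xscale(range(0 65)) ytitle("") xtitle("Number of patients")      ///
                   legend(off)
            ```

            The xscale(range()) value is a guess at how much room the largest label needs and would have to be adjusted for other data.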



            • #7
              Tim Morris and I have been working on various new commands in this territory. Here are a couple of results from the one closest to this thread, called upsetplot, applied to Andrew Musau's interesting example. The command call shown is a second attempt: the first showed that we needed a little more space to show 60 as a text label.

              Naturally the ideal is that one line of code gets you most of the way. There are as usual many options, fairly rich text output, and scope to save the calculated results as a new dataset.

              Code:
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input float(patid patients myalgia fatigue sob cough fever)
               1 60 0 1 1 1 1
               2 52 1 1 1 1 1
               3 39 0 0 1 1 1
               4 21 0 0 0 1 1
               5 14 0 1 0 1 1
               6 14 1 0 1 1 1
               7 13 0 0 1 0 1
               8 12 0 1 1 0 1
               9 10 1 1 0 1 1
              10  8 0 0 1 0 0
              11  8 0 1 1 1 0
              12  8 0 0 0 0 1
              13  7 0 0 1 1 0
              14  7 0 1 0 0 1
              15  7 1 1 1 0 1
              16  6 0 1 0 0 0
              17  6 0 1 1 0 0
              18  6 1 1 1 1 0
              19  5 1 0 1 1 0
              20  4 1 0 0 1 1
              21  3 1 1 0 0 0
              22  3 0 1 0 1 0
              23  3 1 0 1 0 1
              24  2 1 0 1 0 0
              25  2 0 0 0 1 0
              26  1 1 1 1 0 0
              27  1 1 0 0 1 0
              28  1 1 0 0 0 1
              29  1 1 1 0 0 1
              end
              
              set scheme s1color 
              
              label var myalgia "Myalgia"
              label var fatigue "Fatigue/ Malaise"
              label var sob "Shortness of breath"
              label var cough "Cough"
              label var fever "Fever"
              
              upsetplot myalgia-fever [fw=patients], baropts(fcolor(blue*0.3) lcolor(blue)) ysc(r(. 63  )) name(UP1, replace)
              
              upsetplot myalgia-fever [fw=patients], baropts(fcolor(blue*0.3) lcolor(blue)) ysc(r(. 63  )) varlabels name(UP2, replace)
              [Image: UP1.png]
              [Image: UP2.png]
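
              The role of the frequency weights can be checked with official commands alone. The following is a sketch, not part of upsetplot, and it uses only the first five rows of the example data to stay short. With the indicators still coded 0/1, weighting by patients gives the number of patients reporting each symptom, which is what the N* variables in the first post hold.

              ```stata
              * First five rows of the example data (a subset, for brevity)
              clear
              input float(patid patients myalgia fatigue sob cough fever)
              1 60 0 1 1 1 1
              2 52 1 1 1 1 1
              3 39 0 0 1 1 1
              4 21 0 0 0 1 1
              5 14 0 1 0 1 1
              end

              * Patients per symptom: weighted sums of the 0/1 indicators
              tabstat myalgia-fever [fw=patients], statistics(sum) columns(statistics)

              * The same totals stored as variables; egen total() does in one
              * step what sum() followed by [_N] does in two
              foreach var of varlist myalgia-fever {
                  egen N`var' = total(`var' * patients)
              }
              list Nmyalgia-Nfever in 1
              ```

              Both approaches must be run before the indicators are recoded to plotting positions, as the first post's loop does.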



              • #8
                Dear @Andrew Musau,

                Thank you very much for helping me with the code.

                Best regards,

                Fabíola



                • #9
                  Dear @Nick Cox,

                  Thank you very much for your response. I would like to know if the command is already available, as I didn't find it. I'm using Stata 17.

                  Best regards,

                  Fabíola



                  • #10
                    Thanks for your interest but the code is not yet public.



                    • #11
                      Many users must be waiting to see Upset and others like alluvial, bump, lollipop charts natively in Stata!



                      • #12
                        If by native you mean what is usually called official, namely whatever is distributed by StataCorp, that is a question for the company.

                        upsetplot is I think likely to be distributed real soon now via SSC. It is just a wrapper for twoway, so programmable by any user.

                        Alluvial: see Asjad Naqvi alluvial on SSC and GitHub.

                        Bump: see parplot on SSC.

                        Lollipop is straightforward with minor pre-processing, if this fits the bill.


                        Code:
                        sysuse auto, clear
                        gsort -foreign -mpg
                        gen order = _n
                        * labmask is from the Stata Journal (gr0034); it can also be installed with: ssc install labutil
                        labmask order, values(make)
                        twoway dropline mpg order in 1/20 , horizontal ysc(reverse) yla(1/20, valuelabel noticks ang(h)) ytitle("") base(0) scheme(s1color)
                        [Image: lollipop.png]



                        • #13
                          See also https://www.stata-journal.com/articl...article=gr0034 from 2008 on what are here called lollipop charts and various alternatives.

