It is with many thanks to Kit Baum that I introduce upset_plot, a new command for creating UpSet plots in Stata (version 18.5 or higher).
UpSet plots, first described by Lex et al., provide an alternative to Venn and Euler diagrams for visualizing intersections between multiple sets[1]. Intersections are represented by a binary membership grid, where each column corresponds to a distinct intersection pattern. The frequency of each intersection is displayed in a vertically-aligned bar chart, while a horizontally-aligned bar chart displays the sizes of the individual sets.
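For readers new to the format, the two bar charts correspond to two simple tabulations of the binary set indicators. Below is a minimal sketch with built-in commands on a made-up three-set dataset; the variables a, b, c and the counter n are purely illustrative, and this is not a description of upset_plot's internals.

```stata
* Toy data (hypothetical): three binary set indicators, one row per observation
clear
input byte(a b c)
1 0 0
1 1 0
1 1 0
0 1 1
1 1 1
0 0 1
end

* Set sizes: number of observations belonging to each set
* (what the horizontal bar chart displays)
tabstat a b c, statistics(sum)

* Intersection frequencies: number of observations per membership pattern
* (what the vertical bar chart displays; one grid column per pattern)
contract a b c, freq(n)
gsort -n
list, noobs
```

Each row of the resulting listing is one column of the membership grid, with n giving the height of the bar above it.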
Many of you will be familiar with Tim Morris and Nick Cox's excellent upsetplot command, which offers similar functionality. The main differences between the two are upset_plot's horizontal set-size bar chart and its ability to stack bars over categorical variables. The set-size chart can be suppressed if desired, though users seeking an intersection-only display may be better served by Tim and Nick's upsetplot. A link to the original upsetplot discussion thread on Statalist is given below.
https://www.statalist.org/forums/for...lable-from-ssc
A few examples using real-world data are given below. The help file includes several additional examples demonstrating the available customization options.
Example 1
In our first example, we visualize data from a Venn diagram published by Emmons et al. in the form of an UpSet plot[2]. Emmons' Venn diagram showed the shared microbial genera across bone, soil, and gut samples taken from human remains.
Code:
clear
input byte(sb s hg gbc gbab) float freq
1 1 1 1 1 9
1 1 1 1 0 0
1 1 1 0 1 0
1 1 1 0 0 1
1 1 0 1 1 250
1 1 0 1 0 30
1 1 0 0 1 7
1 1 0 0 0 67
1 0 1 1 1 25
1 0 1 1 0 1
1 0 1 0 1 6
1 0 1 0 0 10
1 0 0 1 1 344
1 0 0 1 0 69
1 0 0 0 1 46
1 0 0 0 0 357
0 1 1 1 1 0
0 1 1 1 0 0
0 1 1 0 1 0
0 1 1 0 0 0
0 1 0 1 1 22
0 1 0 1 0 20
0 1 0 0 1 3
0 1 0 0 0 164
0 0 1 1 1 5
0 0 1 1 0 0
0 0 1 0 1 2
0 0 1 0 0 107
0 0 0 1 1 107
0 0 0 1 0 89
0 0 0 0 1 28
0 0 0 0 0 66
end
label var sb "Surface bone"
label var s "Soil"
label var hg "Human gut"
label var gbc "Grave bone (C)"
label var gbab "Grave bone (A/B)"
upset_plot sb s hg gbc gbab [fw=freq], set(gap(0.35))
Example 2
We do the same with a Venn diagram from Jaun et al.'s 2025 paper on clinical remission in severe asthma[3].
The diagram in question, Figure 2, shows two Venn diagrams describing which combinations of four remission criteria were met among patients who did and, separately, did not receive biologic therapies.
Here, we can make use of upset_plot's over() option to present a single, combined UpSet plot that stacks bars over biologic receipt.
Code:
clear
input byte(group a e f o) float freq
1 1 1 1 1 6
1 1 1 1 0 1
1 1 1 0 1 9
1 1 1 0 0 0
1 1 0 1 1 6
1 1 0 1 0 1
1 1 0 0 1 9
1 1 0 0 0 3
1 0 1 1 1 6
1 0 1 1 0 0
1 0 1 0 1 1
1 0 1 0 0 2
1 0 0 1 1 16
1 0 0 1 0 1
1 0 0 0 1 21
2 1 1 1 1 72
2 1 1 1 0 6
2 1 1 0 1 64
2 1 1 0 0 10
2 1 0 1 1 26
2 1 0 1 0 2
2 1 0 0 1 15
2 1 0 0 0 28
2 0 1 1 1 11
2 0 1 1 0 4
2 0 1 0 1 14
2 0 1 0 0 4
2 0 0 1 1 10
2 0 0 1 0 6
2 0 0 0 1 3
end
label define group 1 "Biologic naïve" 2 "Biologic treated"
label values group group
label var a "ACT controlled"
label var e "No exacerbations"
label var f "FEV ≥80% predicted"
label var o "No OCS"
upset_plot a e f o [fweight=freq], over(group) set(gap(0.4)) legend(pos(6) rows(1))
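The heights of stacked bars like these can be cross-checked with a weighted two-way tabulation of membership pattern by group. Below is a minimal sketch on made-up aggregated data; the variables x, y, pattern and all counts are hypothetical, and this is not necessarily how upset_plot computes its bars.

```stata
* Hypothetical aggregated data: one row per group-by-pattern cell
clear
input byte(group x y) float freq
1 1 1 4
1 1 0 2
1 0 1 3
2 1 1 7
2 1 0 1
2 0 1 5
end

* Build a string key for each membership pattern, e.g. "11", "10", "01"
egen pattern = concat(x y)

* Cross-tabulate pattern by group, weighting by the cell counts:
* each row corresponds to one stacked bar, each column to one segment
tabulate pattern group [fweight=freq]
```

The row totals give the overall bar heights, and the cells within a row give the segment heights for each level of the over() variable.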
Example 3
Perhaps the most famous and enduring example of a Venn diagram is that by D'Hont et al., a six-way behemoth with some... aptly shaped groups[4]. A similarly whimsical diagram was published by Neale et al. a couple of years later[5].
Here, we use D'Hont's data. With a few scaling tweaks to accommodate the considerable number of distinct intersection patterns (referred to as sequence clusters in the context of the study), we can summarize the information in a far more readable, if less charming, UpSet plot.
Code:
clear
input byte(p m b s o a) float freq
1 1 1 1 1 1 7674
1 1 1 1 1 0 685
1 1 1 1 0 1 113
1 1 1 1 0 0 24
1 1 1 0 1 1 80
1 1 1 0 1 0 18
1 1 1 0 0 1 7
1 1 1 0 0 0 12
1 1 0 1 1 1 149
1 1 0 1 1 0 62
1 1 0 1 0 1 23
1 1 0 1 0 0 19
1 1 0 0 1 1 28
1 1 0 0 1 0 35
1 1 0 0 0 1 206
1 1 0 0 0 0 467
1 0 1 1 1 1 258
1 0 1 1 1 0 190
1 0 1 1 0 1 11
1 0 1 1 0 0 23
1 0 1 0 1 1 5
1 0 1 0 1 0 12
1 0 1 0 0 1 3
1 0 1 0 0 0 25
1 0 0 1 1 1 21
1 0 0 1 1 0 42
1 0 0 1 0 1 4
1 0 0 1 0 0 49
1 0 0 0 1 1 6
1 0 0 0 1 0 32
1 0 0 0 0 1 105
1 0 0 0 0 0 769
0 1 1 1 1 1 1458
0 1 1 1 1 0 368
0 1 1 1 0 1 54
0 1 1 1 0 0 13
0 1 1 0 1 1 29
0 1 1 0 1 0 28
0 1 1 0 0 1 7
0 1 1 0 0 0 9
0 1 0 1 1 1 71
0 1 0 1 1 0 64
0 1 0 1 0 1 21
0 1 0 1 0 0 49
0 1 0 0 1 1 13
0 1 0 0 1 0 29
0 1 0 0 0 1 155
0 1 0 0 0 0 759
0 0 1 1 1 1 206
0 0 1 1 1 0 2809
0 0 1 1 0 1 14
0 0 1 1 0 0 402
0 0 1 0 1 1 18
0 0 1 0 1 0 547
0 0 1 0 0 1 10
0 0 1 0 0 0 387
0 0 0 1 1 1 40
0 0 0 1 1 0 1151
0 0 0 1 0 1 9
0 0 0 1 0 0 827
0 0 0 0 1 1 6
0 0 0 0 1 0 1246
0 0 0 0 0 1 1187
0 0 0 0 0 0 0
end
label var p "Phoenix"
label var m "Musa"
label var b "Brachypodium"
label var s "Sorghum"
label var o "Oryza"
label var a "Arabidopsis"
local intopts ylabel(,tlength(0.01) labgap(0.005)) ytitle(,titlegap(0.06))
local setopts gap(0.08) ysize(0.2)
upset_plot p m b s o a [fw=freq], xsize(*2) intopts(`intopts') setopts(`setopts')
You can add elements to the graph via the addplot() option, although the scope for this is limited to immediate commands (e.g., twoway scatteri) and those that do not rely on the underlying data (e.g., twoway function). Behind the scenes, upset_plot significantly rescales the data, so any added elements will likewise need to be rescaled; the details are briefly discussed in the help file.
An example is given below in the spirit of D'Hont's Venn diagram.
Code:
local intopts ylabel(,tlength(0.01) labgap(0.005)) ytitle(,titlegap(0.06))
local setopts gap(0.08) ysize(0.2)
local addopts range(0.22 0.82) lwidth(vvthick)
local addplot ///
(function y = 0.5 * (2*x - 0.9)^2 + 1.6, col(gold) `addopts') ///
(function y = 0.5 * (2.6*x - 1.3)^2 + 1.5, col(gold*.8) `addopts') ///
(function y = 0.7 * (2.4*x - 1.2)^2 + 1.4, col(gold*.6) `addopts') ///
(scatteri 1.77 0.22, msymbol(O) col(brown) msize(vlarge))
upset_plot p m b s o a [fw=freq], xsize(*2) intopts(`intopts') setopts(`setopts') addplot(`addplot')
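To illustrate why these two plot types qualify for addplot(), the sketch below draws them on their own with no data in memory. The coordinates are arbitrary and sit in this standalone graph's native scale, not in upset_plot's rescaled one, so treat the numbers as placeholders.

```stata
* Neither plot type reads variables from the dataset in memory
clear

* function evaluates an expression in x over a given range;
* scatteri takes its coordinates as immediate arguments (y first, then x)
twoway (function y = 0.5 * (2*x - 1)^2 + 1.5, range(0 1) lwidth(thick)) ///
       (scatteri 1.75 0.25, msymbol(O) msize(large)), legend(off)
```

Because both layers are fully specified by their arguments, they can be overlaid on another graph without touching its data.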
References
1. Lex, A., Gehlenborg, N., Strobelt, H. et al. (2014). UpSet: Visualization of Intersecting Sets. IEEE Transactions on Visualization and Computer Graphics, 20(12), 1983–1992. https://doi.org/10.1109/TVCG.2014.2346248
2. Emmons, AL., Mundorff, AZ., Hoeland, KM. et al. (2022). Postmortem Skeletal Microbial Community Composition and Function in Buried Human Remains. mSystems, 7(2). https://doi.org/10.1128/msystems.00041-22
3. Jaun, F., Boesing, M., Lüthi-Corridori, G. et al. (2025). Clinical Remission in Severe Asthma: A Comparative Analysis of Patients with and Without Biologics from the Swiss Severe Asthma Registry. Biomedicines, 13(12), 3074. https://doi.org/10.3390/biomedicines13123074
4. D’Hont, A., Denoeud, F., Aury, JM. et al. (2012). The banana (Musa acuminata) genome and the evolution of monocotyledonous plants. Nature, 488, 213–217. https://doi.org/10.1038/nature11241
5. Neale, DB., Wegrzyn, JL., Stevens, KA. et al. (2014). Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies. Genome Biology, 15(3), R59. https://doi.org/10.1186/gb-2014-15-3-r59
