
  • Stacked bar charts for multiple variables with the same categories

    Hello,

    I have two variables with the same categories (a 5-point Likert scale). I'd like to create a stacked horizontal bar chart that shows the distribution of responses across the 5 categories for both variables, with the bars for the two variables placed one above the other.

    I have already created separate variables for each category of the answer options, and I am able to create the bar chart for 1 variable with the following code:

    Code:
           graph hbar (sum) s_dis_force dis_force neutr_force agr_force s_agr_force, ///
            bar(1, color(red)) bar(2, color(red*0.5)) ///
            bar(3, color(gs8)) bar(4, color(green*0.5)) ///
            bar(5, color(green)) ///
            legend(label(1 "Strongly disagree") label(2 "Disagree") ///
            label(3 "Neutral") label(4 "Agree") label(5 "Strongly agree") ///
            size(small) order(1 2 3 4 5) rows(1) position(6)) ///
            stack percent ytitle("%", orientation(horizontal))
    This leads to the following graph:

    [Image: bar.png]


    What I need is something like the following (different example, but same idea -- 2 variables with the same categories and 1 legend + variable labels):

    [Image: bar2.png]



    The variables of the categories for the second variable are named:
    Code:
    s_dis_important dis_important neutr_important s_agr_important agr_important
    Thanks in advance!

  • #2
    You could start with
    Code:
    ssc install catplot
    
    sysuse auto, clear
    
    catplot rep78, over(foreign) asyvars stack percent(foreign)
    BUT do not use red and green together (colo[u]r blindness).

    To get the same colours, you may need to reshape your dataset.

    For another approach see floatplot from SSC.
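
As a rough sketch of that alternative on the same auto data: this assumes floatplot has been installed from SSC and that its over() and center() options behave as in the longer example later in this thread. Centring on category 3 of rep78 is purely an illustrative choice.

```stata
ssc install floatplot

sysuse auto, clear

* floating bars of rep78 for each level of foreign, centred on category 3
floatplot rep78, over(foreign) center(3)
```

The help file for floatplot remains the place to check the options before relying on this.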



    • #3
      Dear Nick Cox,

      Thanks for your reply. My 2 variables are ref_important and ref_force, and both have the same 5 values of a 5-point Likert scale (it's survey data, and for both statements respondents had to indicate to what extent they agree).

      The issue is that I do not want to plot one variable over categories of the other (this is what happens with your code -- see the image below). I want to present the distribution of responses into the 5 categories for both variables separately, and stacked (basically like the second image in my first post -- Atlanticism and Europeanism in that example are also 2 separate variables with the same 5 response categories). Given the identical categories, only 1 legend is needed.

      Any idea how to approach that?

      This is the chart I get when using your code:
      [Image: wrongbar.png]



      • #4
        As I already mentioned, a reshape is a good idea here. If you had looked at the help for floatplot, you would have found discussion and detailed code on that point. That applies to catplot too.

        Whether it can be done with your existing data layout, I didn't try to work out, as (1) I know that this way works and (2) in principle it seems unlikely.

        You don't give a data example, so I will invent one.


        Code:
        clear 
        set obs 100
        
        gen ref_important = cond(_n <= 10, 1, cond(_n <= 25, 2, cond(_n <= 45, 3, cond(_n <= 70, 4, 5)))) 
        gen ref_force = cond(_n <= 10, 5, cond(_n <= 25, 4, cond(_n <= 45, 3, cond(_n <= 70, 2, 1))))
        
        label def ref 1 "Strongly disagree" 2 Disagree 3 Neutral 4 Agree 5 "Strongly agree"
        
        label val ref* ref 
        
        gen id = _n 
        
        reshape long ref_, i(id) j(Which) string 
        label def which 1 important 2 force 
        encode Which, gen(which) label(which)
        
        catplot ref_ which, percent(which) stack asyvars ///
        bar(1, lcolor(red) fcolor(red*0.6)) bar(2, lcolor(red) fcolor(red*0.2)) ///
        bar(3, fcolor(white) lcolor(black)) /// 
        bar(5, lcolor(blue) fcolor(blue*0.6)) bar(4, lcolor(blue) fcolor(blue*0.2)) name(G1, replace)
        
        floatplot ref_, over(which) center(3) fcolors(red*0.6 red*0.2 white blue*0.2 blue*0.6) ///
        lcolors(red red black blue blue) vertical ytitle(better text here) yline(0, lc(bg)) xtitle("") name(G2, replace)
        [Image: ref_G1.png]

        [Image: ref_G2.png]
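
The percentages drawn in the two graphs can be checked against a cross-tabulation. The sketch below rebuilds the invented data from the code above (writing ref_force as 6 - ref_important, which gives the same values) and asks for column percentages, that is, percentages within each of the two items, as percent(which) does.

```stata
clear
set obs 100

* same invented data as above
gen ref_important = cond(_n <= 10, 1, cond(_n <= 25, 2, cond(_n <= 45, 3, cond(_n <= 70, 4, 5))))
gen ref_force = 6 - ref_important

gen id = _n
reshape long ref_, i(id) j(Which) string

* percentages within each item (columns), not over the whole table
tabulate ref_ Which, column nofreq
```

With these invented data the "important" column should read 10, 15, 20, 25 and 30 percent for categories 1 to 5, and the "force" column the same figures in reverse order.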


        Many presentation details are your choice, but I strongly recommend using a graded series of colours.
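
For data held as in #1, with one 0/1 indicator per category and item, a reshape is one way to get to the same kind of graph with graph hbar. What follows is only a sketch under stated assumptions: the indicator names are taken from #1, the values are invented, and the stub names cat1 to cat5 and the variable which are hypothetical names introduced for illustration. The colours are the graded red-to-blue series from the code above.

```stata
* invented data in the layout of #1: one 0/1 indicator per category and item
clear
set obs 100
gen id = _n
gen force = cond(_n <= 10, 1, cond(_n <= 25, 2, cond(_n <= 45, 3, cond(_n <= 70, 4, 5))))
gen important = 6 - force

local k = 0
foreach c in s_dis dis neutr agr s_agr {
    local ++k
    gen `c'_force = force == `k'
    gen `c'_important = important == `k'
}
drop force important

* rename to a common stub plus item name, so that reshape long can stack the items
local k = 0
foreach c in s_dis dis neutr agr s_agr {
    local ++k
    rename `c'_force cat`k'force
    rename `c'_important cat`k'important
}

reshape long cat1 cat2 cat3 cat4 cat5, i(id) j(which) string

* one stacked bar per item; percent gives shares within each bar
graph hbar (sum) cat1 cat2 cat3 cat4 cat5, over(which) stack percent ///
    bar(1, lcolor(red) fcolor(red*0.6)) bar(2, lcolor(red) fcolor(red*0.2)) ///
    bar(3, lcolor(black) fcolor(white)) ///
    bar(4, lcolor(blue) fcolor(blue*0.2)) bar(5, lcolor(blue) fcolor(blue*0.6)) ///
    legend(order(1 "Strongly disagree" 2 "Disagree" 3 "Neutral" 4 "Agree" 5 "Strongly agree") ///
    rows(1) position(6) size(small)) ytitle("%")
```

With real data only the rename loop, the reshape and the graph command would be needed. The approach relies on each respondent having an identifier such as id.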

        As said, red and green together are a bad idea and wouldn't get past competent reviewers. Here are some references to use of colour in graphics, and there are many more.

        Berinato, S. 2016. Good Charts: The HBR Guide to Making Smarter, More Persuasive Data Visualizations. Boston, MA: Harvard Business Review Press.

        Borland, D. and E. Taylor. 2007. Rainbow color map (still) considered harmful. IEEE Computer Graphics and Applications 27(2): 14--17.
        https://www.computer.org/csdl/magazi...14/13rRUxYrbOE

        Brewer, C.A. 2016. Designing Better Maps: A Guide for GIS Users. Redlands, CA: Esri Press.

        Crameri, F., G.E. Shephard, and P.J. Heron. 2020. The misuse of colour in science communication. Nature Communications 11: 5444.
        https://doi.org/10.1038/s41467-020-19160-7

        Few, S. 2009. Now You See It: Simple Visualization Techniques for Quantitative Analysis. Oakland, CA: Analytics Press.

        Few, S. 2012. Show Me the Numbers: Designing Tables and Graphs to Enlighten. Burlingame, CA: Analytics Press.

        Hastie, T.J., R.J. Tibshirani and J.H. Friedman. 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer.

        Hawkins, E. 2015. Scrap rainbow colour scales. Nature 519: 291. https://www.nature.com/articles/519291d

        Knaflic, C.N. 2015. Storytelling with Data: A Data Visualization Guide for Business Professionals. Hoboken, NJ: John Wiley.

        Light, A. and P.J. Bartlein. 2004. The end of the rainbow? Color schemes for improved data graphics. Eos 85(40): 385 and 391.
        https://agupubs.onlinelibrary.wiley....9/2004EO400002

        Okabe, M. and K. Ito. 2008. Color Universal Design (CUD): How to make figures and presentations that are friendly to colorblind people.
        http://jfly.uni-koeln.de/color/

        Wilke, C.O. 2019. Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures. Sebastopol, CA: O'Reilly.

        Wong, B. 2010. Color coding. Nature Methods 7: 573. https://www.nature.com/articles/nmeth0810-573.pdf

        Wong, B. 2011. Color blindness. Nature Methods 8: 441. https://www.nature.com/articles/nmeth.1618.pdf Correction: 2023. Nature Methods 20: 1266.




