  • Multiple bars in a single chart

    Hi all,
    I have a very basic question that I cannot find an answer to. I have tried all sorts of options, from stacked bar charts to catplot, but can't seem to figure out how to make a chart that's very simple in Excel! The chart I am trying to create is something like the one attached here. Basically, it's survey data. Each bar is a different variable rating an organization's performance on that item, from highly unsatisfactory (-2) to highly satisfactory (+2). Each row holds one respondent's ratings. I can do one bar at a time, but I would like all bars in the same chart for compact presentation. Is there any way to do this in Stata, or should I export the data to Excel? Each slice of a bar would be the frequency of one response category within a variable (e.g., 10 highly unsatisfactory, 5 unsatisfactory, ... on var1, etc.).

    Any help will be most appreciated. Thanks.

    The data is as follows:
    ID var1 var2 var3 var4 var5 var6
    1 -2 1 0 -1 2 1
    2 2 1 0 1 2 1
    3 2 1 0 -1 -2 1
    4 -2 1 0 -1 -2 -1

    [Attached image: Picture1.png]

  • #2
    There is an ugly solution of producing one bar chart for each variable and then using graph combine.
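A minimal sketch of the graph combine route, for comparison only. It assumes the example data from #1 with the two `2-` entries read as -2; the graph names `g1` to `g6` and `combined` are made up for illustration. Each panel shows only the categories that occur in its own variable, so the panels are not directly comparable, which is part of why this route is ugly.

```stata
clear
input ID var1 var2 var3 var4 var5 var6
1 -2 1 0 -1  2  1
2  2 1 0  1  2  1
3  2 1 0 -1 -2  1
4 -2 1 0 -1 -2 -1
end

* one small bar chart per variable, held in memory without being drawn
local graphs
forval j = 1/6 {
    graph hbar (count) ID, over(var`j') title("var`j'") ///
        name(g`j', replace) nodraw
    local graphs `graphs' g`j'
}

* put the six panels into one figure
graph combine `graphs', cols(3) name(combined, replace)
```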

    A better solution is just to reshape long so that you then have just two variables. The graphs here look a little silly based on just 4 observations, but they could be a start. You would need to do a little work copying your variable labels to value labels of an aggregated variable, as otherwise they disappear during the reshape long. The code shows some technique for that.

    The catplot (SSC) code gives a stacked chart broadly similar to what you asked for. It seems to me that your colours should be ordered. There is no doubt similar code with graph hbar (percent) or some such, but I am less fluent in that corner of graph hbar than with catplot.

    My bias is that stacked plots are oversold. The fact that the percents add to 100% is emphasised strongly by the design, but we know that. The price of stacking is a legend and difficulty reading off very small quantities (including zero) when they occur. I usually prefer what might be called a twoway bar chart, as drawn by tabplot in the code, although many other names are used too. There is more at https://www.statalist.org/forums/for...updated-on-ssc and http://www.stata-journal.com/article...article=gr0066

    The offset of 0.17 in showval() is just the result of trial and error to get better alignment. For other datasets different adjustments might be needed.

    Code:
    clear
    input ID var1 var2 var3 var4 var5 var6
    1 -2 1 0 -1 2 1
    2 2 1 0 1 2 1
    3 2 1 0 -1 -2 1
    4 -2 1 0 -1 -2 -1
    end
    
    * invent silly variable labels: none in example
    
    tokenize "frogs toads newts cats dogs horses"
    
    forval j = 1/6 {
    label var var`j' "``j''"
    }
    
    describe
    
    * save variable labels
    
    forval j = 1/6 {
    local lbl`j' : var label var`j'
    }
    
    preserve
    
    reshape long var, i(ID) j(which)
    
    * saved variable labels become value labels
    
    forval j = 1/6 {
    label def which `j' "`lbl`j''", modify
    }
    
    label val which which
    
    * catplot must be installed first: ssc install catplot
    
    catplot var which , asyvars stack percent(which) legend(row(1)) ///
    bar(1, color(red*0.6)) bar(2, color(red*0.2)) bar(3, color(blue*0.2)) ///
    bar(4, color(blue*0.6)) bar(5, color(blue)) name(G1, replace)
    
    * tabplot must be installed first, from the Stata Journal (gr0066) or SSC
    
    tabplot which var, horizontal separate(var) percent(which) subtitle(% for each variable) ///
    bar1(lc(red) fcolor(red*0.6)) bar2(lc(red) fcolor(red*0.2)) bar3(lc(blue) fcolor(blue*0.2)) ///
    bar4(lc(blue) fcolor(blue*0.6)) bar5(color(blue)) showval(offset(0.17) format(%2.0f)) xtitle(rating) ytitle("") name(G2, replace)
    [Attached image: ratings_G1.png]

    [Attached image: ratings_G2.png]
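The post mentions that graph hbar can probably produce a similar stacked chart. A minimal sketch under that assumption, using only official commands: it counts observations within each rating and rescales each bar to 100% with `asyvars stack percentages`. The data are the example from #1 with the two `2-` entries read as -2, the variable labels step is skipped (so bars are labelled 1 to 6), and the graph name `G3` is made up.

```stata
clear
input ID var1 var2 var3 var4 var5 var6
1 -2 1 0 -1  2  1
2  2 1 0  1  2  1
3  2 1 0 -1 -2  1
4 -2 1 0 -1 -2 -1
end

reshape long var, i(ID) j(which)

* first over() becomes the stacked slices, second over() the separate bars
graph hbar (count) ID, over(var) over(which) asyvars stack percentages ///
    bar(1, color(red*0.6)) bar(2, color(red*0.2)) bar(3, color(blue*0.2)) ///
    bar(4, color(blue*0.6)) bar(5, color(blue)) ///
    legend(rows(1)) ytitle("percent") name(G3, replace)
```

The colour assignments copy the catplot call above and rely on all five ratings occurring somewhere in the data.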

    Last edited by Nick Cox; 25 Nov 2020, 17:24.


    • #3
      Dear Nick,
      Thanks a million for your ultrafast and perfect response! I had already installed catplot and tabplot, but didn't realize I needed to reshape long. That's where I was stuck. Thank you!!! A very happy Thanksgiving to you and yours!
      Best,
      Pete


      • #4
        [Quotes #2 in full: text, code, and both graph attachments.]
        Last edited by Mary Atieno; 22 Sep 2022, 02:36.


        • #5
          Please note our longstanding request not to attach .dta files but to use dataex to give example data. https://www.statalist.org/forums/help#stata

          Otherwise the same advice applies. reshape long first.

          Then consider using tabplot as above.

          For this kind of data you have the extra option of a floating or sliding bar chart as implemented by floatplot from SSC. See e.g. https://www.statalist.org/forums/for...kert-variables
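For the dataex request at the start of this post, a minimal sketch of what posting example data involves. It uses the toy data from #1 (with the two `2-` entries read as -2) as a stand-in for a real dataset; the output of dataex is what gets pasted into the forum post.

```stata
clear
input ID var1 var2 var3 var4 var5 var6
1 -2 1 0 -1  2  1
2  2 1 0  1  2  1
3  2 1 0 -1 -2  1
4 -2 1 0 -1 -2 -1
end

* on older Stata installations dataex may need installing first:
* ssc install dataex

* prints an input ... end listing that others can copy and run
dataex ID var1-var6
```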


          • #6
            Hello Nick, how do I rescale the label sizes so that they are neat and legible, without having to open the Graph Editor every time I re-run the code? See the attached figure.


            • #7
              [Quotes #5 in full.]
              Thank you Nick, my apologies for this. The code actually worked once I adjusted some variables, with the small exception of the question I posted about rescaling the label sizes for the legend (the "key").


              • #8
                Again, the first link in #5 explains our longstanding request not to add .gph attachments, but to post .png instead.

                I can't reconcile your sample graph and your posted data. Your data show 5 possible outcomes. Your graph shows 6. Your data show 5 outcomes occurring for question a, but your graph shows only 3 so far as I can see. And so on. Some but not all of the differences may stem from whether Don't knows are included.

                No matter. You just can't show a horizontal legend with 6 lengthy names readably unless you put the legend on 2 or 3 rows, and even then you lose space that you need for showing the data. Or you could put the legend vertically on the right of the graph. This is standard detail with the legend() option: see its help.
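A minimal sketch of those two legend layouts, set in code so that the Graph Editor is not needed on each run. It uses the toy data from #1 (with `2-` read as -2) and official graph hbar rather than the poster's dataset; the three middle value labels and the graph names `Lright` and `Lrows` are invented for illustration.

```stata
clear
input ID var1 var2 var3 var4 var5 var6
1 -2 1 0 -1  2  1
2  2 1 0  1  2  1
3  2 1 0 -1 -2  1
4 -2 1 0 -1 -2 -1
end

reshape long var, i(ID) j(which)

* lengthy labels of the kind that crowd a one-row legend (wording assumed)
label define rating -2 "highly unsatisfactory" -1 "unsatisfactory" ///
    0 "neutral" 1 "satisfactory" 2 "highly satisfactory"
label values var rating

* legend as one column to the right of the plot, with smaller text and keys
graph hbar (count) ID, over(var) over(which) asyvars stack percentages ///
    legend(cols(1) position(3) size(small) symxsize(small)) ///
    name(Lright, replace)

* legend below the plot, wrapped onto 3 rows
graph hbar (count) ID, over(var) over(which) asyvars stack percentages ///
    legend(rows(3) size(small)) name(Lrows, replace)
```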

                A more radical solution is to use some scheme such as ? -- - 0 + ++ to indicate answers.
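One way to try that symbol scheme is through value labels, so the legend needs almost no room. This is only a sketch on the toy data from #1 (with `2-` read as -2); the label name `symbols` and graph name `Gsym` are made up, and there is no "?" entry because the toy data have no Don't knows.

```stata
clear
input ID var1 var2 var3 var4 var5 var6
1 -2 1 0 -1  2  1
2  2 1 0  1  2  1
3  2 1 0 -1 -2  1
4 -2 1 0 -1 -2 -1
end

reshape long var, i(ID) j(which)

* terse symbols in place of lengthy category names
label define symbols -2 "--" -1 "-" 0 "0" 1 "+" 2 "++"
label values var symbols

graph hbar (count) ID, over(var) over(which) asyvars stack percentages ///
    legend(rows(1)) name(Gsym, replace)
```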

                Equally, showing answers for 8 questions as in your sample graph is a squeeze.

                As before, the standard stacked design is in my view a poor choice for this kind of data. Here is a token floatplot for your sample data. Adding text to the legend would increase the squeeze.

                Code:
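                use sample-dataset.dta, clear
                * one observation per respondent and question; the suffix of each Ccami_* name goes into Question
                reshape long Ccami_, i(id2) j(Question) string
                * keep only the first character of the suffix as the question identifier
                replace Question = substr(Question, 1, 1)
                set scheme s1color
                * floating bars for questions a to h only
                floatplot Ccami_ if Question <= "h", over(Question) centre(3) fcolors(red red*0.5 gs12 blue*0.5 blue) vertical ytitle(Answer) legend(symxsize(small))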
                [Attached image: anotherfloatplot.png]


                • #9
                  [Quotes #8 in full.]
                  I am still using Stata version 15, which does not support floatplot. Are there any alternatives?


                  • #10
                    That’s a misunderstanding. floatplot is community-contributed and so not bundled with any version of Stata on installation. You must install it using

                    Code:
                    ssc install floatplot


                    It should work fine with Stata 15.

                    Correction: The code specifies version 17. At this time, I can’t recall why I did that. So, edit the code to say version 15. If there really is a reason why it needs 17, you will find out quickly….
                    Last edited by Nick Cox; 22 Sep 2022, 14:48.


                    • #11
                      [Quotes #8 in full.]
                      Last edited by Mary Atieno; 22 Sep 2022, 14:53.


                      • #12
                        [Quotes #10 in full.]
                        I used this code:

                        Code:
                        replace Question = substr(Question, 1, 1)
                        set scheme s1color
                        floatplot Ccami_ if Q <= "h", over(Q) centre(3) fcolors(red red*0.5 gs12 blue*0.5 blue) vertical ytitle(Answer) legend(symxsize(small))

                        and got the error message:

                        this is version 15.1 of Stata; it cannot run version 17.0 programs
                        You can purchase the latest version of Stata by visiting http://www.stata.com.


                        • #13
                          I already answered #12 in the Correction you cite.

                          The code specifies version 17. At this time, I can’t recall why I did that. So, edit the code to say version 15. If there really is a reason why it needs 17, you will find out quickly

                          So, use a text editor to change an early statement in floatplot.ado from


                          Code:
                          version 17
                          to

                          Code:
                          version 15
                          and then save the changed file. Then type
                          Code:
                          discard
                          to flush its command code from memory. Now try again.
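A minimal sketch of how to locate the file to edit, assuming floatplot has already been installed with ssc install floatplot. `doedit` opens Stata's Do-file Editor and may not be available in console versions of Stata; any plain text editor pointed at the path shown will do as well.

```stata
* show where the installed floatplot.ado lives and its header comment lines
which floatplot

* findfile leaves the full path in r(fn); open that file for editing
findfile floatplot.ado
doedit "`r(fn)'"

* after changing "version 17" to "version 15" and saving the file:
discard
```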
