
  • Stacked barchart displaying proportion of each category over different years

    Hello everyone, I have managed to create the stacked bar chart I want, but to do so I had to calculate the proportions manually, which is inefficient if I want to make many of these. I was therefore hoping you could recommend a way to achieve the same with code alone.

    An example of my dataset consists of two variables as below:

    Code:
    clear
    input long year procedure
    2010 1
    2010 1
    2010 2
    2010 3
    2011 1
    2011 2
    2011 2
    2011 3
    2011 3
    2012 1
    2012 2
    2012 3
    2012 3
    end
    What I did was first use the code:
    Code:
    tab procedure year, col
    to manually obtain the proportions of each procedure within each year, and then re-input the following:
    Code:
    clear
    input long procedure year prop
    1 2010 .5
    1 2011 .2
    1 2012 .25
    2 2010 .25
    2 2011 .4
    2 2012 0.25
    3 2010 .25
    3 2011 .4
    3 2012 .5
    end
    
    graph bar (asis) prop, over(procedure) over(year) asyvars stack
    which outputs my desired graph.
    [Image: graph 1.png]


    Is there any way to achieve the same directly from the initial dataset? Also, if a proportion is zero for a certain year, the stack option reports an error. Is there any way around that?


    Many thanks in advance

  • #2
    One way to get what you want is with catplot from SSC.

    A way to get a better graph (in my view) is with tabplot from the Stata Journal.

    Here's a script you can run after installing those commands.

    Code:
    clear
    input long year procedure
    2010 1
    2010 1
    2010 2
    2010 3
    2011 1
    2011 2
    2011 2
    2011 3
    2011 3
    2012 1
    2012 2
    2012 3
    2012 3
    end
    
    catplot procedure year, percent(year) stack asyvars recast(bar)
    
    tabplot procedure year, percent(year) showval separate(procedure) subtitle(% in year)
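
    For readers who would rather stay with official commands, here is a minimal sketch (an alternative that is not taken from the posts in this thread) that rebuilds the hand-typed proportions in #1 from the raw data. It relies on contract, which stores cell frequencies in _freq by default; the variable names total and prop are chosen here for illustration. The zero option keeps year-procedure combinations that never occur, with frequency 0, so every year carries the same set of categories. Whether that sidesteps the stack error mentioned in #1 has not been confirmed.

```stata
clear
input long year procedure
2010 1
2010 1
2010 2
2010 3
2011 1
2011 2
2011 2
2011 3
2011 3
2012 1
2012 2
2012 3
2012 3
end

* One observation per year-procedure cell; frequencies go into _freq.
* zero keeps cells that do not occur in the data, with _freq == 0.
contract year procedure, zero

* Proportion within year: cell frequency divided by the year total.
* total and prop are illustrative names, not from the thread.
egen total = total(_freq), by(year)
gen double prop = _freq / total

list year procedure _freq prop, sepby(year)

* Same graph command as in #1, now fed by computed proportions.
graph bar (asis) prop, over(procedure) over(year) asyvars stack
```

    Note that contract replaces the data in memory, so wrap the block in preserve and restore if the original observations are needed afterwards.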


    • #3
      Nick Cox, thanks very much for this. Based on your suggestion, I found that the following code gave me what I needed:

      Code:
      graph bar (count), over(procedure) over(year) percent stack asyvars blabel(bar, position(base) format(%9.1f))
      Do you know of any way to make the height of the bars reflect the absolute frequencies instead? That way, the height of each bar would vary with the number of procedures in each year, and the share of each bar occupied by a particular colour would reflect the proportion. Example of what I mean below:


      • #4
        How about omitting percent?


        • #5
          Nick Cox, I can't believe I didn't try that earlier! Thanks! Sorry for the many questions. Would you know how to superimpose, as bar labels, the actual percentage or fraction of each "procedure" rather than the absolute count that appears with the code below?

          Code:
           
          graph bar (count), over(procedure) over(year) stack asyvars blabel(bar, position(base) format(%9.1f))


          • #6
            You can draw the graph and then edit the numbers in the Graph Editor. tabplot also allows that provided you calculate the percents first.


            • #7
              Nick Cox thanks so much for your help!


              • #8
                graph bar and graph hbar seem to rule out showing one set of numbers as graph elements and another as text, beyond what is shown at

                help blabel_option

                On the face of it, even if your own rationale is clear, there is a major risk of confusing your readers.

                That said, tabplot lets you show the contents of any named variable. Here is an example showing some technique.


                Code:
                clear
                input long year procedure
                2010 1
                2010 1
                2010 2
                2010 3
                2011 1
                2011 2
                2011 2
                2011 3
                2011 3
                2012 1
                2012 2
                2012 3
                2012 3
                end
                
                bysort year: gen total = _N 
                bysort procedure year : gen count = _N 
                by procedure year : gen percent = 100 * _N / total  
                gen toshow = strofreal(count) + " (" + strofreal(percent, "%2.1f") + "%)"
                
                set scheme s1color 
                tabplot procedure year,  showval(toshow) separate(procedure) subtitle(counts and % by year) xtitle("")

                [Image: showval.png]


