
  • How to draw a bar chart with bars representing row percents in Stata?

    I have two categorical variables x (values: 0, 1) and y (values: 0-7), and I have also generated an identifier with gen id = _n. I can get the row percentages with the following command.

    tab x y, row

    My goal is to get a graph similar to the following graph with row percentages represented by the bars within each category of x. I used the following syntax to produce the graph of counts.

    graph bar (count) id, over(y) over(x) title(Cluster 1) ytitle(Count) b1title(Number of conditions)
    [Attached image: Q.png]

    But when I use the following syntax it gives me cell percentages in the bars.

    graph bar (percent) id, over(y) over(x) title(Cluster 1) ytitle(Percent) b1title(Number of conditions)

    But what I want is the row percentages in the bars, not the cell percentages. In other words, I want the percentages within the total of each category of x. Just like plotting the row percentages information found in

    tab x y, row

    Could you please tell me how I can do this in Stata?
    Last edited by Blain Waan; 10 Oct 2017, 17:51.

  • #2
    Cross-posted at https://stackoverflow.com/questions/...cents-in-stata

    Our cross-posting policy is explicit: you are asked to tell us about it. Not doing so is widely regarded as a breach of etiquette. This is explained in the FAQ Advice, which you were asked to read before posting.

    It was suggested on Stack Overflow that you provide a data example. That holds good here too.



    • #3
      Originally posted by Nick Cox
      Cross-posted at https://stackoverflow.com/questions/...cents-in-stata

      Our cross-posting policy is explicit: you are asked to tell us about it. Not doing so is widely regarded as a breach of etiquette. This is explained in the FAQ Advice, which you were asked to read before posting.

      It was suggested on Stack Overflow that you provide a data example. That holds good here too.
      Thanks for mentioning it, Nick. I've just joined this forum and was unaware of the cross-posting policy. I should have noticed it before posting. The question was cross-posted at https://stackoverflow.com/questions/...cents-in-stata



      • #4
        The following reproducible example is also cross-posted from https://stackoverflow.com/questions/...cents-in-stata:

        Say, I use auto data and see the row percentages of rep78 within each category of foreign using the following code:

        sysuse auto
        tab foreign rep78, row

        Now I want to plot these row percentages in the bars using a graph similar to the one shown above. I can get the counts in the bars using the following codes, but how do I get the row percentages represented in the bars?

        gen id=_n
        graph bar (count) id, over(rep78) over(foreign)

        Note that I can get cell percentages in the bars using the syntax:

        graph bar (percent) id, over(rep78) over(foreign)

        But how do I get the row percentages represented by the bars for each category of foreign?



        • #5
          Thanks for the example. Again, FAQ Advice #12 explains how to present code. One way to do this is with catplot (SSC). You can add the option recast(bar) to get vertical bars, and tinker with the other options, but my view is that horizontal bars generally work better.

          Code:
          sysuse auto, clear
          tab foreign rep78, row
          gen id=_n
          graph bar (count) id, over(rep78) over(foreign)
          graph bar (percent) id, over(rep78) over(foreign)
          
          capture ssc inst catplot
          
          catplot rep78 foreign, percent(foreign) bar(1, bfcolor(none)) blabel(bar, pos(base) format(%3.2f))
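
          For anyone who prefers vertical bars, here is a minimal sketch of the recast(bar) variant mentioned above. It assumes catplot can be installed from SSC; the one-decimal label format is just an illustrative choice, not a recommendation.

          ```stata
          sysuse auto, clear

          * catplot is community-contributed: install from SSC only if not already present
          capture which catplot
          if _rc ssc install catplot

          * percent(foreign) gives percentages within each category of foreign;
          * recast(bar) switches from the default horizontal bars to vertical bars
          catplot rep78 foreign, percent(foreign) recast(bar) blabel(bar, format(%3.1f))
          ```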
          Last edited by Nick Cox; 11 Oct 2017, 01:02.



          • #6
            Here is another way to do it without downloading any community-contributed programs.

            Code:
            sysuse auto, clear 
            egen pc = pc(1) if rep78 < ., by(foreign)
            graph hbar (sum) pc, over(rep78) over(foreign) blabel(bar, pos(base) format(%3.2f)) bar(1, bfcolor(none)) ytitle(% by foreign)
            This isn't a different solution, really, as catplot is based on the same idea of calculating in advance what you want to show.

            By the way, I don't endorse 2 d.p. here; I was focusing on showing what tabulate shows.
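
            As a quick check on the approach above, the per-observation shares should add up to 100 within each category of foreign, which is why summing them in the graph gives row percentages. The sketch below assumes the same auto data; the variable name check and the tolerance are arbitrary choices made for illustration.

            ```stata
            sysuse auto, clear
            egen pc = pc(1) if rep78 < ., by(foreign)

            * the shares should sum to 100 within each category of foreign
            egen double check = total(pc), by(foreign)
            assert abs(check - 100) < 1e-3

            * the same summed shares drawn as vertical rather than horizontal bars
            graph bar (sum) pc, over(rep78) over(foreign) blabel(bar, format(%3.1f)) ytitle(% by foreign)
            ```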



            • #7
              Thank you, Nick. Could you please explain what pc(1) produces in the egen command? I have difficulty understanding it. I have also found that

              graph bar (percent) foreign, over(rep78) by(foreign)

              produces similar results.
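
              A minimal sketch for comparing that by() version against the table, assuming the auto data. My understanding is that by() draws one panel per category of foreign and that the percentages are then computed within each panel, which would explain the match with the row percentages; I have not checked this against the documentation.

              ```stata
              sysuse auto, clear

              * row percentages only, for comparison with the bar labels
              tab foreign rep78, row nofreq

              * one panel per category of foreign, with labelled bars
              graph bar (percent) foreign, over(rep78) by(foreign) blabel(bar, format(%3.1f))
              ```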



              • #8
                egen, pc() is documented, so the best way to understand it is to look at examples. It's essentially reverse engineering: each observation gets the number that, when added up within each group, gives 100%, which is what you are asking for.

                Consider


                Code:
                . sysuse auto, clear 
                (1978 Automobile Data)
                
                . egen pc = pc(1) if rep78 < ., by(foreign)
                (5 missing values generated)
                . 
                . tabdisp rep78 foreign, cell(pc) 
                
                ------------------------------
                Repair    |
                Record    |      Car type     
                1978      | Domestic   Foreign
                ----------+-------------------
                        1 | 2.083333          
                        2 | 2.083333          
                        3 | 2.083333  4.761905
                        4 | 2.083333  4.761905
                        5 | 2.083333  4.761905
                        . |                   
                ------------------------------
                
                . count if foreign == 0 & rep78 < .
                  48
                
                . di 100/48
                2.0833333
                
                . count if foreign == 1 & rep78 < .
                  21
                
                . di 100/21
                4.7619048
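
                The same numbers can be reproduced by hand, which may make clearer what pc(1) is doing. This is only a sketch on the auto data; the names n and pc_manual are made up for illustration.

                ```stata
                sysuse auto, clear
                egen pc = pc(1) if rep78 < ., by(foreign)

                * number of observations with nonmissing rep78 in each category of foreign
                bysort foreign: egen n = count(rep78)

                * each such observation contributes 100/n percentage points
                gen pc_manual = 100 / n if rep78 < .

                * the manual calculation should agree with egen's pc(1)
                assert abs(pc - pc_manual) < 1e-4 if rep78 < .
                ```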

                You are right that you have another solution there. It's not quite what you asked for, but if you don't mind that, it's fine.

                In essence I am more familiar with catplot (SSC), which I've been using since 2003, than with the graph bar (percent) syntax introduced by StataCorp about a decade later.



                • #9
                  Ah! Thanks for explaining so nicely. That was really helpful.
