  • Plotting a stack bar

    Hi statalisters,
    In the dataset below, the variables PUWLE, PHWLE, POWLE, and POBLE are percentages representing proportions of TLE_01. I would like to create two bar plots, one for females and one for males. Within each sex group, I would like a stacked bar plot of these proportions by sample_.
    Thanks,
    Nader

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str18 sample_ str6 sex float(PUWLE PHWLE POWLE POBLE TLE_01)
    "Non-Hispanic White" "Female" 2.2562733 32.15409 32.447655 33.141983  31.18791
    "Black"              "Female"  1.338273 16.99335  30.90398   50.7644   28.9989
    "Hispanic"           "Female" 1.5642548 24.18444 35.418617  38.83269 32.706757
    "Non-Hispanic White" "Male"   2.2562733 32.15409 32.447655 33.141983  31.18791
    "Black"              "Male"    1.338273 16.99335  30.90398   50.7644   28.9989
    "Hispanic"           "Male"   1.5642548 24.18444 35.418617  38.83269 32.706757
    end
    Last edited by Nader Mehri; 27 May 2023, 18:18.

  • #2
    What you are asking for is a standard example of graph bar, or so I guess. I also show graph hbar here, and indeed tabplot from the Stata Journal.

    The main idea of tabplot can be seen at https://www.statalist.org/forums/for...updated-on-ssc

    but if you want to use it, it is best to download the latest (public) ado and help from the latest gr0066 shown by

    Code:
    . search gr0066, entry
    
    Search of official help files, FAQs, Examples, and Stata Journals
    
    SJ-22-2 gr0066_3  . . . . . . . . . . . . . . . .  Software update for tabplot
            (help tabplot if installed) . . . . . . . . . . . . . . . .  N. J. Cox
            Q2/22   SJ 22(2):467
            bug fixed; help file updated to include further references
    
    SJ-20-3 gr0066_2  . . . . . . . . . . . . . . . .  Software update for tabplot
            (help tabplot if installed) . . . . . . . . . . . . . . . .  N. J. Cox
            Q3/20   SJ 20(3):757--758
            added new options frame() and frameopts() allowing framing
            of bars and so-called thermometer plots or charts
    
    SJ-17-3 gr0066_1  . . . . . . . . . . . . . . . .  Software update for tabplot
            (help tabplot if installed) . . . . . . . . . . . . . . . .  N. J. Cox
            Q3/17   SJ 17(3):779
            added options for reversing axis scales; improved handling of
            axis labels containing quotation marks
    
    SJ-16-2 gr0066  . . . . . .  Speaking Stata: Multiple bar charts in table form
            (help tabplot if installed) . . . . . . . . . . . . . . . .  N. J. Cox
            Q2/16   SJ 16(2):491--510
            provides multiple bar charts in table form representing
            contingency tables for one, two, or three categorical variables
    so (at the time of writing) download from gr0066_3.
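
As a hedged sketch of that download step: Stata Journal software can usually be installed with net sj followed by net install. The issue 22-2 and the package name gr0066_3 are taken from the search output above. This assumes that gr0066_3 is still the latest update when you run it and that Stata has internet access.

```stata
* Point Stata at the Stata Journal 22(2) software directory,
* then install (or update to) the gr0066_3 version of tabplot.
net sj 22-2
net install gr0066_3, replace

* Confirm which tabplot.ado is now found on the adopath.
which tabplot
```

Clicking through from the search output does the same job; the commands are just easier to keep in a do-file.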

    The leading ideas of tabplot as compared with stacking bars that add to 100% are

    * You can tell the reader once that the data are percents and add to 100 within certain groups, and that should be easy to grasp. If not, you need a smarter reader. You don't need to reinforce that notion graphically.

    * You can lose the legend (kill the key) and cut down on mental back and forth.

    * You can optionally show the percents themselves. You can do that with a stacked bar chart, but the percents are not so easy to read.

    * There is no extra strain if some components are very small or even zero. That is easy to spot and think about.

    Here is some code.

    Code:
    set scheme stcolor 
    
    clear
    input str18 sample_ str6 sex float(PUWLE PHWLE POWLE POBLE TLE_01)
    "Non-Hispanic White" "Female" 2.2562733 32.15409 32.447655 33.141983  31.18791
    "Black"              "Female"  1.338273 16.99335  30.90398   50.7644   28.9989
    "Hispanic"           "Female" 1.5642548 24.18444 35.418617  38.83269 32.706757
    "Non-Hispanic White" "Male"   2.2562733 32.15409 32.447655 33.141983  31.18791
    "Black"              "Male"    1.338273 16.99335  30.90398   50.7644   28.9989
    "Hispanic"           "Male"   1.5642548 24.18444 35.418617  38.83269 32.706757
    end
    
    graph bar (asis) *LE , stack over(sex) over(sample_) name(G1, replace)
    
    graph bar (asis) *LE , stack over(sample_) over(sex) name(G2, replace)
    
    graph hbar (asis) *LE , stack over(sample_) over(sex) name(G3, replace)
    
    
    graph hbar (asis) *LE , stack over(sex) over(sample_) name(G4, replace)
    
    
    capture frame drop work 
    
    frame put *, into(work)
    
    frame work { 
        reshape long @LE , i(sample_ sex) j(which) string 
        list 
        tabplot which sample_ [iw=LE], by(sex, note("")) showval(format(%3.1f)) xtitle("") ytitle("") name(G5, replace)
        tabplot which sex [iw=LE], by(sample_, note("") row(1)) showval(format(%3.1f)) xtitle("") ytitle("") name(G6, replace)
        local opts separate(which) 
        tabplot which sample_ [iw=LE], `opts' by(sex, note("")) showval(format(%3.1f)) xtitle("") ytitle("") name(G7, replace)
        tabplot which sex [iw=LE], `opts' by(sample_, row(1) note("")) showval(format(%3.1f)) xtitle("") ytitle("") name(G8, replace)
    }
    The second graph is clearly worthless given the overlap of text labels. That could be fixed with some twiddling, but I have not bothered because I think better graphs are on offer. I wanted to make a standard point that graph hbar is often preferable to graph bar.
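
For anyone who does want to rescue the vertical version, here is one possible form of that twiddling, offered as a sketch rather than a recommendation. The label() suboption of over() can angle and shrink the category labels. The angle, the label size, and the graph name G2b are illustrative choices that do not appear in the original code; the data are the same example as above.

```stata
clear
input str18 sample_ str6 sex float(PUWLE PHWLE POWLE POBLE TLE_01)
"Non-Hispanic White" "Female" 2.2562733 32.15409 32.447655 33.141983  31.18791
"Black"              "Female"  1.338273 16.99335  30.90398   50.7644   28.9989
"Hispanic"           "Female" 1.5642548 24.18444 35.418617  38.83269 32.706757
"Non-Hispanic White" "Male"   2.2562733 32.15409 32.447655 33.141983  31.18791
"Black"              "Male"    1.338273 16.99335  30.90398   50.7644   28.9989
"Hispanic"           "Male"   1.5642548 24.18444 35.418617  38.83269 32.706757
end

* Same bars as G2, but with the sample_ labels angled and smaller
* so that neighbouring labels are less likely to collide.
graph bar (asis) *LE, stack over(sample_, label(angle(45) labsize(small))) ///
    over(sex) name(G2b, replace)
```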

    [Image: mehri_GG1.png]
    [Image: mehri_GG2.png]
    [Image: mehri_GG3.png]
    [Image: mehri_GG4.png]




    • #3
      The other graphs follow because the forum software limits the number of attachments per post.


      [Image: mehri_GG7.png]
      [Image: mehri_GG5.png]


      • #4
        The graphs above in #2 are out of order. Oh well.

        All that said, in your data example the values for males and females appear identical for the same variables!

        Manifestly it's your project, not mine, but:

        * Horizontal stacked bars can work better than vertical.

        * I think the tabplot results work better than either stacked flavour. I guess that people use stacked bar charts because they have seen so many, as many people use pie charts despite their having been shot down with rational arguments as poor designs over a century or more.

        * What you choose depends on which comparisons are more important. Minute comparisons are easiest between bars side by side. Side by side can be better or worse depending on which comparisons are more interesting or important.

        * Mixing colours is essential for stacked designs and optional otherwise.

        This thread is more or less a repeat of your earlier thread

        https://www.statalist.org/forums/for...-xlabel-s-size

        in which the main point was that stacked bars are awkward at best, lousy at worst, and better not used when there are superior choices.
        Last edited by Nick Cox; 28 May 2023, 04:02.



        • #5
          Thank you so much for your helpful response. Based on your suggestions and my earlier thread, I have created the plot below using the code below. I wonder how the plot can be modified by 1) changing the angle and the font of the x-axis labels so that they do not overlap and 2) removing the background color of the bars so that the plot can be printed on a black-and-white printer.
          Nader

          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input str1 gender byte(category_num which age) str1 sample_ float(percent TLE) str2 profile float(PUWLE PHWLE POWLE POBLE) str3 category
          "F" 0 1 50 "0"  .7100313 31.218945 "F0" 2.3 32.3 32.4   33 "F01"
          "F" 0 2 50 "0"  10.09634 31.218945 "F0" 2.3 32.3 32.4   33 "F02"
          "F" 0 3 50 "0"  10.11646 31.218945 "F0" 2.3 32.3 32.4   33 "F03"
          "F" 0 4 50 "0" 10.296116 31.218945 "F0" 2.3 32.3 32.4   33 "F04"
          "F" 1 1 50 "1"   .394898 29.063833 "F1" 1.4 17.2 31.1 50.4 "F11"
          "F" 1 2 50 "1"   4.99093 29.063833 "F1" 1.4 17.2 31.1 50.4 "F12"
          "F" 1 3 50 "1"  9.044346 29.063833 "F1" 1.4 17.2 31.1 50.4 "F13"
          "F" 1 4 50 "1" 14.633658 29.063833 "F1" 1.4 17.2 31.1 50.4 "F14"
          "F" 2 1 50 "2"  .4906612 32.613693 "F2" 1.5 23.5 35.3 39.7 "F21"
          "F" 2 2 50 "2"  7.664418 32.613693 "F2" 1.5 23.5 35.3 39.7 "F22"
          "F" 2 3 50 "2" 11.499845 32.613693 "F2" 1.5 23.5 35.3 39.7 "F23"
          "F" 2 4 50 "2"  12.95877 32.613693 "F2" 1.5 23.5 35.3 39.7 "F24"
          "F" 3 1 50 "3"  .7716495  31.10333 "F3" 2.5 34.2 31.7 31.6 "F31"
          "F" 3 2 50 "3" 10.639664  31.10333 "F3" 2.5 34.2 31.7 31.6 "F32"
          "F" 3 3 50 "3"  9.847922  31.10333 "F3" 2.5 34.2 31.7 31.6 "F33"
          "F" 3 4 50 "3"  9.844094  31.10333 "F3" 2.5 34.2 31.7 31.6 "F34"
          end
          label values category_num category_num
          label def category_num 0 "non-Hispanic White", modify
          label def category_num 1 "non-Hispanic Black", modify
          label def category_num 2 "Hispanic", modify
          label def category_num 3 "non-Hispanic other", modify
          label values which which
          label def which 1 "Underweight", modify
          label def which 2 "Healthy Weight", modify
          label def which 3 "Overweight", modify
          label def which 4 "Obes", modify
          gen toshow = string(percent, "%2.1f") + "%" if category=="F01" | category=="F02" | category=="F03" | category=="F04" | category=="F11" | category=="F12" | category=="F13" | category=="F14" | category=="F21" | category=="F22" | category=="F23" | category=="F24" | category=="F31" ///
              | category=="F32" | category=="F33" | category=="F34"

          replace toshow= string(percent, "%2.1f") if toshow==""

          tabplot category_num which [iw=percent], height(0.63) by(gender, note("")) separate(which) bar1(blcolor(blue) bfcolor(blue*0.5)) bar2(blcolor(blue) bfcolor(blue*0.1)) bar3(bcolor(red)) showval (toshow, offset(0.2)) horizontal ytitle("") xtitle("") xscale(r(0.7 1 3.4)) subtitle(, fcolor(none)) name(G1, replace)

          jj.pdf



          • #6
            Please post image attachments as .png; see section 12.4 of https://www.statalist.org/forums/help#stata



            • #7
              Sorry about that! Please see the plot in the .png format as follows:
              [Image: image_31244.png]

              Last edited by Nader Mehri; 28 May 2023, 09:11.



              • #8
                I would just fix the value labels as the main problem is the longer label for 2 and you have enough space to correct the spelling for 4.

                Code:
                 
                 label def which 1 "Underweight", modify
                 label def which 2 "Healthy", modify
                 label def which 3 "Overweight", modify
                 label def which 4 "Obese", modify
                If this is destined for a black and white printer, you should work throughout with a scheme such as s1mono.

                The showval() option has an offset() suboption to move the text. Here the text needs to move up a little.
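
A minimal sketch of how those suggestions might be combined, assuming the data from #5 are in memory with toshow already created and that tabplot is installed. The offset value 0.3 and the graph name G1mono are illustrative guesses, not values from this thread, so expect to adjust the offset up or down until the text sits where you want it.

```stata
* Assumes the data from #5 are in memory, toshow exists,
* and tabplot is installed.

* Monochrome scheme for black-and-white printing.
set scheme s1mono

* Shorter label for category 2, corrected spelling for category 4.
label def which 2 "Healthy", modify
label def which 4 "Obese", modify

* The explicit bar colours from #5 are dropped so that the scheme decides.
* offset(0.3) is a trial value to be tuned by eye.
tabplot category_num which [iw=percent], height(0.63) by(gender, note("")) ///
    separate(which) showval(toshow, offset(0.3)) horizontal ///
    ytitle("") xtitle("") subtitle(, fcolor(none)) name(G1mono, replace)
```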



                • #9
                  Thanks! I wonder how the bars for each race category could be changed to the same color; i.e., how the color for bars for non-Hispanic Whites can be changed all to blue, while the colors for bars for non-Hispanic Black can be changed to green, etc.



                  • #10
                    What happened to the black-and-white printer?



                    • #11
                      Well, I need this for a PowerPoint presentation!



                      • #12
                        It is just a different separate() call: separate(category_num) with bar1() ... bar4() to override default colours if you wish.
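
As a sketch of what that call could look like, assuming the data from #5 are in memory (four values of category_num, with toshow already created) and tabplot is installed. Blue and green follow the request in #9; red, orange, and the graph name G1race are illustrative choices only.

```stata
* Assumes the data from #5 are in memory and tabplot is installed.
* separate(category_num) gives each race/ethnicity category its own bar style;
* bar1() ... bar4() then override the default colour for each category in turn.
tabplot category_num which [iw=percent], height(0.63) by(gender, note("")) ///
    separate(category_num) ///
    bar1(bcolor(blue)) bar2(bcolor(green)) bar3(bcolor(red)) bar4(bcolor(orange)) ///
    showval(toshow, offset(0.2)) horizontal ytitle("") xtitle("") ///
    subtitle(, fcolor(none)) name(G1race, replace)
```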



                        • #13
                          Thanks for your solution. I have tried the following code and got the plot below. Assigning a color to each bar is a little hectic, particularly when dealing with several plots with multiple bars. I wonder if there is any way to assign one color to bars 1 to 6, another color to bars 7 to 12, and so on.

                          Code:
                          * Example generated by -dataex-. For more info, type help dataex
                          clear
                          input str6 gender byte(category_num which age) float(Underweight Healthy Overweight Obese TLE) str3 profile byte(birth_place race) float percent str4 category str5 toshow
                          "Female"  1 1 50 .7  9.9   10 10.5   31 "F01" 1 0  2.2 "F11"  "2.2%" 
                          "Female"  1 2 50 .7  9.9   10 10.5   31 "F01" 1 0 31.7 "F12"  "31.7%"
                          "Female"  1 3 50 .7  9.9   10 10.5   31 "F01" 1 0 32.3 "F13"  "32.3%"
                          "Female"  1 4 50 .7  9.9   10 10.5   31 "F01" 1 0 33.8 "F14"  "33.8%"
                          "Female"  2 1 50 .9 12.3 10.9  9.2 33.3 "F02" 2 0  2.7 "F21"  "2.7%" 
                          "Female"  2 2 50 .9 12.3 10.9  9.2 33.3 "F02" 2 0   37 "F22"  "37.0%"
                          "Female"  2 3 50 .9 12.3 10.9  9.2 33.3 "F02" 2 0 32.7 "F23"  "32.7%"
                          "Female"  2 4 50 .9 12.3 10.9  9.2 33.3 "F02" 2 0 27.7 "F24"  "27.7%"
                          "Female" 11 1 50 .4  4.7  8.8 14.6 28.5 "F11" 1 1  1.4 "F111" "1.4%" 
                          "Female" 11 2 50 .4  4.7  8.8 14.6 28.5 "F11" 1 1 16.3 "F112" "16.3%"
                          "Female" 11 3 50 .4  4.7  8.8 14.6 28.5 "F11" 1 1   31 "F113" "31.0%"
                          "Female" 11 4 50 .4  4.7  8.8 14.6 28.5 "F11" 1 1 51.3 "F114" "51.3%"
                          "Female" 12 1 50 .6  6.3 10.2 13.7 30.8 "F12" 2 1  1.8 "F121" "1.8%" 
                          "Female" 12 2 50 .6  6.3 10.2 13.7 30.8 "F12" 2 1 20.5 "F122" "20.5%"
                          "Female" 12 3 50 .6  6.3 10.2 13.7 30.8 "F12" 2 1 33.1 "F123" "33.1%"
                          "Female" 12 4 50 .6  6.3 10.2 13.7 30.8 "F12" 2 1 44.6 "F124" "44.6%"
                          "Female" 21 1 50 .5  7.6 11.3 12.9 32.4 "F21" 1 2  1.5 "F211" "1.5%" 
                          "Female" 21 2 50 .5  7.6 11.3 12.9 32.4 "F21" 1 2 23.6 "F212" "23.6%"
                          "Female" 21 3 50 .5  7.6 11.3 12.9 32.4 "F21" 1 2   35 "F213" "35.0%"
                          "Female" 21 4 50 .5  7.6 11.3 12.9 32.4 "F21" 1 2 39.9 "F214" "39.9%"
                          "Female" 22 1 50 .7  9.7 12.5 11.7 34.5 "F22" 2 2  1.9 "F221" "1.9%" 
                          "Female" 22 2 50 .7  9.7 12.5 11.7 34.5 "F22" 2 2 28.1 "F222" "28.1%"
                          "Female" 22 3 50 .7  9.7 12.5 11.7 34.5 "F22" 2 2 36.1 "F223" "36.1%"
                          "Female" 22 4 50 .7  9.7 12.5 11.7 34.5 "F22" 2 2 33.9 "F224" "33.9%"
                          "Female" 31 1 50 .8 10.4  9.7  9.9 30.7 "F31" 1 3  2.5 "F311" "2.5%" 
                          "Female" 31 2 50 .8 10.4  9.7  9.9 30.7 "F31" 1 3 33.8 "F312" "33.8%"
                          "Female" 31 3 50 .8 10.4  9.7  9.9 30.7 "F31" 1 3 31.6 "F313" "31.6%"
                          "Female" 31 4 50 .8 10.4  9.7  9.9 30.7 "F31" 1 3 32.1 "F314" "32.1%"
                          "Female" 32 1 50  1   13 10.6  8.6 33.2 "F32" 2 3    3 "F321" "3.0%" 
                          "Female" 32 2 50  1   13 10.6  8.6 33.2 "F32" 2 3 39.3 "F322" "39.3%"
                          "Female" 32 3 50  1   13 10.6  8.6 33.2 "F32" 2 3 31.8 "F323" "31.8%"
                          "Female" 32 4 50  1   13 10.6  8.6 33.2 "F32" 2 3 25.9 "F324" "25.9%"
                          end
                          label values category_num sample_
                          label def sample_ 1 "US-born non-Hispanic White", modify
                          label def sample_ 2 "Foreign-born non-Hispanic White", modify
                          label def sample_ 11 "US-born non-Hispanic Black", modify
                          label def sample_ 12 "Foreign-born non-Hispanic Black", modify
                          label def sample_ 21 "US-born Hispanic", modify
                          label def sample_ 22 "Foreign-born Hispanic", modify
                          label def sample_ 31 "US-born non-Hispanic Other", modify
                          label def sample_ 32 "Foreign-born non-Hispanic Other", modify
                          label values which which
                          label def which 1 "Underweight", modify
                          label def which 2 "Healthy", modify
                          label def which 3 "Overweight", modify
                          label def which 4 "Obese", modify
                          label values birth_place birth_place
                          label def birth_place 1 "US-born", modify
                          label def birth_place 2 "Foreign-born", modify
                          label values race race
                          label def race 0 "non-Hispanic White", modify
                          label def race 1 "non-Hispanic Black", modify
                          label def race 2 "Hispanic", modify
                          label def race 3 "non-Hispanic Other", modify
                          tabplot category_num which [iw=percent], height(0.63) separate(category_num) by(gender , note("")) barall(bcolor(blue*0.5)) showval (toshow, offset(0.15)) horizontal ytitle("") xtitle("") xscale(r(0.7 1 3.4)) subtitle(, fcolor(none)) name(G1, replace) xlabel(, labsize(small))

                          [Image: G1.png]



                          • #14
                            It is the same question. Create a variable with different values for 1-6 and 7-12 and so on and then feed it to separate().
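
A hedged sketch of that idea, assuming the data from #13 are in memory and tabplot is installed. The variable name colgroup, the colours, and the graph name G2 are hypothetical choices for illustration. In these data category_num takes the values 1, 2, 11, 12, 21, 22, 31, 32, so dividing by 10 and rounding down puts each US-born/foreign-born pair in the same group.

```stata
* Assumes the data from #13 are in memory and tabplot is installed.
* colgroup (hypothetical name) is 0 for categories 1-2, 1 for 11-12,
* 2 for 21-22 and 3 for 31-32, so each pair of categories shares a colour.
gen byte colgroup = floor(category_num / 10)

* One colour per group rather than one per bar.
tabplot category_num which [iw=percent], height(0.63) separate(colgroup) ///
    by(gender, note("")) ///
    bar1(bcolor(blue*0.5)) bar2(bcolor(green*0.5)) ///
    bar3(bcolor(red*0.5)) bar4(bcolor(orange*0.5)) ///
    showval(toshow, offset(0.15)) horizontal ytitle("") xtitle("") ///
    xscale(r(0.7 1 3.4)) subtitle(, fcolor(none)) name(G2, replace) ///
    xlabel(, labsize(small))
```

In these data the existing variable race encodes the same grouping, so separate(race) should be equivalent. With a plain bar counter running 1, 2, 3, ... the grouping variable could instead be something like ceil(counter/6).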
