
  • multiple variables on Histogram

    Hi,
    I have a variable, 'Edugroup', that has been categorized into 5 groups (1, 2, 3, 4, 5).

    . dataex Edugroup

    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float Edugroup
    2
    3
    2
    2
    3
    2
    2
    2
    3
    2
    2
    1
    4
    1
    3
    3
    3
    2
    3
    3
    3
    2
    1
    1
    3
    1
    2
    3
    3
    3
    3
    1
    2
    2
    4
    3
    1
    1
    4
    1
    1
    2
    2
    1
    3
    2
    2
    2
    3
    3
    3
    2
    5
    1
    4
    1
    2
    2
    4
    3
    4
    4
    4
    4
    4
    5
    5
    1
    1
    3
    5
    2
    1
    2
    5
    4
    3
    4
    1
    1
    4
    2
    3
    3
    3
    1
    3
    2
    1
    1
    1
    5
    1
    2
    3
    3
    4
    3
    1
    3
    end

    I need to plot a histogram like the one in this attachment. I know how to plot one in Excel.

    Would appreciate any help.
    Paris

  • #2
    I'd call that a stacked bar chart. But if you start from

    Code:
    graph bar (percent) , over(Edugroup) blabel(bar, format(%2.1f)) ysc(r(0 32)) yla(, ang(h)) t1title(Edugroup)
    you can get stacking by looking at further options of graph bar. It's a stiff challenge to improve on the unstacked version.



    • #3
      I use this code:
      g edu_1=Edugroup== 1
      g edu_2=Edugroup== 2
      g edu_3=Edugroup== 3
      g edu_4=Edugroup== 4
      g edu_5=Edugroup== 5
      graph bar edu_1 edu_2 edu_3 edu_4 edu_5, stack

      It gives what I want, except that I need percentages, not means (the vertical axis should show the percentage). Do you have any ideas on how to switch to the percentage in each education group?
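
      One way to get percentages out of this indicator approach, as a minimal sketch: the mean of an indicator scaled to 0/100 is the percentage in that category, so the stacked means sum to 100. The toy data and the edu_pct_ variable names are assumptions for illustration, not part of the original data.

      ```stata
      * Toy data for illustration only: Edugroup cycles through 1 to 5
      clear
      set obs 20
      generate Edugroup = 1 + mod(_n, 5)

      * Indicators scaled to 0/100, so their means are percentages
      forvalues k = 1/5 {
          generate edu_pct_`k' = 100 * (Edugroup == `k') if !missing(Edugroup)
      }

      * The stacked means now sum to 100 on the vertical axis
      graph bar (mean) edu_pct_1 edu_pct_2 edu_pct_3 edu_pct_4 edu_pct_5, stack ///
          ytitle("Percent") legend(order(1 "1" 2 "2" 3 "3" 4 "4" 5 "5") row(1))
      ```

      With real data, drop the toy-data lines and run the loop on the Edugroup variable already in memory.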


      • #4
        Hi Paris,

        To have a stacked bar graph with percentages, I'd do the following:

        Code:
         graph hbar, over(Edugroup) stack asyvars percentage blabel(bar, pos(center) format(%3.0f) color(black))
        You can of course play with the percentage and blabel options, but the "stack" and "asyvars" options are crucial for producing the type of graph (I think) you are looking to produce.



        • #5
          Hi John,

          Fantastic!

          To run for several years I used this:

          graph bar Edugroup, over(year) stack asyvars percentage blabel(bar, pos(center) format(%3.0f) color(black))

          though it does not work.



          • #6
            So, the real problem is for two variables. Without a full data example, let's fake one. Three possibilities here entail tabplot (Stata Journal), catplot (SSC), floatplot (SSC). For further examples, search the forum for mentions of each command.

            The option choices show things to vary, and there are others. My personal view is that the second, the conventional stacked choice (G2), is by far the least effective. Depending on what is and is not shown by the full real data, I would go for one of the others.

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input float Edugroup byte _freq year 
            1 24 2017
            2 26 .
            3 30 . 
            4 14 . 
            5  6 . 
            1 22 2018
            2 24 .
            3 29 .
            4 17 .
            5  8 .
            1 20 2019
            2 21 .
            3 31 . 
            4 18 .
            5 10 .
            1 18 2020
            2 19 .
            3 35 . 
            4 15 .
            5 13 . 
            end
            replace year = year[_n-1] if missing(year)
            expand _freq 
            
            tabplot Edugroup year, percent(year) showval yreverse separate(Edugroup) ///
            bar1(fcolor(red*0.6) lcolor(red)) bar2(fcolor(red*0.2) lcolor(red))      ///
            bar3(fcolor(blue*0.2) lcolor(blue)) bar4(fcolor(blue*0.6) lcolor(blue)) bar5(fcolor(blue)) ///
            name(G1, replace) xtitle("") subtitle(% by year)
             
            catplot Edugroup year, percent(year) asyvars stack recast(bar) ///
            legend(order(5 4 3 2 1) col(1) pos(3)) bar(1, fcolor(red*0.6) lcolor(red)) ///
            bar(2, fcolor(red*0.2) lcolor(red)) bar(3, fcolor(blue*0.2) lcolor(blue))  ///
            bar(4, fcolor(blue*0.6) lcolor(blue)) bar(5, fcolor(blue)) name(G2, replace) subtitle(% by year, place(w))
             
            floatplot Edugroup, over(year) center(3) ///
            fcolors(red*0.6 red*0.2 blue*0.2 blue*0.6 blue) lcolors(red red blue blue blue) vertical ///
            name(G3, replace) subtitle(% by year, place(w)) xtitle("")

            [Figure G1: edugroup_G1.png]

            [Figure G2: edugroup_G2.png]

            [Figure G3: edugroup_G3.png]


            • #7

              catplot! that's it.

              Thank you so much Nick! & Merry Christmas



              • #8
                I wish I hadn't written the darn thing now. It is the worst graph here: why prefer it? Stacking makes it harder to see and show any shifts in any and all of the categories. Emphasising that percents add to 100% is just repeating what is known.



                • #9
                  For now, I only have access to one year of data (2010). I have tried to show the percentage composition of workers by schooling degree, and I believe the figure does so.
                  Last edited by Paris Rira; 24 Dec 2022, 06:04.



                  • #10
                    Sorry for the multiple posts. Google Chrome did it by itself.
                    Last edited by Paris Rira; 24 Dec 2022, 06:04.



                    • #11
                      G2


                      • #12
                        White doesn't fall between light blue and blue.



                        • #13
                          You are right. I am thinking of the final paper, which should be in black and white. Perhaps some grayish and whitish shades...



                          • #14
                            The final paper needing to be in black and white -- or gray/grey colour or color -- is compelling, but the less readers need to rely on a legend or key, the better.
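
                            As a rough sketch of a black-and-white version, the stacked G2 command from #6 can be given a graded gray sequence, so that the order of the shades follows the order of the categories. The toy data, the particular gs shades and the graph name G2_gray are assumptions for illustration; catplot must be installed from SSC.

                            ```stata
                            * Toy data for illustration only; requires catplot (ssc install catplot)
                            clear
                            set seed 12345
                            set obs 400
                            generate year = 2017 + mod(_n, 4)
                            generate Edugroup = ceil(5 * runiform())

                            * Same structure as G2 in #6, with a dark-to-light gray sequence
                            catplot Edugroup year, percent(year) asyvars stack recast(bar) ///
                                legend(order(5 4 3 2 1) col(1) pos(3)) ///
                                bar(1, fcolor(gs2) lcolor(black)) bar(2, fcolor(gs6) lcolor(black)) ///
                                bar(3, fcolor(gs9) lcolor(black)) bar(4, fcolor(gs12) lcolor(black)) ///
                                bar(5, fcolor(gs15) lcolor(black)) name(G2_gray, replace) ///
                                subtitle(% by year, place(w))
                            ```

                            This design still depends on a legend; the tabplot version (G1) in #6 labels rows and shows values directly, so it may suit a monochrome paper better.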



                            • #15
                              Originally posted by Paris Rira
                              Hi John,

                              Fantastic!

                              To run for several years I used this:

                              graph bar Edugroup, over(year) stack asyvars percentage blabel(bar, pos(center) format(%3.0f) color(black))

                              though it does not work.

                              Paris, in case it's useful for future reference, all that was wrong with your code was that you needed to do "over" for both Edugroup AND for year. For example:

                              Code:
                               graph bar, /// basic bar graph command
                              over(Edugroup) /// the groups *within* each bar (values seen within bars)
                              over(year) /// the groups *across* bars (values seen on x-axis)
                              stack asyvars /// options needed for stacking bars
                              scheme(s1mono) /// set a monochrome scheme
                              percentage blabel(bar, pos(center) format(%3.0f) color(blue)) /// add percentages
                              ytitle("Percentage of Sample") /// title y-axis (vertical here; with hbar it would be on the bottom)
                              legend(pos(6) size(vsmall) row(1) span) // specifies placement and size of legend text, and # of rows
                              This graph would also be in grayscale given the "s1mono" scheme option. Obviously you can change a lot of those options to your liking, but the main point was just that you can still use "graph bar" to make a stacked bar graph across categories of a second categorical variable.
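
                              A quick cross-check, offered as a sketch: if the stacked bars show percentages within each year, the bar labels should agree with the column percentages from a two-way table. The toy data below are an assumption for illustration only.

                              ```stata
                              * Toy data for illustration only
                              clear
                              set seed 12345
                              set obs 400
                              generate year = 2017 + mod(_n, 4)
                              generate Edugroup = ceil(5 * runiform())

                              * Column percentages: share of each Edugroup within each year
                              tabulate Edugroup year, column nofreq
                              ```

                              Each column of the table sums to 100, which is what each stacked bar should also show.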
