  • Creating bar graph

    Hello,

    I would like to make a bar graph that shows the percentage of 3 types of people in an occupation. Then I would like to do the same thing for 20 occupations. My data set looks like

    Category Farmer Teacher Doctor
    1 32 65 73
    2 4 64 5
    3 6 76 123

    Can someone help me with this?

  • #2
    I don't have Stata to hand at the moment, but you could try something like: graph bar Farmer Teacher Doctor, over(Category) percent
    Last edited by Marcos Almeida; 10 Oct 2019, 05:16. Reason: Edit: percent instead of percentage, as pointed by Nick
    Best regards,

    Marcos


    • #3
      I imagine that you are thinking of something like

      Code:
      clear
      input Category Farmer Teacher Doctor
      1 32 65 73
      2 4 64 5
      3 6 76 123
      end
      
      graph bar (asis) Farmer Teacher Doctor, over(Category) percent stack yla(, ang(h)) legend(order(3 2 1) col(1) pos(9)) name(G1)
      [Image: jenny1.png]

      -- but that is not going to scale well to 20 occupations. The legend alone will be massive, and even with 20 distinct colours, the result will just be a fruit salad mess.


      I recommend instead a different data layout and a different command.

      Code:
      * the next two commands were omitted at first posting 
      rename (Farmer-Doctor) count= 
      reshape long count, i(Category) j(Occupation) string 
      
      
      
      * ssc install tabplot
      tabplot Occupation Category [fw=count], percent(Category) showval blcolor(blue) bfcolor(ltblue) subtitle(%)

      where you should remove the comment marker * if you have not previously installed tabplot (SSC holds the most recent version as I write).

      [Image: jenny2.png]

      Extended to 20 occupations that design will look crowded but with horizontal axis labels you don't need a legend, and you avoid a multicolour mishmash. To get more readable results still, play with the aspect ratio and/or the axis sizes.
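
      One way to act on the sizing advice is sketched below. It assumes that tabplot passes standard twoway options such as xsize() and ysize() through unchanged (aspect() would be the other option to try); the particular sizes are illustrative guesses, not values from this thread. With occupations on the vertical axis, a taller graph leaves more room for 20 rows.

      ```stata
      * Example data from #1 in long layout, as in the code above
      clear
      input Category Farmer Teacher Doctor
      1 32 65 73
      2 4 64 5
      3 6 76 123
      end
      rename (Farmer-Doctor) count=
      reshape long count, i(Category) j(Occupation) string

      * Assumption: xsize() and ysize() are forwarded by tabplot to twoway;
      * the sizes 5 and 8 are arbitrary starting points to adjust by eye
      tabplot Occupation Category [fw=count], percent(Category) showval ///
          blcolor(blue) bfcolor(ltblue) subtitle(%) ///
          xsize(5) ysize(8)
      ```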

      Note. I use scheme s1color as default.
      Last edited by Nick Cox; 10 Oct 2019, 05:41.


      • #4
        Thanks, Marcos. I tried this and it seems close but not exactly what I wanted. I want to know what percent of all farmers are in category 1, category 2, etc. This code tells me what percent of all people in a given category are farmers, teachers, and doctors.


        • #5
          In #3 you evidently need percent(Occupation)

          (Sorry, but with 3 categories and 3 occupations in the example it is easy to get confused.)
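
          As a numerical check on what percent(Occupation) plots, the within-occupation percentages can be computed directly. This is a minimal sketch using the example data from #1; total_occ and pct_occ are illustrative variable names, not part of the thread. For Farmer, for instance, 32 of the 42 people are in category 1, about 76%.

          ```stata
          * Example data from #1, reshaped to long layout as in #3
          clear
          input Category Farmer Teacher Doctor
          1 32 65 73
          2 4 64 5
          3 6 76 123
          end
          rename (Farmer-Doctor) count=
          reshape long count, i(Category) j(Occupation) string

          * Row percentages: within each occupation, the categories sum to 100
          tabulate Occupation Category [fw=count], row nofreq

          * The same percentages stored as a variable
          * (total_occ and pct_occ are hypothetical names)
          egen total_occ = total(count), by(Occupation)
          generate pct_occ = 100 * count / total_occ
          sort Occupation Category
          list Occupation Category count pct_occ, sepby(Occupation)
          ```

          Swapping the roles of Occupation and Category in by() gives the within-category percentages instead, which is what percent(Category) shows.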


          • #6
            Thanks so much, Nick. I appreciate your help. But I do not have an occupation variable. I only have the number of people in each occupation and how they are distributed across the 3 categories. I want to see what percent of people in a given occupation belong to category 1, category 2, etc. How can I do this without an Occupation variable? When I try your code, replacing Occupation with Farmer Doctor Teacher, Stata tells me "count not found."


            • #7
              Sorry, my mistake: I missed out some code in #3 when posting. I will edit to include the omitted commands.


              • #8
                It works! Many thanks!!


                • #9
                  Nick told you that

                  I recommend instead a different data layout and a different command.
                  But I suppose he forgot to show you what the different layout should be and how to get there.

                  Code:
                  . clear
                  
                  . input Category Farmer Teacher Doctor
                  
                        Category     Farmer    Teacher     Doctor
                    1. 1 32 65 73
                    2. 2 4 64 5
                    3. 3 6 76 123
                    4. end
                  
                  .
                  . ds Category, not
                  Farmer   Teacher  Doctor
                  
                  . local varl = r(varlist)
                  
                  . foreach var of local varl {
                    2.     rename `var' count`var'
                    3. }
                  
                  . reshape long count, i(Category) j(occupation) string
                  (note: j = Doctor Farmer Teacher)
                  
                  Data                               wide   ->   long
                  -----------------------------------------------------------------------------
                  Number of obs.                        3   ->       9
                  Number of variables                   4   ->       3
                  j variable (3 values)                     ->   occupation
                  xij variables:
                     countDoctor countFarmer countTeacher   ->   count
                  -----------------------------------------------------------------------------
                  
                  .
                  . list, sepby(Category)
                  
                       +-----------------------------+
                       | Category   occupa~n   count |
                       |-----------------------------|
                    1. |        1     Doctor      73 |
                    2. |        1     Farmer      32 |
                    3. |        1    Teacher      65 |
                       |-----------------------------|
                    4. |        2     Doctor       5 |
                    5. |        2     Farmer       4 |
                    6. |        2    Teacher      64 |
                       |-----------------------------|
                    7. |        3     Doctor     123 |
                    8. |        3     Farmer       6 |
                    9. |        3    Teacher      76 |
                       +-----------------------------+
                  
                  .
                  . tabplot occupation Category [fw=count], ///
                  >     percent(occupation) showval blcolor(blue) bfcolor(ltblue) ///
                  >     subtitle(% within occupation)
                  ---------------------------------
                  Maarten L. Buis
                  University of Konstanz
                  Department of history and sociology
                  box 40
                  78457 Konstanz
                  Germany
                  http://www.maartenbuis.nl
                  ---------------------------------


                  • #10
                    Thanks, Maarten! I really appreciate your reply. I converted all variables as you suggested and then plotted them as Nick suggested. Unfortunately, I still have a problem. I do not see how each occupation is distributed across the categories in the data.

                    Here is what I wrote:

                    ds category, not

                    local varl = r(varlist)

                    foreach var of local varl {
                        rename `var' count`var'
                    }

                    reshape long count, i(category) j(Occupation) string


                    * ssc install tabplot
                    tabplot Occupation category [fw=count], percent(category) showval blcolor(blue) bfcolor(ltblue) subtitle(%)


                    In the resulting plot, I see what percentage of category 1 is made up of occupation 1, but I do not see how each occupation is distributed across categories.


                    • #11
                      Postings are crossing here. See #5 again.


                      • #12
                        Works perfectly. Thank you!
