  • Bar Charts in percentage

    Hi everyone,

    I have a simple query. I have my dataset in the following form.

    Number sex Income
    200,1, 500
    300,1, 345
    400,1, 500
    .....
    200,2,400
    300,2,460
    .......

    I want to draw a bar chart of Number over Income for each sex.

    I used the following code

    graph bar (mean) Number if sex==1, over (Income)

    The bar chart I got is great. Now I also want the Y-axis (Number) and the respective bars to be in percentages rather than absolute numbers. I have not been able to write simple code to do this.

    Any help will be appreciated.

    Thanks a ton

    Best Regards,
    Pushkar

  • #2
    Percent of what? The whole dataset? Each income group? Each sex? Each income and sex group? Depending on the answer, catplot (SSC) may help.

    Consider this worked example:

    Code:
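    sysuse auto, clear
    * indicator: 1 if mpg > 25, 0 otherwise
    gen himpg = mpg > 25
    label def himpg 0 low 1 high
    label val himpg himpg
    * install catplot from SSC (only needed once)
    ssc inst catplot
    * percent of the whole dataset
    catplot himpg foreign rep78, percent
    * percent within each category of foreign
    catplot himpg foreign rep78, percent(foreign)
    * percent within each category of rep78
    catplot himpg foreign rep78, percent(rep78)
    * percent within each combination of foreign and rep78
    catplot himpg foreign rep78, percent(foreign rep78)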


    • #3
      In reply to Nick Cox (#2):

      Dear Nick,

      Thanks for the reply. Sorry if my question was not clear enough. Let me explain more clearly. Dataset is as follows:

      Number sex Income
      200,1, 500
      300,1, 345
      400,1, 500
      .....
      200,2,400
      300,2,460

      For sex 1, sum of all the observations in Number variable is 200+300+400=900
      By percent I mean, each observation of Number variable as a percent of sum of all the observations in the Number Variable.

      For example, First Observation = 500/900 = 55%
      Second Observation = 300/900 = 33% and so on


      The bar chart I want to draw will have Income as a categorical variable (on the X axis) and the individual bars as percentages (each observation / sum of all observations) of the Number variable on the Y axis. The code "graph bar (mean) Number if sex==1, over (Income)" gives the values of individual observations of the Number variable on the Y axis. Is it possible to show the Y variable in percentages instead of absolute values?

      Thanks a ton for your time. Have a great day

      Best Regards,
      Pushkar


      • #4
        I can't see that you even tried to relate my answer to your question. The code is code that you can run!

        Please study http://www.statalist.org/forums/help#stata for advice on giving good data examples and showing your code with CODE delimiters.

        Your example makes limited sense to me.

        Number sex Income
        200,1, 500
        300,1, 345
        400,1, 500
        .....
        200,2,400
        300,2,460

        For sex 1, sum of all the observations in Number variable is 200+300+400=900
        By percent I mean, each observation of Number variable as a percent of sum of all the observations in the Number Variable.

        For example, First Observation = 500/900 = 55%
        Second Observation = 300/900 = 33% and so on
        In terms of

        Code:
        clear
        input Number sex Income
        200 1 500
        300 1 345
        400 1 500
        200 2 400
        300 2 460
        end
        * total of Number within each sex
        egen TotalNumber = total(Number), by(sex)
        list, sepby(sex)
        
             +----------------------------------+
             | Number   sex   Income   TotalN~r |
             |----------------------------------|
          1. |    200     1      500        900 |
          2. |    300     1      345        900 |
          3. |    400     1      500        900 |
             |----------------------------------|
          4. |    200     2      400        500 |
          5. |    300     2      460        500 |
             +----------------------------------+
        I get that the sum of Number for sex 1 is 200 + 300 + 400 = 900, but then 500/900 is Income[1]/900 and 300/900 is Number[2]/900 so I can't follow you there.

        At a very wild guess, you (should) just want a histogram. I can't see that treating income as a categorical variable is a good idea at all.

        Code:
        histogram Income [fw=Number] , by(sex) frequency width(50) start(300)
        spikeplot Income [fw=Number] , by(sex)
        The graphs look pretty silly with your example data, but presumably it's the principle that counts.
        Last edited by Nick Cox; 22 Sep 2016, 03:25.


        • #5
          Dear Nick,

          Extremely sorry. My mistake.

          You wrote "I get that the sum of Number for sex 1 is 200 + 300 + 400 = 900, but then 500/900 is Income[1]/900 and 300/900 is Number[2]/900 so I can't follow you there."

          I meant 200/900 for Number[1] (not 500/900, as I mistakenly wrote in my earlier post), 300/900 for Number[2], and so on.

          I tried using the code you gave in your first feedback but it didn't work out.

          I will try to do what you have suggested in your second reply

          Thanks a ton and sorry for the mistake

          Best Regards
          Pushkar
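
          With the correction above (200/900 for Number[1], 300/900 for Number[2], and so on), the percentages asked for could be sketched roughly as follows. This is only one possible reading of the question, not a solution confirmed in the thread. It reuses the example data and the TotalNumber variable from #4; the variable name PctNumber and the axis title are made up for illustration.

          Code:
          clear
          input Number sex Income
          200 1 500
          300 1 345
          400 1 500
          200 2 400
          300 2 460
          end
          * total of Number within each sex (900 for sex 1, 500 for sex 2)
          egen TotalNumber = total(Number), by(sex)
          * each observation as a percent of its sex total, e.g. 100 * 200/900
          * (PctNumber is a hypothetical name)
          gen PctNumber = 100 * Number / TotalNumber
          * (sum) rather than (mean), so that repeated Income values are added together
          graph bar (sum) PctNumber if sex == 1, over(Income) ytitle("Percent of total Number")

          Using (sum) instead of (mean) is a deliberate choice here: Income 500 occurs twice for sex 1, and summing means the bars for each sex add up to 100. With (mean), the two observations at Income 500 would be averaged instead.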
