
  • Histogram for multiple variables

    Hey. I'm trying to create a histogram of life satisfaction for unemployed people, temporary workers and regular workers, like this:

    [Image: Unbenannt.JPG]

    The three different bars in the histogram should show (1) standard employment relationship, (2) temporary workers and (3) unemployed.
    The x-axis should show life satisfaction on a scale from 0 (not satisfied) to 10 (very satisfied).
    The y-axis should show the proportion in %.

    I have the following variables in Stata:
    - lifesatisfaction
    - temporarywork (1, 2): 1= yes= temporary worker; 0= no= standard employment relationship
    - unemployed (3)

    Could someone please help me to get the right command?

    Thanks a lot!

  • #2
    Please read "What to say about your data" in the FAQ and post an excerpt from your data with dataex.


    • #3
      Friedrich gives excellent advice. Meanwhile this is one way to do it (although I think there are much better ways to compare distributions):

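      sysuse auto, clear
      * one observation per (foreign, rep78) cell, with the cell count in _freq
      contract foreign rep78 if !missing(foreign, rep78)
      * percent within each category of foreign
      egen _percent = pc(_freq), by(foreign)
      * one percent variable per group: _percent0 and _percent1
      separate _percent, by(foreign)
      * offset the bar positions so that the two groups sit side by side
      gen rep780 = rep78 - 0.2
      gen rep781 = rep78 + 0.2
      twoway bar _percent0 rep780, base(0) barw(0.4) bc(orange) ///
      || bar _percent1 rep781, barw(0.4) bc(blue) ytitle(Percent) xtitle(Repair record 1978) ///
      xtic(0.5/5.5) xla(, tlc(none))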
      [Image: bihistogram.png]

      For three bars, I would use two offset variables, say

      gen xvar1 = x - 0.27
      gen xvar2 = x + 0.27
      and bars of width 0.27. See also

      Cui, J. 2007. Stata tip 42: The overlay problem: Offset for clarity. Stata Journal 7(1): 141--142.
      (SJ-7-1 gr0026; no commands; a tip for graphing several quantities on a continuous axis)
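
      As a minimal sketch of the three-bar version, the same recipe can be applied to a public dataset with a three-category grouping variable. The choice of nlswork, race and birth_yr here is only an assumption for illustration, as are the legend labels; the data in #1 would need lifesatisfaction and a single three-category employment variable instead.

      ```stata
      webuse nlswork, clear
      * one observation per (race, birth_yr) cell, with the cell count in _freq
      contract race birth_yr if !missing(race, birth_yr)
      * percent within each category of race
      egen _percent = pc(_freq), by(race)
      * one percent variable per group: _percent1, _percent2, _percent3
      separate _percent, by(race)
      * two offset variables; the middle group stays at the original position
      gen birth_yr1 = birth_yr - 0.27
      gen birth_yr3 = birth_yr + 0.27
      twoway bar _percent1 birth_yr1, base(0) barw(0.27) ///
      || bar _percent2 birth_yr, barw(0.27) ///
      || bar _percent3 birth_yr3, barw(0.27) ///
      ytitle(Percent) xtitle(Birth year) ///
      legend(order(1 "white" 2 "black" 3 "other") row(1))
      ```

      Note that nlswork is a panel, so the percents here are percents of person-year observations within each group.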
      Last edited by Nick Cox; 04 Sep 2017, 06:04.


      • #4
        I looked for an example all can play with in which there were 3 predictor categories and about 11 response categories and came up with this.

        webuse nlswork
        twoway histogram birth_yr, by(race, note("") xrescale row(1)) ///
            horizontal width(1) freq discrete ///
            subtitle(, fcolor(ltblue*0.2)) bfcolor(ltblue) blcolor(blue*0.5) ///
            yla(54 "1954" 41 "1941" 45 "1945" 50 "1950", ang(h))

        My point is not to make a claim about what's best, but to underline that there are choices here of various representations. What the graph in #1 is attempting is clear enough; how easy and effective it makes comparison is another question.

        [Image: birthyear_histogram.png]


        • #5
          Would this apply to my conundrum?
          I am looking to graph overlaid histograms from 2 different datasets with different sample sizes, and I think there is a problem of scale: the numbers in one dataset are too small compared with the numbers in NHANES. What would you suggest? Use a secondary axis? Reduce both to a common scale on both axes? How would I code that? See the example below: the two datasets are on different scales (x axis = bmi (continuous), y axis = percent).

          the code used:
          twoway (histogram bmi if bmi<41.87 & ageyears>20 & gender==1, color(gray) percent) ///
              (histogram bmxbmi if bmxbmi<33.63 & ridageyr>20 & riagendr==1, fcolor(none) lcolor(black) percent), ///
              legend(order(1 "dataset1" 2 "dataset2"))
          Last edited by Hannah Jackson; 01 May 2018, 12:17. Reason: adding attachment


          • #6
            With transparency in Stata 15, superimposing histograms often works well, but they have to use the same units (and the same bins) to make sense. If you post the frequencies used in your graph, concrete suggestions are likely to follow. What is often easiest is to scale first and then apply twoway bar.
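
            A minimal sketch of the transparency approach, assuming Stata 15 or later (the %40 opacity suffix on colours is not available in earlier versions). The auto data, the colours and the bin choices start(10) and width(5) are assumptions for illustration; the point is that both histograms share the same start() and width(), so their bins line up.

            ```stata
            sysuse auto, clear
            * same start() and width() in both calls so that the bins coincide
            twoway histogram mpg if foreign == 0, percent start(10) width(5) color(orange%40) ///
            || histogram mpg if foreign == 1, percent start(10) width(5) color(blue%40) ///
            xtitle("Mileage (mpg)") legend(order(1 "Domestic" 2 "Foreign"))
            ```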


            • #7
              Thank you, Nick. Would it work in Stata 13 or 14? That is what I have to use.


              • #8
                I guess my question is then "how to scale"? Many thanks (from a Stata novice)!



                • #9
                  You tell us that you have Stata 13 or 14. That's not a large constraint on any solution.

                  Scaling is a matter of multiplication or division: to put, say, inches and mm on the same scale, a conversion factor of 25.4 comes into play. Again, show an example to get precise advice. I can guess that one graph of yours uses 3 year bins (or is it 4?) and the other finer bins, but Statalist is not a forum which works well when we have to guess at your data or your code.
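
                  In the meantime, here is a hedged sketch of "scale first, then twoway bar" that needs no transparency and so should run in Stata 13 or 14. It assumes both groups are held in one dataset and uses the auto data as a stand-in; the bin width of 5 and the variable names bin and pct are hypothetical choices. Both groups are put into the same bins, and percents are calculated within each group, so the two sets of bars are on a common scale whatever the sample sizes.

                  ```stata
                  sysuse auto, clear
                  * common bins of width 5, each identified by its midpoint
                  gen bin = 5 * floor(mpg / 5) + 2.5
                  * one observation per (foreign, bin) cell, with the cell count in _freq
                  contract foreign bin
                  * percent within each group, so different sample sizes are comparable
                  egen pct = pc(_freq), by(foreign)
                  * filled bars for one group, outlined bars for the other
                  twoway bar pct bin if foreign == 0, barw(5) color(gs10) ///
                  || bar pct bin if foreign == 1, barw(5) fcolor(none) lcolor(black) ///
                  ytitle(Percent) xtitle("Mileage (mpg)") ///
                  legend(order(1 "Domestic" 2 "Foreign"))
                  ```

                  With two separate datasets, one route would be to append them first with a group indicator and a common name for the measured variable.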

                  Being a novice is fine, but you have to give detail in a question if you expect detail in an answer. Already in this thread Friedrich asked the OP for an example in #2 and I commented similarly in #3. (OP has yet to reply, come to think of it!)