  • Histogram over groups

    I wanna replicate Figure 1 of this paper.
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float WAGE_ST byte STATE
       . 0
       . 0
       . 0
       5 0
     5.5 0
       5 0
       5 0
       5 0
    5.25 0
       5 0
       5 0
       5 0
       5 0
     5.5 0
       5 0
    5.25 0
       5 0
       5 0
       5 0
       5 0
       5 0
       5 0
       5 0
       5 0
       5 0
       5 0
       5 0
    4.25 0
    4.25 0
    4.25 0
    4.25 0
     4.5 0
    4.25 0
     4.5 0
    4.25 0
     4.5 0
    4.25 0
    4.67 0
    4.75 0
    4.25 0
    4.25 0
    4.75 0
    4.25 0
     4.5 0
    4.25 0
    4.25 0
    4.25 0
    4.25 0
    4.25 0
    4.25 0
    4.25 0
    4.25 0
     4.5 0
    4.75 0
    4.87 0
    4.75 0
     4.5 0
    4.25 0
    4.75 0
    4.25 0
    4.25 0
    4.75 0
     4.5 0
    4.25 0
    4.25 0
     4.5 0
     4.5 0
     4.5 0
     4.5 0
     4.5 0
    4.75 0
    4.25 0
    4.75 0
    4.25 0
    4.75 0
     4.5 0
    4.75 0
    4.25 0
    4.35 0
       . 1
       . 1
       . 1
       . 1
       . 1
       . 1
       . 1
       . 1
       . 1
       . 1
       . 1
       . 1
       . 1
       . 1
       . 1
       . 1
       . 1
       5 1
    5.12 1
    5.56 1
       5 1
    end
    
    twoway histogram WAGE_ST if STATE ==1, percent bcolor(red*0.5) || histogram WAGE_ST if STATE == 0, percent ///
    legend(order(1 "NJ" 2 "PA")) bcolor(blue*0.5)
    My code overlays the two histograms, but in the paper the bars for the groups are side by side. gr bar has the over() option, which does this very nicely, but the histogram command lacks it. Is there a workaround for this? I know it isn't the end of the world, since I can just use by(), but I was curious whether there is a way to do this here.
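    For reference, the by() fallback mentioned above could look like the minimal sketch below. It uses a small subset of the dataex values so that it runs on its own. The value labels (0 = PA, 1 = NJ) are an assumption read off the legend() text in the code above, and width(0.1) is an illustrative bin width, not something taken from the paper.

    ```stata
    * Sketch only: a small subset of the dataex values, so the block runs on its own
    clear
    input float WAGE_ST byte STATE
    4.25 0
    4.5  0
    4.75 0
    5    0
    5    1
    5.12 1
    5.56 1
    end

    * Assumed labels, inferred from the legend text (1 = NJ, 0 = PA)
    label define statelbl 0 "PA" 1 "NJ"
    label values STATE statelbl

    * One panel per state, percent scale, same bin width in both panels
    histogram WAGE_ST, by(STATE) percent width(0.1)
    ```

    This gives one panel per state rather than paired bars within each bin, which is why it is only a fallback.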

  • #2
    Various ways to do it. Here is one.

    Code:
     
    * Data example as in #1 (run the -dataex- block there first), then:
    gen wage_group = 0.1 * floor(WAGE_ST/0.1) + 0.05
    bysort STATE (wage_group) : gen total = _N 
    by STATE wage_group : gen percent = 100 * _N / total 
    gen wage_pos = wage_group + cond(STATE == 0, -0.025, 0.025) 
    separate percent, by(STATE) veryshortlabel 
    twoway bar percent? wage_pos if wage_pos < ., barw(0.04 ..) xtitle("WAGE_ST") ytitle(%, orient(horiz)) name(G1, replace) xtick(4.2(0.1)5.6)
    
    qplot WAGE_ST, over(STATE) connect(J J) name(G2, replace) legend(order(2 1)) xla(0(0.25)1, grid glp(solid) glw(vthin) glc(gs12))
    A histogram is perhaps oversold here, even granting that economists of Nobel stature used one. In my view a quantile plot is more direct for comparing distributions. qplot is from the Stata Journal.
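
    To see what the binning step in the code above does, here is a minimal sketch on a handful of wage values taken from the data example. It only repeats the two gen statements from that code and lists the results; nothing else is assumed. The first statement assigns each wage to the midpoint of its 0.1-wide bin (so 4.67 falls in the bin from 4.6 to 4.7, with midpoint 4.65), and the second shifts each state's bar a quarter of a bin to the left or right so that the two bars sit side by side.

    ```stata
    * Sketch only: a few wage values taken from the dataex example
    clear
    input float WAGE_ST byte STATE
    4.25 0
    4.67 0
    4.87 0
    5.12 1
    5.56 1
    end

    * Lower edge of the 0.1-wide bin plus half a bin width = bin midpoint
    gen wage_group = 0.1 * floor(WAGE_ST/0.1) + 0.05

    * Offset by state: STATE 0 goes left of the midpoint, STATE 1 goes right
    gen wage_pos = wage_group + cond(STATE == 0, -0.025, 0.025)

    list WAGE_ST STATE wage_group wage_pos, sepby(STATE)
    ```

    With bar centres 0.05 apart within a bin, the barw(0.04 ..) setting in the twoway bar call leaves a small gap between the two bars of each pair.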

    [Image: ck1.png]

    [Image: ck2.png]




    • #3
      You could use the last example here:
      https://friosavila.github.io/stataviz/stataviz5.html



      • #4
        These were most helpful. Thank you so much!
