  • Plotting the difference of two kernel densities

    Dear All,

    I am trying to plot the difference of two kernel densities over quantiles.

    Here is what I am doing first:
    Code:
    use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
    
    twoway (kdensity lnwage if female == 0 [aweight=wt]) (kdensity lnwage if female == 1 [aweight=wt]), ytitle(Density) xtitle(Log Hourly Wages) legend(order(1 "Male" 2 "Female"))
    So these are the male and female wage distributions. I want to plot the difference over various quantiles and obtain a graph like figure 3 in this paper:
    https://www.sciencedirect.com/scienc...47596707000844

    Here is what I have tried:
    Code:
    kdensity lnwage if female == 0 [aweight=wt], gen(male_x male_d)
    kdensity lnwage if female == 1 [aweight=wt], gen(fem_x fem_d)
    After this, I am unsure. Can someone please suggest the next steps or point to a more direct way of doing this?

    Thanks in advance.

  • #2
    I'd start by estimating the densities at the same grid of points. This may help; I may be missing a smarter way to do it. 0.25 and 5.5 were just points a bit beyond the empirical range and 106 is a matter of taste rather than technique.

    Code:
    use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
    
    range whereat 0.25 5.5 106 
    
    kdensity lnwage if female == 0 [aweight=wt], at(whereat) gen(garbage1 male_d)
    kdensity lnwage if female == 1 [aweight=wt], at(whereat) gen(garbage2 fem_d)
    
    gen diff = male_d - fem_d
    line *_d diff whereat
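
    The hard-coded endpoints 0.25 and 5.5 could instead be derived from the data. A minimal sketch, assuming a padding of 0.25 on each side (an arbitrary choice, like the 106 grid points); the local macro names lo and hi are hypothetical:

    Code:
    use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
    
    * observed range of lnwage; the 0.25 padding is an assumption
    summarize lnwage, meanonly
    local lo = r(min) - 0.25
    local hi = r(max) + 0.25
    
    * one evaluation grid shared by both groups
    range whereat `lo' `hi' 106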

    • #3
      Originally posted by Nick Cox
      I'd start by estimating the densities at the same grid of points. This may help; I may be missing a smarter way to do it. 0.25 and 5.5 were just points a bit beyond the empirical range and 106 is a matter of taste rather than technique.

      Code:
      use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
      
      range whereat 0.25 5.5 106
      
      kdensity lnwage if female == 0 [aweight=wt], at(whereat) gen(garbage1 male_d)
      kdensity lnwage if female == 1 [aweight=wt], at(whereat) gen(garbage2 fem_d)
      
      gen diff = male_d - fem_d
      line *_d diff whereat
      Thank you so much!!!

      Just a small query: should 106 depend on how many observations I have, or is it just arbitrary?
      Last edited by Alina Faruk; 23 Jul 2019, 12:51.

      • #4
        Also, what if I wanted to plot the difference of lnwage_male-lnwage_female over quantiles?

        Code:
         
        use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
        kdensity lnwage if female == 0 [aweight=wt], gen(male_x male_d)
        kdensity lnwage if female == 1 [aweight=wt], gen(fem_x fem_d)
        * the second call creates fem_x, not female_x
        gen gap = male_x - fem_x
        How can I plot the variable gap over the quantiles?
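
        One possible reading of "difference over quantiles" is the male minus female gap in lnwage at each percentile. Note that male_x and fem_x from kdensity hold the points at which each density was evaluated, not quantiles, so their difference is not a quantile gap. A minimal sketch, assuming pctile with 100 quantiles is what is wanted; the variable names male_q, fem_q and pct are hypothetical, and this may or may not correspond to figure 3 in the linked paper:

        Code:
        use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
        
        * weighted percentiles of lnwage by group, stored in the first 99 observations
        pctile male_q = lnwage if female == 0 [aweight=wt], nquantiles(100) genp(pct)
        pctile fem_q  = lnwage if female == 1 [aweight=wt], nquantiles(100)
        
        * gap at each percentile, plotted against the percentile
        gen gap = male_q - fem_q
        line gap pct, yline(0) ytitle("Male - female log wage") xtitle("Percentile")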
