  • Oaxaca decomposition interpretation - pooled method and reference category

    Hello,

    I'm using the Blinder-Oaxaca decomposition to study the wage differential between males and females. I used the logarithm of hourly wages as the outcome and report the pooled method. I have read a lot of research papers, including Jann's paper, and I feel like I understand 100% what the decomposition equations mean. However, I'm a little confused about 1) whether my results make sense; 2) how to interpret these coefficients; and 3) whether I need to pick a reference category. I thought we are always comparing the same category between males and females: for example, for the married variable I'm comparing married females to married males, so wouldn't married males automatically be the reference? The same goes for educational attainment: when reporting a female with a bachelor degree, isn't a male with a bachelor degree automatically the reference group? Looking at one of my results below, I would draw the following conclusions, but I'm not sure that's the right way to do it. Am I using percentage points versus percent correctly?

    1- The total gender pay gap in 2001, calculated with hourly wages was 28 percentage points.

    2- Differences in the characteristics of males and females explain less than half of the total gap (e.g. 0.0775254/0.2759044 = 28 percent).

    3- Age is estimated to explain 0.15808 percentage points of the total wage gap of 0.2759044, or 57.3% of the total gap. However, the effect of age tapers over time (looking at age squared having a negative sign).

    4- Having a high school degree (a female with a high school compared to a male with high school) is also contributing to an increase in the gender wage gap by 0.0002263 percentage points (or 0.0002263/0.2759044 = 0.08 percent)

    5- Completing a bachelor’s degree (a female with a bachelor degree compared to a male with a bachelor degree) will reduce the gender wage gap by 4 percent (-0.0113391/0.2759044).

    6- I can do the same with single females/single males and married females/married males, right?

    7- Looking at working full time (FTW), is this coefficient comparing females working full time versus males working full time, or females working full time versus females working part time (which I think would be wrong)? How would this be interpreted?




    Blinder-Oaxaca decomposition                    Number of obs     =     30,330
                                                      Model           =     linear
    Group 1: female = 0                               N of obs 1      =      15540
    Group 2: female = 1                               N of obs 2      =      14790

    ------------------------------------------------------------------------------
                 |                Robust
        lnHwages |      Coef.  Std. Err.      z     P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    overall      |
         group_1 |   10.28898   .0040482  2541.60   0.000     10.28105    10.29692
         group_2 |   10.01308   .0041629  2405.33   0.000     10.00492    10.02124
      difference |   .2759044   .0058067    47.51   0.000     .2645235    .2872853
       explained |   .0775254   .0030875    25.11   0.000     .0714741    .0835767
     unexplained |    .198379   .0054102    36.67   0.000     .1877752    .2089828
    -------------+----------------------------------------------------------------
    explained    |
         hschool |   .0002263   .0001441     1.57   0.116     -.000056    .0005087
      Bachdegree |  -.0113391   .0009713   -11.67   0.000    -.0132429   -.0094353
         Married |   .0078862   .0011745     6.71   0.000     .0055842    .0101881
          Single |  -.0018602   .0005965    -3.12   0.002    -.0030293   -.0006911
             kid |  -.0003125   .0001162    -2.69   0.007    -.0005403   -.0000848
             age |     .15808   .0074566    21.20   0.000     .1434652    .1726947
            age2 |   -.145693   .0064581   -22.56   0.000    -.1583506   -.1330354
             FTW |   .0652989   .0015793    41.35   0.000     .0622035    .0683942
    -------------+----------------------------------------------------------------
    unexplained  |
         hschool |   .0234703   .0026166     8.97   0.000     .0183419    .0285987
      Bachdegree |   .0008703   .0014974     0.58   0.561    -.0020646    .0038053
         Married |   .0462083   .0435611     1.06   0.289    -.0391699    .1315866
          Single |   .0074362   .0052602     1.41   0.157    -.0028736     .017746
             kid |   .0080418    .007342     1.10   0.273    -.0063482    .0224318
             age |   .4657228   .1505767     3.09   0.002     .1705979    .7608478
            age2 |  -.0623104   .0809454    -0.77   0.441    -.2209605    .0963397
             FTW |   .1212236    .016439     7.37   0.000     .0890037    .1534436
           _cons |  -.3995266   .0858276    -4.65   0.000    -.5677457   -.2313076
    ------------------------------------------------------------------------------

    Any help will be greatly appreciated

    Thank you

  • #2
    I would really appreciate any help with my questions, especially about how to figure out the reference category for the Oaxaca decomposition. Thank you.



    • #3
      Another couple of questions please:

      1- When trying to get the female and male wage equations in Stata, is it just a regular regression such as: by gender: regress lnhwages education marital status children full time work

      2- I'm looking to get the average hourly wages for men versus the average hourly wages for women, so I use: sum wage if gender==1 and sum wage if gender==0. Let's say I get a ratio of $15/$30 = 50 percent. Is that called the wage gap too? When I run the oaxaca decomposition, the difference (explained + unexplained) I'm getting is not the same as 50% (it is usually less). Does that make sense, or am I doing something wrong?


      Just want to mention that I do not have my results for the above because, for privacy reasons, the university won't let me take any of my results out until they have been vetted (a long process). I would appreciate any help. Thank you



      • #4
        Hello, I still have not received a reply and I would really appreciate any help. I have my results now (below) and was hoping to get help interpreting the decomposition (wage gap between females and males). This is what I have:

        - An indicator variable for education (Ibach_1) is set equal to 1 if the individual has a bachelor degree or higher and 0 if the individual has less than a bachelor degree.
        - An indicator variable for marital status (Imar_1) is set equal to 1 if the individual is married and 0 otherwise.
        - For the presence of children, an indicator variable (Ichild_1) is set to 1 if there are children in the house and 0 otherwise.
        - An indicator variable (Iwork_1) is set to 1 if an individual is working full time (30 hours or more per week) and 0 otherwise.

        I think that the interpretation would be for example,

        - Being married explains 0.0016 of the total 0.527 log wage differential between males and females (is that correct?). What is confusing me here is: does that mean that we are comparing a married female with a married male, or are we comparing being married in general (male or female) to my reference group (0 = otherwise)?

        - Same with working full time: will the interpretation be something like "working full time explains 0.156 of the 0.527" (meaning that I'm comparing a female working full time with a male working full time), or am I comparing working full time versus working part time?

        - The coefficient on age squared is negative, which, from my understanding, means that the effect of age tapers over time. Does that mean that the gender wage gap will decrease as females and males get older? That is, will the gap between, say, a 45-year-old male and a 45-year-old female be lower than the gap between males and females who are 30 years old? And if that's correct, how would I interpret the -0.1007 coefficient?


        I would really appreciate if I can get any help.

        Thank you
        ------------------------------------------------------------------------------
                     |                Robust
            lnhwages |      Coef.  Std. Err.      z     P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
        overall      |
             group_1 |    2.44436    .003485   701.39   0.000     2.437529     2.45119
             group_2 |   1.917152   .0040444   474.03   0.000     1.909226    1.925079
          difference |   .5272072   .0053388    98.75   0.000     .5167433     .537671
           explained |   .1402948   .0027205    51.57   0.000     .1349627    .1456269
         unexplained |   .3869123   .0048965    79.02   0.000     .3773153    .3965094
        -------------+----------------------------------------------------------------
        explained    |
                 age |   .0990073   .0052733    18.78   0.000     .0886719    .1093427
                age2 |  -.1007936    .005107   -19.74   0.000    -.1108031   -.0907841
             _Imar_1 |   .0016793   .0003274     5.13   0.000     .0010376    .0023209
            _Ibach_1 |  -.0152759   .0011953   -12.78   0.000    -.0176185   -.0129332
           _Ichild_1 |  -.0009208   .0002129    -4.32   0.000    -.0013381   -.0005034
            _Iwork_1 |   .1565985   .0022097    70.87   0.000     .1522675    .1609294
        -------------+----------------------------------------------------------------
        unexplained  |
                 age |  -.0424122   .2000427    -0.21   0.832    -.4344887    .3496642
                age2 |   .0592419   .1034631     0.57   0.567    -.1435421    .2620259
             _Imar_1 |    .168149   .0087593    19.20   0.000     .1509812    .1853169
            _Ibach_1 |  -.0275075   .0017457   -15.76   0.000     -.030929   -.0240859
           _Ichild_1 |   .0226635   .0068612     3.30   0.001     .0092157    .0361112
            _Iwork_1 |   .0283806    .016019     1.77   0.076    -.0030161    .0597772
               _cons |   .1783971   .0967319     1.84   0.065    -.0111939    .3679881
        ------------------------------------------------------------------------------



        • #5
          Any help please?



          • #6
            .



            • #7
              You probably aren't getting answers here because your questions have little to do with Stata per se. Regarding the base category problem, you can re-read the section of the oaxaca help file on the normalisation of categorical variables:
              Code:
              help oaxaca##norm
              In my interpretation, it means that getting the coefficients for categorical variables correct is tricky. What you get is
              ... "normalized" effects, i.e. effects that are expressed as deviation contrasts from the grand mean ...
              Overall, I think that your interpretations are correct.
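              One caveat on the units, offered as a sketch rather than a definitive reading: because the outcome is the log wage, the decomposition terms are log points rather than percentage points, and the share each term takes of the gap is a plain ratio of the reported coefficients. The commands below only redo that arithmetic with the numbers from the first table. Adding age and age2 before taking the share is my assumption about how to read a quadratic, since the two terms describe one variable.
              Code:
              * shares of the total gap, in percent (explained part, first table)
              display "explained share:  " %5.1f (100 * .0775254 / .2759044)
              display "bachelor share:   " %5.1f (100 * -.0113391 / .2759044)
              * age and age2 describe one variable, so add them before taking the share
              display "age + age2 share: " %5.1f (100 * (.15808 - .145693) / .2759044)
              * convert the gap from log points to a percent difference in wages
              display "gap in percent:   " %5.1f (100 * (exp(.2759044) - 1))
              On this reading the explained part is about 28% of the gap and the bachelor term about -4%, but age and age2 together account for roughly 4.5% rather than 57%, and a gap of 0.276 log points corresponds to men earning about 32% more. If I read the help file correctly, the detail() option of oaxaca can also report a set such as age and age2 as a single term.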
              "What is confusing me here is: does that mean that we are comparing a married female with a married male ..."
              Remember how dummy variables work in regressions. They are constants like the intercept, so your first idea should be correct.
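              One way to check what the explained term of a dummy measures is to rebuild it by hand. The sketch below assumes that the pooled variant takes its reference coefficients from one regression on both groups that includes the group indicator, and that the explained contribution of a regressor is the gap in its group means multiplied by that coefficient. The variable names (married, bach, child, ftw, female) are placeholders, not the names in the posted data.
              Code:
              * pooled reference coefficients (group indicator included)
              regress lnhwages age age2 married bach child ftw female, vce(robust)
              scalar b_married = _b[married]
              * share married in each group, on the estimation sample
              summarize married if female == 0 & e(sample), meanonly
              scalar share_m = r(mean)
              summarize married if female == 1 & e(sample), meanonly
              scalar share_f = r(mean)
              * gap in shares valued at the pooled coefficient
              display "explained (married) = " (share_m - share_f) * b_married
              If this reproduces the Married line of the explained block, that term reflects how much more often one group is married than the other, valued at the pooled return to being married. The contrast between a married man and a married woman is closer to what the unexplained term for the dummy picks up, and that detailed term depends on the omitted category, which is the problem the normalisation section of the help file addresses.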

              "1- When trying to get the female and male wage equations in Stata, is it just a regular regression such as: by gender: regress lnhwages education marital status children full time work"
              If you look at the code of the oaxaca command, you should see that instead of
              Code:
              bysort gender: regress lnhwages education marital_status children full_time_work
              it is rather
              Code:
              * gender is coded 0/1 in this thread; the variable names are placeholders
              forvalues g = 0/1 {
                  regress lnhwages education marital_status children full_time_work if gender == `g'
              }
              "2- I'm looking to get the average hourly wages for men versus the average hourly wages for women, so I use: sum wage if gender==1 and sum wage if gender==0. Let's say I get a ratio of $15/$30 = 50 percent. Is that called the wage gap too? When I run the oaxaca decomposition, the difference (explained + unexplained) I'm getting is not the same as 50% (it is usually less). Does that make sense, or am I doing something wrong?"
              You should get roughly the same result when you restrict the summary to the observations that enter your regression, i.e. those with no missing values on any of the model variables.
              Code:
              bysort gender: summarize wage if !missing(lnhwages, education, marital_status, children, full_time_work)
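              Part of the remaining difference may come from the scale rather than the sample. A ratio of average wages such as $15/$30 compares arithmetic means, while the difference line in the oaxaca output is a difference of mean log wages, which corresponds to a ratio of geometric means. A minimal sketch using the overall block from post #4:
              Code:
              * female-to-male ratio of geometric mean wages implied by the log gap
              display exp(-.5272072)
              * the same ratio from the two group means of log wages
              display exp(1.917152 - 2.44436)
              Both lines give a ratio of about 0.59, so the decomposition's gap and a ratio of raw average wages need not coincide even on identical observations.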



              • #8
                Thank you so much. Really appreciate it.

