
  • Confusing Oaxaca output

    Hello. I am using the "oaxaca" package from Ben Jann and I've read the related Stata Journal article (https://journals.sagepub.com/doi/pdf...867X0800800401). I'm using Stata/MP 16.1. Data are from the American Community Survey.

    The goal is to explain differences in the log hourly wages of men and women (male=1 and female=2). Below is output from a two-fold decomposition, both with and without exponentiated coefficients (eform option).

    Questions:
    1. Should the values of "explained" and "unexplained" add up to the value of the difference when using exponentiated coefficients? If they do not, why might that be? In the Stata Journal article about oaxaca, in the examples of two-fold decomposition (including the example using exponentiated coefficients), the values of "explained" and "unexplained" sum to the difference. In this case, they do in the version without exponentiated coefficients, but not in the exponentiated version (17.92% unexplained plus negative 8.2% explained does not equal 8% difference, though it's close).

    2. The value of "unexplained" is larger than the value of the difference. How should I interpret that?

    3. The value of "explained" seems to suggest that, if women had the same values of the other predictors as men, their average earnings would be about 8% less than they currently are (exponentiated coefficients). How should I interpret this? As evidence that the effect of the predictors on wages is different for women than for men?





    WITH EXPONENTIATED COEFFICIENTS

    . oaxaca ln_hourly race_eth3 age age_sq marital2 has_child edatt4 stem2, by(sex) eform pooled s
    > vy(,subpop(if $allstemfocal))
    (running oaxaca on estimation sample)

    BRR replications (80)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    .................................................. 50
    ..............................

    Blinder-Oaxaca decomposition Number of obs = 373,559
    Population size = 7,496,874
    Subpop. no. obs = 7,545
    Subpop. size = 168,977
    Replications = 80
    Design df = 79

    ------------------------------------------------------------------------------
    | BRR *
    ln_hourly | exp(b) Std. Err. t P>|t| [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    overall |
    group_1 | 28.05908 .4327067 216.22 0.000 27.21088 28.93371
    group_2 | 25.93025 .4280831 197.19 0.000 25.09202 26.79648
    difference | 1.082098 .0233501 3.66 0.000 1.036605 1.129588
    explained | .917648 .0113864 -6.93 0.000 .8952616 .9405942
    unexplained | 1.179209 .0204376 9.51 0.000 1.139222 1.220598
    -------------+----------------------------------------------------------------
    explained |
    race_eth3 | -.0082964 .0020902 -3.97 0.000 -.0124568 -.0041361
    age | .0324024 .0244359 1.33 0.189 -.016236 .0810408
    age_sq | -.0272928 .0206412 -1.32 0.190 -.068378 .0137925
    marital2 | .0124589 .0026078 4.78 0.000 .0072683 .0176496
    has_child | .001147 .000957 1.20 0.234 -.0007578 .0030518
    edatt4 | -.0795574 .0063336 -12.56 0.000 -.0921642 -.0669507
    stem2 | -.0168032 .0044118 -3.81 0.000 -.0255847 -.0080217
    -------------+----------------------------------------------------------------
    unexplained |
    race_eth3 | -.1061984 .0314907 -3.37 0.001 -.1688791 -.0435178
    age | -.4272993 .529228 -0.81 0.422 -1.480701 .6261027
    age_sq | .2249712 .2747018 0.82 0.415 -.321809 .7717514
    marital2 | .1002231 .0658463 1.52 0.132 -.0308407 .231287
    has_child | .0379331 .0186572 2.03 0.045 .000797 .0750693
    edatt4 | -.0093948 .0523455 -0.18 0.858 -.1135859 .0947963
    stem2 | -.0844023 .053803 -1.57 0.121 -.1914946 .0226899
    _cons | .4290109 .2617127 1.64 0.105 -.0919152 .949937
    ------------------------------------------------------------------------------
    Note: Estimates are transformed only in the first equation.


    WITHOUT EXPONENTIATED COEFFICIENTS


    . oaxaca ln_hourly race_eth3 age age_sq marital2 has_child edatt4 stem2, by(sex) pooled svy(,su
    > bpop(if $allstemfocal))
    (running oaxaca on estimation sample)

    BRR replications (80)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    .................................................. 50
    ..............................

    Blinder-Oaxaca decomposition Number of obs = 373,559
    Population size = 7,496,874
    Subpop. no. obs = 7,545
    Subpop. size = 168,977
    Replications = 80
    Design df = 79

    ------------------------------------------------------------------------------
    | BRR *
    ln_hourly | Coef. Std. Err. t P>|t| [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    overall |
    group_1 | 3.334312 .0154213 216.22 0.000 3.303617 3.365007
    group_2 | 3.25541 .016509 197.19 0.000 3.22255 3.28827
    difference | .0789021 .0215785 3.66 0.000 .0359511 .121853
    explained | -.0859414 .0124082 -6.93 0.000 -.1106394 -.0612435
    unexplained | .1648435 .0173316 9.51 0.000 .1303458 .1993413
    -------------+----------------------------------------------------------------
    explained |
    race_eth3 | -.0082964 .0020902 -3.97 0.000 -.0124568 -.0041361
    age | .0324024 .0244359 1.33 0.189 -.016236 .0810408
    age_sq | -.0272928 .0206412 -1.32 0.190 -.068378 .0137925
    marital2 | .0124589 .0026078 4.78 0.000 .0072683 .0176496
    has_child | .001147 .000957 1.20 0.234 -.0007578 .0030518
    edatt4 | -.0795574 .0063336 -12.56 0.000 -.0921642 -.0669507
    stem2 | -.0168032 .0044118 -3.81 0.000 -.0255847 -.0080217
    -------------+----------------------------------------------------------------
    unexplained |
    race_eth3 | -.1061984 .0314907 -3.37 0.001 -.1688791 -.0435178
    age | -.4272993 .529228 -0.81 0.422 -1.480701 .6261027
    age_sq | .2249712 .2747018 0.82 0.415 -.321809 .7717514
    marital2 | .1002231 .0658463 1.52 0.132 -.0308407 .231287
    has_child | .0379331 .0186572 2.03 0.045 .000797 .0750693
    edatt4 | -.0093948 .0523455 -0.18 0.858 -.1135859 .0947963
    stem2 | -.0844023 .053803 -1.57 0.121 -.1914946 .0226899
    _cons | .4290109 .2617127 1.64 0.105 -.0919152 .949937
    ------------------------------------------------------------------------------









  • #2
    Edit: In the Stata Journal example using exponentiated coefficients, explained and unexplained do not actually add up exactly to the difference, though it's close (as in my result). Thanks in advance for any assistance with these interpretations!
    Last edited by Serena Hinz; 03 Sep 2020, 18:29.



    • #3
      1. Should the values of "explained" and "unexplained" add up to the value of the difference when using exponentiated coefficients? If they do not, why might that be? In the Stata Journal article about oaxaca, in the examples of two-fold decomposition (including the example using exponentiated coefficients), the values of "explained" and "unexplained" sum to the difference. In this case, they do in the version without exponentiated coefficients, but not in the exponentiated version (17.92% unexplained plus negative 8.2% explained does not equal 8% difference, though it's close).
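      I believe that "explained" and "unexplained" should add up to the difference on the log scale. In the exponentiated version they have to be multiplied rather than added to get the exponentiated difference: .917648 x 1.179209 is approximately 1.0821, which matches the reported difference of 1.082098 up to rounding.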
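
      The relationship can be checked by hand. This is only a minimal sketch: the two values are typed in from the "overall" rows of the non-exponentiated output in the first post, and the scalar names b_expl and b_unexpl are made up for illustration.

      ```stata
      * Log-scale estimates copied by hand from the "overall" rows
      * of the output without the eform option.
      scalar b_expl   = -.0859414
      scalar b_unexpl =  .1648435

      * Log scale: the two components add up to the difference
      * (reported as .0789021).
      display b_expl + b_unexpl

      * Exponentiated scale: the components multiply, because
      * exp(a + b) = exp(a) * exp(b). Both lines should reproduce the
      * reported exp(b) for the difference (1.082098) up to rounding.
      display exp(b_expl) * exp(b_unexpl)
      display exp(b_expl + b_unexpl)
      ```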

      2. The value of "unexplained" is larger than the value of the difference. How should I interpret that?
      I don't think that you can interpret this value on its own, but only together with the other two values. Taken together, they tell you that the negative endowment (explained) effect offsets part of the unexplained gap: .1648 - .0859 = .0789 on the log scale.

      3. The value of "explained" seems to suggest that, if women had the same values of the other predictors as men, their average earnings would be about 8% less than they currently are (exponentiated coefficients). How should I interpret this? As evidence that the effect of the predictors on wages is different for women than for men?
      I agree with your interpretation. I would interpret it as evidence that women need to have 8% better characteristics than men to earn the same wage.
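
      For reference, the percentages quoted in the question come from converting each log-point estimate with 100*(exp(b) - 1). A rough sketch, again with values typed in by hand from the "overall" rows of the non-exponentiated output in the first post:

      ```stata
      * explained: roughly -8.2 percent
      display 100 * (exp(-.0859414) - 1)

      * unexplained: roughly 17.9 percent
      display 100 * (exp(.1648435) - 1)

      * difference: roughly 8.2 percent
      display 100 * (exp(.0789021) - 1)
      ```

      These percentages are not expected to sum to the difference, because the conversion is nonlinear; only the underlying log-point estimates are additive.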
