Oaxaca Blinder Decomposition

Theresia Verena

Join Date: Sep 2021

Posts: 27
#1

11 Nov 2021, 23:31

Hi, I want to use the Oaxaca-Blinder decomposition and Heckit for my final project.

Both are used to analyze the wage gap between people with different types of disability and people without a disability. There are three types of disability: blind, deaf, and speech impaired. Each is a dummy variable. For example:
deaf = 1 if the individual has a hearing problem, 0 otherwise.
The non-disabled group consists of individuals who have no hearing, vision, or speech problems.

Code:

* 1 if no hearing, vision, or speech problem; 0 otherwise (rather than missing)
gen byte nondisbld = (deaf == 0 & blind == 0 & speech_impaired == 0)

How does the command handle, for example, the difference in wages between the deaf and non-disabled groups?

This is my -dataex- output:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str2 id byte(blind deaf speech_impaired yeduc krt) float(lwage_m1 nondisbld)
"1" 1 0 0 9 0 13.458836 0
"2" 0 0 1 0 0 14.220976 0
"3" 0 0 0 9 0 14.220976 1
"4" 0 1 0 9 1 14.220976 0
"5" 0 0 0 16 1 14.220976 1
"6" 1 0 0 13 1 13.304685 0
"7" 0 0 0 11 1 14.7318 1
"8" 0 0 0 2 1 13.81551 1
"9" 0 0 0 6 0 . 1
"10" 0 0 1 4 0 . 0
end

Variable descriptions:
id: resident ID
lwage_m1: log monthly wage
krt: a variable that appears only in the selection equation, not in the wage equation.
Thank you in advance!
Fei Wang

Join Date: Oct 2021

Posts: 726
#2

12 Nov 2021, 00:27

Theresia, I would do the following.

Step one: If I don't have appropriate variables for the equation of selection into the labor market, then I would go directly to Step two. If I have these variables for both the deaf and the non-disabled, then I would try -heckman- on the two subsamples separately. After -heckman-, I can generate inverse Mills ratios (IMRs) for each subsample and store the IMRs from both subsamples in a single variable.

Step two: Use -oaxaca-, a user-written command, to do a standard decomposition. If I did Step one, then the IMR variable would be an additional regressor entering the decomposition.
Theresia Verena

Join Date: Sep 2021

Posts: 27
#3

12 Nov 2021, 00:40

In the dummy deaf variable, the value will be equal to 0 if the individual has no hearing problems, so it may be 0 even if the individual has other disabilities. How does the Oaxaca-Blinder decomposition command compare the wages of those who actually have no vision, hearing, or speech problems with people who have vision problems? I'm a little confused because I used 2 dummy variables for 1 Oaxaca decomposition.
Fei Wang

Join Date: Oct 2021

Posts: 726
#4

12 Nov 2021, 00:47

Do the decomposition only with samples of deaf and non-disabled -- Add "if deaf | nondisbld" to the oaxaca code. When you switch to other pairs, use different subsamples.
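A minimal sketch of how the "different subsamples" idea could be looped over all three pairs, assuming the variable names from the -dataex- example in #1 and that -oaxaca- is installed (ssc install oaxaca). The regressor list here is only yeduc for illustration, and no selection correction is included. Writing the conditions as == 1 is a deliberate choice: in Stata a bare "if x" is also true when x is missing.

```stata
* Sketch only: compare each disability group with the non-disabled group.
* Assumes deaf, blind, speech_impaired, and nondisbld are 0/1 dummies.
foreach g of varlist deaf blind speech_impaired {
    display as text _newline "Decomposition: `g' vs. non-disabled"
    * Keep only group `g' and the non-disabled; within this subsample,
    * `g' == 0 identifies the non-disabled (group 1) and `g' == 1 group 2.
    oaxaca lwage_m1 yeduc if `g' == 1 | nondisbld == 1, by(`g')
}
```

Within each restricted subsample the by() variable separates exactly two groups, so people with one of the other disabilities never enter the comparison.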
Theresia Verena

Join Date: Sep 2021

Posts: 27
#5

12 Nov 2021, 01:34

So, if I decompose only for deaf and nondisabled, the command will look like this?

Code:

oaxaca lwage_m1 yeduc if deaf | nondisbld, by(deaf) adjust (mills)

Sorry if I'm wrong.
Fei Wang

Join Date: Oct 2021

Posts: 726
#6

12 Nov 2021, 02:32

I didn't realize that -oaxaca- itself can do the heckman selection. Then you may not need to do Step one in #2, and directly add the exclusive variables for selection in -adjust()-. Other than that, I would use -pooled- option and -vce(robust)-.

Code:

oaxaca lwage_m1 yeduc other_variables if deaf | nondisbld, by(deaf) adjust(selection_variables) pooled vce(robust)

The "selection_variables" are a list of variables affecting individuals' entry to labor markets but not directly influencing wage, and therefore should not be included in "other_variables".
Theresia Verena

Join Date: Sep 2021

Posts: 27
#7

12 Nov 2021, 21:29

Thank you very much, Fei Wang. You really helped me. May you always be healthy.
Theresia Verena

Join Date: Sep 2021

Posts: 27
#8

14 Nov 2021, 01:01

Hello Fei Wang, I'm sorry to bother you again regarding this. I've tried to use this command in my data,

Code:

oaxaca lwage_m1 yeduc other_variables if deaf| nondisbld, by(deaf) adjust(selection_variables) pooled vce(robust)

but it does not work.
So I tried this command instead:

Code:

oaxaca lwage_m1 yeduc other_variables if deaf| nondisbld, by(deaf) model1(heckman, twostep select (selection_variables)) model2 (heckman, twostep select (selection_variables)) weight (0) noisily relax

Many of the results are not significant, in particular the difference and the unexplained component. Is there anything I can do to test or resolve this issue? Thank you in advance!
Theresia Verena

Join Date: Sep 2021

Posts: 27
#9

14 Nov 2021, 01:50

Hello Fei Wang, I'm sorry to bother you again regarding this. I've tried to use this command in my data,

Code:

oaxaca lwage_m1 yeduc other_variables if deaf| nondisbld, by(deaf) adjust(selection_variables) pooled vce(robust)

but it does not work.
So I tried this command instead:

Code:

oaxaca lwage_m1 yeduc other_variables if deaf| nondisbld, by(deaf) model1(heckman, twostep select (selection_variables)) model2 (heckman, twostep select (selection_variables)) weight (0) noisily relax

Many of the results are not significant, in particular the difference and the unexplained component. Is there anything I can do to test or resolve this issue? Thank you in advance!

Fei Wang

Join Date: Oct 2021
Posts: 726

#10

14 Nov 2021, 03:19

Theresia, sorry for misleading you in #6 regarding the -adjust()- option. Let me correct the model specification using the example below. There are two similar ways of doing a decomposition with selection adjustment. Case 1 uses -model1()- and -model2()-, and Case 2 uses -adjust()- after manually running -heckman- for both groups. The coefficients are identical, but the standard errors in Case 1 are generally larger than those in Case 2. I guess one reason is that Case 2 does not adjust the standard errors for the fact that the inverse Mills ratios are generated regressors. Therefore Case 1, which you may not prefer because of its weaker significance, should be the correct way. Back to #9: I see no problem with the last command in your post, so what you get would be the final results, even though they are not very significant.

Code:

. * Load data
.         use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
(Excerpt from the Swiss Labor Market Survey 1998)

.  
. * Case 1: Oaxaca decomposition with heckman selection adjustment for both groups
.         oaxaca lnwage educ exper tenure, by(female) model1(heckman, select(age agesq) 
> twostep) model2(heckman, select(age agesq) twostep)

Blinder-Oaxaca decomposition                    Number of obs     =      1,434
                                                  Model           =     linear
Group 1: female = 0                               N of obs 1      =        751
Group 2: female = 1                               N of obs 2      =        683

------------------------------------------------------------------------------
      lnwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
overall      |
     group_1 |   3.443838   .0340346   101.19   0.000     3.377131    3.510544
     group_2 |     2.6307   .4344958     6.05   0.000     1.779104    3.482296
  difference |   .8131382   .4358267     1.87   0.062    -.0410664    1.667343
  endowments |   .0586748   .0281703     2.08   0.037      .003462    .1138876
coefficients |   .7217082   .4354452     1.66   0.097    -.1317487    1.575165
 interaction |   .0327552   .0277066     1.18   0.237    -.0215486    .0870591
-------------+----------------------------------------------------------------
endowments   |
        educ |   .0480125   .0158107     3.04   0.002     .0170242    .0790009
       exper |   .0094518   .0158033     0.60   0.550    -.0215221    .0404256
      tenure |   .0012105    .019705     0.06   0.951    -.0374107    .0398317
-------------+----------------------------------------------------------------
coefficients |
        educ |  -.0035644   .2387204    -0.01   0.988    -.4714479     .464319
       exper |   .0612828   .1000946     0.61   0.540    -.1348991    .2574647
      tenure |   .0637608   .0563644     1.13   0.258    -.0467114     .174233
       _cons |   .6002291   .4361519     1.38   0.169    -.2546129    1.455071
-------------+----------------------------------------------------------------
interaction  |
        educ |  -.0001848   .0123735    -0.01   0.988    -.0244363    .0240668
       exper |   .0097907   .0162015     0.60   0.546    -.0219637    .0415451
      tenure |   .0231492   .0208403     1.11   0.267     -.017697    .0639955
------------------------------------------------------------------------------

. 
. * Case 2: Manually generate inverse mills ratios for both groups and adjust them in oa
> xaca decomposition
.         quietly heckman lnwage educ exper tenure if female, select(age agesq) twostep 
> mills(imr)

.         quietly heckman lnwage educ exper tenure if !female, select(age agesq) twostep
>  mills(imr2)

.         quietly replace imr = imr2 if !female

.         oaxaca lnwage educ exper tenure imr, by(female) adjust(imr)

Blinder-Oaxaca decomposition                    Number of obs     =      1,434
                                                  Model           =     linear
Group 1: female = 0                               N of obs 1      =        751
Group 2: female = 1                               N of obs 2      =        683

------------------------------------------------------------------------------
      lnwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
overall      |
     group_1 |   3.440222   .0174958   196.63   0.000     3.405931    3.474513
     group_2 |   3.266761   .0218647   149.41   0.000     3.223907    3.309615
  difference |   .1734607    .028003     6.19   0.000     .1185759    .2283456
-------------+----------------------------------------------------------------
adjusted     |
     group_1 |   3.443838   .0341005   100.99   0.000     3.377002    3.510674
     group_2 |     2.6307   .1634445    16.10   0.000     2.310354    2.951045
  difference |   .8131382   .1669639     4.87   0.000     .4858949    1.140382
  endowments |   .0586748   .0154532     3.80   0.000     .0283872    .0889624
coefficients |   .7217082    .165961     4.35   0.000     .3964307    1.046986
 interaction |   .0327552   .0146017     2.24   0.025     .0041364     .061374
-------------+----------------------------------------------------------------
endowments   |
        educ |   .0480125    .011635     4.13   0.000     .0252084    .0708167
       exper |   .0094518   .0073361     1.29   0.198    -.0049267    .0238303
      tenure |   .0012105    .008615     0.14   0.888    -.0156746    .0180956
-------------+----------------------------------------------------------------
coefficients |
        educ |  -.0035644   .1198293    -0.03   0.976    -.2384255    .2312966
       exper |   .0612828   .0484443     1.27   0.206    -.0336663    .1562319
      tenure |   .0637608    .028222     2.26   0.024     .0084468    .1190749
       _cons |   .6002291   .1767357     3.40   0.001     .2538335    .9466246
-------------+----------------------------------------------------------------
interaction  |
        educ |  -.0001848   .0062112    -0.03   0.976    -.0123584    .0119889
       exper |   .0097907   .0081649     1.20   0.230    -.0062122    .0257937
      tenure |   .0231492   .0109789     2.11   0.035      .001631    .0446675
------------------------------------------------------------------------------
(adjusted by imr)
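For reading the three blocks in the output above, here is a sketch of the threefold decomposition that -oaxaca- reports by default, written from the viewpoint of group 2 as I understand the -oaxaca- documentation. The notation is an assumption introduced here, not taken from the thread: \bar{X}_g is the vector of regressor means (including the constant) and \hat{\beta}_g the vector of wage-equation coefficients for group g.

```latex
\begin{aligned}
R &= \bar{Y}_1 - \bar{Y}_2 = E + C + I, \\
E &= (\bar{X}_1 - \bar{X}_2)'\hat{\beta}_2
   && \text{(endowments)}, \\
C &= \bar{X}_2'(\hat{\beta}_1 - \hat{\beta}_2)
   && \text{(coefficients)}, \\
I &= (\bar{X}_1 - \bar{X}_2)'(\hat{\beta}_1 - \hat{\beta}_2)
   && \text{(interaction)}.
\end{aligned}
```

The overall rows of both tables are consistent with this identity: .0586748 + .7217082 + .0327552 = .8131382, which is the reported (selection-adjusted) difference.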


Theresia Verena

Join Date: Sep 2021
Posts: 27

#11

14 Nov 2021, 03:44

Is it okay if the results for the difference and the unexplained component are not significant?
