Hi all,
I'm using the oaxaca command to decompose wage differentials. I'm looking for some clarification on using normalize() for a 0/1 dummy variable. Consider the following example:
Code:
clear all
set more off
use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta
tabulate single, gen(single) nofreq
oaxaca lnwage normalize(single1 single2) educ exper tenure, by(female)
Here's the output:
Blinder-Oaxaca decomposition                    Number of obs     =      1,434
                                                  Model           =     linear
Group 1: female = 0                               N of obs 1      =        751
Group 2: female = 1                               N of obs 2      =        683
------------------------------------------------------------------------------
      lnwage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
overall      |
     group_1 |   3.440222   .0174928   196.66   0.000     3.405937    3.474507
     group_2 |   3.266761   .0218657   149.40   0.000     3.223905    3.309617
  difference |   .1734607    .028002     6.19   0.000     .1185779    .2283436
  endowments |   .0833608    .015955     5.22   0.000     .0520897     .114632
coefficients |   .1041363   .0256305     4.06   0.000     .0539015    .1543711
 interaction |  -.0140364   .0125167    -1.12   0.262    -.0385688    .0104959
-------------+----------------------------------------------------------------
endowments   |
     single1 |  -.0001387   .0007305    -0.19   0.849    -.0015705    .0012932
     single2 |  -.0001387   .0007305    -0.19   0.849    -.0015705    .0012932
        educ |   .0514044   .0123024     4.18   0.000     .0272921    .0755167
       exper |   .0243096   .0086225     2.82   0.005     .0074099    .0412093
      tenure |   .0079242   .0086147     0.92   0.358    -.0089603    .0248086
-------------+----------------------------------------------------------------
coefficients |
     single1 |   .0698379   .0167273     4.18   0.000      .037053    .1026227
     single2 |  -.0440028   .0106675    -4.12   0.000    -.0649108   -.0230949
        educ |   -.154297   .1180917    -1.31   0.191    -.3857526    .0771585
       exper |  -.0838708   .0409529    -2.05   0.041     -.164137   -.0036046
      tenure |   .0233765   .0268519     0.87   0.384    -.0292522    .0760053
       _cons |   .2930927   .1331797     2.20   0.028     .0320653      .55412
-------------+----------------------------------------------------------------
interaction  |
     single1 |  -.0005633   .0029394    -0.19   0.848    -.0063245    .0051979
     single2 |  -.0005633   .0029394    -0.19   0.848    -.0063245    .0051979
        educ |  -.0079976   .0063641    -1.26   0.209    -.0204711    .0044759
       exper |  -.0133994   .0074483    -1.80   0.072    -.0279979     .001199
      tenure |   .0084872   .0098556     0.86   0.389    -.0108294    .0278037
My question concerns the dummy variables for single (single1 = 1 if not single, single2 = 1 if single).
The normalization, as far as I understand, runs the following transformed model for females (f) and males (m):
ln w_f = beta0_f + beta1_f single1_f + beta2_f single2_f + other regressors + eps
where the regression is restricted so that beta1_f + beta2_f = 0
ln w_m = beta0_m + beta1_m single1_m + beta2_m single2_m + other regressors + eps
where the regression is restricted so that beta1_m + beta2_m = 0
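To see what the restriction does, the two restricted regressions can be fitted by hand. This is only a minimal sketch, assuming the restriction is imposed exactly as written above; I have not checked that it reproduces what oaxaca does internally. The collinear option is included because single1 and single2 sum to one, so one of them may otherwise be dropped as collinear with the constant.
Code:
* sum-to-zero restriction on the two single dummies
constraint 1 single1 + single2 = 0
* restricted wage regression for women, then for men
cnsreg lnwage single1 single2 educ exper tenure if female == 1, constraints(1) collinear
cnsreg lnwage single1 single2 educ exper tenure if female == 0, constraints(1) collinear
In each fit, the coefficients on single1 and single2 should come out equal in magnitude with opposite signs.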
This is done so that the coefficient effect doesn't arbitrarily depend on the base category. Let S_f and S_m denote the share of female and male individuals who are single, respectively, so that the group mean of single2 is S and the group mean of single1 is 1 - S. I believe the coefficient effect for being single is calculated as
(1 - S_f)(beta1_m - beta1_f) + S_f(beta2_m - beta2_f)
and the results state that
(1 - S_f)(beta1_m - beta1_f) = .0698379
S_f(beta2_m - beta2_f) = -.0440028
How do I interpret this? Should I add them together to get the total effect? Is there something meaningful being captured in these separate estimates?
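A quick check using only the numbers in the output above: the two single rows, together with the other rows of the coefficients block, add up to the overall coefficients line (up to rounding), so the rows appear to be additive. The third display is a back-of-envelope sketch under two assumptions of mine: that the single1 row is weighted by the female mean of single1 (1 - S_f) and the single2 row by S_f, and that beta2 = -beta1 in both groups. Under those assumptions the ratio recovers S_f, which can be compared with the share of single women in the data.
Code:
* combined coefficient component for the single set
display .0698379 - .0440028
* all rows of the coefficients block, including _cons
display .0698379 - .0440028 - .154297 - .0838708 + .0233765 + .2930927
* implied S_f under the assumptions stated above
display .0440028 / (.0698379 + .0440028)
* share of single women among complete cases, for comparison
summarize single2 if female == 1 & !missing(lnwage, educ, exper, tenure)
The first display gives .0258351, and the second comes to about .1041, in line with the reported .1041363.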
I'm also not clear on what's being calculated for the endowment effect.
Given the restriction beta1_f + beta2_f = 0, I know that (S_m - S_f) beta1_f = -(S_m - S_f) beta2_f, so clearly it can't be calculating (S_m - S_f) beta1_f + (S_m - S_f) beta2_f, as this would always equal zero. But I am getting two separate (and identical) estimates for singleness. What are they? Should I also add these endowment effects together to get a total estimate of the change in women's outcome if they had the same distribution of singleness as men? (This example is for illustrative purposes, so ignore the insignificance.)
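One possible reading, offered as a sketch rather than a statement of what oaxaca does internally: if each endowment row is (difference in group means of that dummy) times (female coefficient), then the single1 row is ((1 - S_m) - (1 - S_f)) beta1_f = -(S_m - S_f) beta1_f and the single2 row is (S_m - S_f) beta2_f = -(S_m - S_f) beta1_f. That would make the two rows identical rather than offsetting. The arithmetic below uses only numbers from the output above. The detail() grouping in the last command is how I read the help file (name: varlist), and the group name single is my own label.
Code:
* both single rows plus educ, exper and tenure
display -.0001387 - .0001387 + .0514044 + .0243096 + .0079242
* combined endowment contribution of the single set
display -.0001387 - .0001387
* report the single set as one row with its own standard error
oaxaca lnwage normalize(single1 single2) educ exper tenure, by(female) detail(single: single1 single2)
The first display reproduces the overall endowments line (.0833608), which suggests that the two single rows are meant to be summed.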
Maybe this is not how I should be handling dummies?
For my reference I'm using this 2008 document by Ben Jann: https://journals.sagepub.com/doi/pdf...867X0800800401 (as well as the help file for oaxaca)
It's mostly written without the normalization, and the normalization is discussed in a subsection near the end. Thanks in advance!
