  • oaxaca: normalized dummy variables

    Hi all,

    I'm using the oaxaca command to decompose wage differentials. I'm looking for some clarification on using normalize() for a 0/1 dummy variable. Consider the following example:

    Code:
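    clear all
    set more off

    * Example data distributed with the oaxaca package
    use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta

    * Create indicators single1 (not single) and single2 (single)
    tabulate single, gen(single) nofreq

    * Decomposition with the two single dummies normalized
    oaxaca lnwage normalize(single1 single2) educ exper tenure, by(female)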
    Here's the output:


    Blinder-Oaxaca decomposition                    Number of obs     =      1,434
                                                      Model           =     linear
    Group 1: female = 0                               N of obs 1      =        751
    Group 2: female = 1                               N of obs 2      =        683

    ------------------------------------------------------------------------------
          lnwage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    overall      |
         group_1 |   3.440222   .0174928   196.66   0.000     3.405937    3.474507
         group_2 |   3.266761   .0218657   149.40   0.000     3.223905    3.309617
      difference |   .1734607    .028002     6.19   0.000     .1185779    .2283436
      endowments |   .0833608    .015955     5.22   0.000     .0520897     .114632
    coefficients |   .1041363   .0256305     4.06   0.000     .0539015    .1543711
     interaction |  -.0140364   .0125167    -1.12   0.262    -.0385688    .0104959
    -------------+----------------------------------------------------------------
    endowments   |
         single1 |  -.0001387   .0007305    -0.19   0.849    -.0015705    .0012932
         single2 |  -.0001387   .0007305    -0.19   0.849    -.0015705    .0012932
            educ |   .0514044   .0123024     4.18   0.000     .0272921    .0755167
           exper |   .0243096   .0086225     2.82   0.005     .0074099    .0412093
          tenure |   .0079242   .0086147     0.92   0.358    -.0089603    .0248086
    -------------+----------------------------------------------------------------
    coefficients |
         single1 |   .0698379   .0167273     4.18   0.000      .037053    .1026227
         single2 |  -.0440028   .0106675    -4.12   0.000    -.0649108   -.0230949
            educ |   -.154297   .1180917    -1.31   0.191    -.3857526    .0771585
           exper |  -.0838708   .0409529    -2.05   0.041     -.164137   -.0036046
          tenure |   .0233765   .0268519     0.87   0.384    -.0292522    .0760053
           _cons |   .2930927   .1331797     2.20   0.028     .0320653      .55412
    -------------+----------------------------------------------------------------
    interaction  |
         single1 |  -.0005633   .0029394    -0.19   0.848    -.0063245    .0051979
         single2 |  -.0005633   .0029394    -0.19   0.848    -.0063245    .0051979
            educ |  -.0079976   .0063641    -1.26   0.209    -.0204711    .0044759
           exper |  -.0133994   .0074483    -1.80   0.072    -.0279979     .001199
          tenure |   .0084872   .0098556     0.86   0.389    -.0108294    .0278037
    ------------------------------------------------------------------------------



    My question concerns the dummy variable for single (single1 -> not single, single2 -> single).

    The normalization, as far as I understand, runs the following transformed model for females (f) and males (m):

    ln w_f = beta0_f + beta1_f single1_f + beta2_f single2_f + other regressors + eps
    where the regression is restricted so that beta1_f + beta2_f = 0

    ln w_m = beta0_m + beta1_m single1_m + beta2_m single2_m + other regressors + eps
    where the regression is restricted so that beta1_m + beta2_m = 0
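
    One way to check this reading is to impose the sum-to-zero restriction by hand with Stata's built-in cnsreg. This is only a sketch, under the assumption that normalize() is equivalent to this restricted regression; oaxaca itself may obtain the normalized coefficients by transforming an ordinary regression rather than by constrained estimation. The collinear option keeps both dummies in the model alongside the constant.

    Code:
    * Sketch: by-hand restricted group regressions (assumed equivalent to normalize())
    use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
    tabulate single, gen(single) nofreq

    * Restriction: the two single coefficients sum to zero
    constraint define 1 single1 + single2 = 0

    * Women (female == 1); collinear keeps both dummies plus the constant
    cnsreg lnwage single1 single2 educ exper tenure if female == 1, constraints(1) collinear

    * Men (female == 0)
    cnsreg lnwage single1 single2 educ exper tenure if female == 0, constraints(1) collinear

    If the assumption holds, the single1 and single2 coefficients in each regression are equal in magnitude and opposite in sign.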

    This is done so that the coefficient effect doesn't depend on the arbitrary choice of base category. Let S_f and S_m denote the shares of women and men who are single, respectively. I believe the coefficient effect for being single is calculated as

    S_f(beta1_m - beta1_f) + S_f(beta2_m - beta2_f)

    and the results state that
    S_f(beta1_m - beta1_f) = .0698379
    S_f(beta2_m - beta2_f) = -.0440028

    How do I interpret this? Should I add them together to get the total effect? Is there something meaningful being captured in these separate estimates?
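
    A sketch of how the two pieces could be combined with a standard error, run directly after the oaxaca call. It assumes oaxaca stores its results under the equation names shown in the output table (coefficients, endowments, interaction); that is inferred from the table labels, not checked against the stored results. From the point estimates alone the sum is .0698379 + (-.0440028) = .0258351. Whether that sum is the quantity of interest is the open question here.

    Code:
    * Run immediately after the oaxaca command (its e() results must still be in memory)
    * Equation name "coefficients" is assumed from the output labels
    lincom [coefficients]single1 + [coefficients]single2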


    I'm also not clear on what's being calculated for the endowment effect.
    By definition, (S_m - S_f) beta1_f = -(S_m - S_f) beta2_f, so it clearly can't be calculating (S_m - S_f) beta1_f + (S_m - S_f) beta2_f, as this would always equal zero. But I am getting two separate (and identical) estimates for singleness. What are they? Should I also add these endowment effects together to get a total estimate of the change in women's outcome if they had the same distribution of singleness as men? (This example is for illustrative purposes, so ignore the insignificance.)
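
    To see what enters the endowment terms, the group means of both dummies can be inspected directly. A sketch: single1 codes not single, so its mean is one minus the single share, and the male-female gap in single1 is the negative of the gap in single2. The assumption here, taken from the threefold decomposition in Jann (2008), is that each dummy's endowment term is its own mean gap times its female coefficient. The equation name endowments is again assumed from the output labels.

    Code:
    * Group means of the two dummies on the complete-case sample, by sex
    tabstat single1 single2 if !missing(lnwage, educ, exper, tenure), by(female) statistics(mean)

    * Sum of the two endowment terms; needs the oaxaca results in memory
    lincom [endowments]single1 + [endowments]single2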


    Maybe this is not how I should be handling dummies?

    For reference, I'm using the 2008 Stata Journal article by Ben Jann: https://journals.sagepub.com/doi/pdf...867X0800800401 (as well as the help file for oaxaca).

    It's mostly written without the normalization, and the normalization is discussed in a subsection near the end. Thanks in advance!