  • Confusing Decomposition output for the explained component using oaxaca and mvdcmp

    Good day,

    I am running a decomposition analysis on individual-level data in Stata 15.1, using the following command.

    Code:
    oaxaca move normalize(b.age1624 age2534 age3544 age4554 age55over) normalize(b.male female) normalize(b.ausborn migrant ) normalize(b.married divorced nvmarried widowsep) normalize(b.degree nodegree ) normalize(b.ownerbuyer renters ) normalize(b.notinLF inLF ) normalize(b.noduoinc duoinc), by( svyear) pooled logit
    I have created dummies for all my variables. The summary statistics are fine and as expected, and so are the odds ratios. But the coefficients for the explained component are confusing, as you can see below. The effects are duplicated within each pair of binary variables, e.g. male and female, ausborn and migrant, degree and nodegree, ownerbuyer and renters, inLF and notinLF, and duoinc and noduoinc. This does not happen for variables with more than two categories, such as the age categories or married, divorced, nvmarried and widowsep.
    explained Coef. Std. Err. z P>|z| [95% Conf. Interval]
    age1624 -0.00011 1.55E-05 -7.14 0 -0.00014 -8E-05
    age2534 -0.00011 1.67E-05 -6.67 0 -0.00014 -7.9E-05
    age3544 -1.3E-05 1.31E-05 -1 0.317 -3.9E-05 1.26E-05
    age4554 -3.21E-07 5.38E-06 -0.06 0.952 -1.1E-05 1.02E-05
    age55over -0.00043 7.22E-05 -5.93 0 -0.00057 -0.00029
    male -4.49E-07 7.25E-07 -0.62 0.536 -1.87E-06 9.73E-07
    female -4.49E-07 7.25E-07 -0.62 0.536 -1.87E-06 9.73E-07
    ausborn -3.2E-05 9.34E-06 -3.44 0.001 -5E-05 -1.4E-05
    migrant -3.2E-05 9.34E-06 -3.44 0.001 -5E-05 -1.4E-05
    married -3.8E-05 2.21E-05 -1.69 0.091 -8.1E-05 5.92E-06
    divorced 1.27E-05 1.26E-05 1.01 0.314 -1.2E-05 3.74E-05
    nvmarried -4E-05 1.74E-05 -2.31 0.021 -7.5E-05 -6.15E-06
    widowsep 3.75E-06 6.95E-06 0.54 0.59 -9.88E-06 1.74E-05
    degree 0.000385 6.17E-05 6.23 0 0.000264 0.000506
    nodegree 0.000385 6.17E-05 6.23 0 0.000264 0.000506
    ownerbuyer 0.000175 0.000026 6.75 0 0.000124 0.000226
    renters 0.000175 0.000026 6.75 0 0.000124 0.000226
    notinLF 1.64E-05 5.91E-06 2.78 0.005 4.86E-06 0.000028
    inLF 1.64E-05 5.91E-06 2.78 0.005 4.86E-06 0.000028
    noduoinc -5.7E-05 9.51E-06 -5.98 0 -7.6E-05 -3.8E-05
    duoinc -5.7E-05 9.51E-06 -5.98 0 -7.6E-05 -3.8E-05

    This is not repeated in the unexplained component, as you can see:
    unexplained Coef. Std. Err. z P>|z| [95% Conf. Interval]
    age1624 -0.00052 0.000567 -0.92 0.356 -0.00164 0.000588
    age2534 -2.7E-05 0.000438 -0.06 0.952 -0.00088 0.000831
    age3544 -0.00016 0.000514 -0.31 0.76 -0.00116 0.00085
    age4554 -0.00075 0.000647 -1.16 0.248 -0.00202 0.00052
    age55over 0.002367 0.001627 1.46 0.146 -0.00082 0.005555
    male -0.00057 0.000667 -0.86 0.389 -0.00188 0.000734
    female 0.000609 0.000707 0.86 0.389 -0.00078 0.001995
    ausborn 6.73E-05 0.001059 0.06 0.949 -0.00201 0.002143
    migrant -1.9E-05 0.000422 -0.04 0.964 -0.00085 0.000808
    married -0.00069 0.001442 -0.48 0.633 -0.00351 0.002137
    divorced -1.42E-06 0.000283 -0.01 0.996 -0.00056 0.000553
    nvmarried -0.00031 0.000854 -0.36 0.72 -0.00198 0.001368
    widowsep 0.000175 0.000335 0.52 0.601 -0.00048 0.000832
    degree 0.000247 0.000263 0.94 0.348 -0.00027 0.000761
    nodegree -0.00143 0.001386 -1.03 0.304 -0.00414 0.001291
    ownerbuyer 0.00335 0.002312 1.45 0.147 -0.00118 0.007881
    renters -0.00131 0.000906 -1.45 0.147 -0.00309 0.000461
    notinLF 0.000959 0.000727 1.32 0.187 -0.00047 0.002383
    inLF -0.00184 0.00139 -1.33 0.185 -0.00457 0.000883
    noduoinc -0.00111 0.001449 -0.76 0.445 -0.00395 0.001733
    duoinc 0.000456 0.000595 0.77 0.443 -0.00071 0.001622



    I have also tried an alternative command, mvdcmp:
    Code:
    mvdcmp svyear, reverse normal(age1624 age2534 age3544 age4554 age55over| male female| ausborn migrant| divorced married nvmarried widowsep | non_student students| degree nodegree| ownerbuyer renters| inLF notinLF| noduoinc duoinc ): logit move age2534 age3544 age4554 age55over female migrant divorced nvmarried widowsep students degree renters inLF duoinc
    As you can see, I get similar results.
    inters Coef. Std. Err. z P>|z| [95% Conf. Interval] Pct.
    age1624 -0.00011 2.08E-05 -5.48 0 -0.00015 -7.3316E-05 4.279
    age2534 -0.00011 2.25E-05 -4.83 0 -0.00015 -6.4537E-05 4.07
    age3544 -9.52E-06 1.71E-05 -0.56 0.577 -4.3E-05 0.000023915 0.35701
    age4554 -3.64E-07 7.43E-08 -4.9 0 -5.10E-07 -2.19E-07 0.013652
    age55over -0.00038 8.75E-05 -4.33 0 -0.00055 -0.00020723 14.2
    male 3.08E-07 9.05E-07 0.34 0.733 -1.47E-06 2.08E-06 -0.01156
    female 3.08E-07 9.05E-07 0.34 0.733 -1.47E-06 2.08E-06 -0.01156
    ausborn -2.5E-05 1.18E-05 -2.14 0.032 -4.9E-05 -2.15E-06 0.95013
    migrant -2.5E-05 1.18E-05 -2.14 0.032 -4.9E-05 -2.15E-06 0.95013
    divorced 9.71E-06 1.63E-05 0.6 0.551 -2.2E-05 0.000041621 -0.36401
    married -2.3E-05 2.73E-05 -0.83 0.405 -7.6E-05 0.000030706 0.85185
    nvmarried -3.9E-05 2.26E-05 -1.71 0.086 -8.3E-05 5.55E-06 1.4551
    widowsep -4.74E-07 9.15E-06 -0.05 0.959 -1.8E-05 0.000017458 0.01776
    non_student -9.14E-06 2.69E-06 -3.4 0.001 -1.4E-05 -3.88E-06 0.3428
    students -9.14E-06 2.69E-06 -3.4 0.001 -1.4E-05 -3.88E-06 0.3428
    degree 0.000385 9.4E-05 4.09 0 0.0002 0.00056893 -14.423
    nodegree 0.000385 9.4E-05 4.09 0 0.0002 0.00056893 -14.423
    ownerbuyer 0.000152 2.9E-05 5.25 0 9.54E-05 0.00020929 -5.7122
    renters 0.000152 2.9E-05 5.25 0 9.54E-05 0.00020929 -5.7122
    inLF 1.62E-06 7.81E-06 0.21 0.835 -1.4E-05 0.000016934 -0.06089
    notinLF 1.62E-06 7.81E-06 0.21 0.835 -1.4E-05 0.000016934 -0.06089
    noduoinc -5E-05 1.08E-05 -4.6 0 -7.1E-05 -2.8506E-05 1.8609
    duoinc -5E-05 1.08E-05 -4.6 0 -7.1E-05 -2.8506E-05 1.8609
    I have tried splitting the variables into more categories, but then the model becomes insignificant.

    What could I possibly be doing wrong?

    Please help.
    Sunganani Kalemba
    PhD Student.
    Queensland

  • #2
    Hi Sunganani,
    I don't think you are doing anything wrong per se. I replicated the issue you are having with a very simple example:
    Code:
     use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
    gen notmarried=1-married
    
    .  oaxaca lnwage educ exper tenure normalize(married notmarried), by(female) w(0) first relax
    (normalized: married notmarried)
    
    Blinder-Oaxaca decomposition                    Number of obs     =      1,434
                                                      Model           =     linear
    Group 1: female = 0                               N of obs 1      =        751
    Group 2: female = 1                               N of obs 2      =        683
    
    ------------------------------------------------------------------------------
          lnwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    overall      |
         group_1 |   3.440222   .0174934   196.66   0.000     3.405936    3.474508
         group_2 |   3.266761   .0218655   149.40   0.000     3.223906    3.309617
      difference |   .1734607   .0280021     6.19   0.000     .1185776    .2283439
       explained |   .0771106   .0160709     4.80   0.000     .0456122    .1086089
     unexplained |   .0963502    .027173     3.55   0.000     .0430921    .1496083
    -------------+----------------------------------------------------------------
    explained    |
            educ |   .0514117   .0122975     4.18   0.000     .0273092    .0755143
           exper |   .0257804   .0088749     2.90   0.004     .0083859    .0431749
          tenure |   .0098732   .0086464     1.14   0.254    -.0070735    .0268198
         married |  -.0049773   .0025346    -1.96   0.050     -.009945   -9.70e-06
      notmarried |  -.0049773   .0025346    -1.96   0.050     -.009945   -9.70e-06
    -------------+----------------------------------------------------------------
    unexplained  |
            educ |  -.1471334   .1242443    -1.18   0.236    -.3906478    .0963811
           exper |  -.0775717   .0467549    -1.66   0.097    -.1692096    .0140661
          tenure |   .0305542   .0365443     0.84   0.403    -.0410712    .1021797
         married |   .0844044   .0136654     6.18   0.000     .0576208    .1111881
      notmarried |   -.077299   .0125709    -6.15   0.000    -.1019375   -.0526604
           _cons |   .2833955   .1334567     2.12   0.034     .0218253    .5449658
    ------------------------------------------------------------------------------
    I think the reason for the results is related to how exactly the normalization is done.
    Just to back up a bit: what you are facing here is the dummy variable trap. To identify the detailed effect of every category in the analysis, one needs some assumptions (constraints) on the estimated coefficients. Knowing these coefficient constraints will give you a clue as to why the results are puzzling.
    That means you may need to go back to Ben Jann's oaxaca paper, and the references therein, to figure out how exactly the normalization is done, if you want to explain why the results are as they are.
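    As a rough sketch of what such a constraint would imply, assume the normalization is a deviation-contrast transform, i.e. the coefficients of each dummy set (base category included) are re-centered so that they sum to zero. The notation is introduced here for illustration only: d_1 and d_2 = 1 - d_1 are the two dummies of a binary pair, A and B are the two groups, tildes mark normalized coefficients, a star marks the reference coefficients, E is the explained and U the unexplained contribution, and U is written for the simple case where it is evaluated at one group's means.

```latex
% Assumption: normalized coefficients of a dummy set sum to zero.
\begin{align*}
\tilde\beta_1^{*} + \tilde\beta_2^{*} &= 0, \qquad
  \bar d_2^{\,g} = 1 - \bar d_1^{\,g} \quad (g = A, B) \\
E_1 &= (\bar d_1^{A} - \bar d_1^{B})\,\tilde\beta_1^{*} \\
E_2 &= (\bar d_2^{A} - \bar d_2^{B})\,\tilde\beta_2^{*}
     = \bigl[-(\bar d_1^{A} - \bar d_1^{B})\bigr]\bigl[-\tilde\beta_1^{*}\bigr]
     = E_1 \\
U_1 &= \bar d_1\,(\tilde\beta_1^{A} - \tilde\beta_1^{B}) \\
U_2 &= (1 - \bar d_1)\,(\tilde\beta_2^{A} - \tilde\beta_2^{B})
     = -(1 - \bar d_1)\,(\tilde\beta_1^{A} - \tilde\beta_1^{B})
\end{align*}
```

    Under that assumption the two explained entries of a binary pair are identical by construction, while the unexplained entries have opposite signs and match in size only when the share is exactly one half. With three or more categories the constraints only require the normalized coefficients and the mean differences to each sum to zero, which does not force any two products to be equal.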
    If opening the black box of the methodology is beyond what you want to do, then simply report the aggregate values for the explained component. After all, you cannot say how the outcome will change when (in the example above) the proportion of married increases while the proportion of not married stays constant, because the two variables depend on each other.
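    A small numeric check of why the two halves of a binary pair cannot be separated, written in Python rather than Stata so it runs without the data. The shares and coefficients are made up for illustration, and `normalize`, `explained` and `unexplained` are hypothetical helpers that assume a sum-to-zero (deviation-contrast) normalization; they are not part of oaxaca or mvdcmp.

```python
def normalize(coefs):
    """Re-center a full set of dummy coefficients (base category entered as 0.0)
    so that they sum to zero. This is the assumed form of the normalization."""
    center = sum(coefs) / len(coefs)
    return [b - center for b in coefs]


def explained(means_a, means_b, ref_coefs):
    """Per-category explained part: (difference in shares) * (normalized coefficient)."""
    return [(a - b) * c for a, b, c in zip(means_a, means_b, normalize(ref_coefs))]


def unexplained(means, coefs_a, coefs_b):
    """Per-category unexplained part, evaluated at one group's shares."""
    norm_a, norm_b = normalize(coefs_a), normalize(coefs_b)
    return [m * (ca - cb) for m, ca, cb in zip(means, norm_a, norm_b)]


# Made-up binary pair (married, notmarried): shares within each group sum to 1.
pair_expl = explained([0.60, 0.40], [0.45, 0.55], [0.10, 0.0])
pair_unexpl = unexplained([0.60, 0.40], [0.10, 0.0], [0.04, 0.0])

# Made-up three-category set: shares still sum to 1 within each group.
triple_expl = explained([0.50, 0.30, 0.20], [0.40, 0.35, 0.25], [0.0, 0.06, -0.03])

print("binary explained:  ", [round(v, 6) for v in pair_expl])
print("binary unexplained:", [round(v, 6) for v in pair_unexpl])
print("triple explained:  ", [round(v, 6) for v in triple_expl])
```

    In this toy setup the two binary explained entries come out equal, the binary unexplained entries take opposite signs with different magnitudes, and the three-category entries all differ, which is the same pattern as in the outputs above.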

    Hope this helps.
    Fernando


    • #3
      Thank you Fernando for the very important insight.
      Sunganani Kalemba
      PhD Student.
      Queensland
