
  • How are p-values calculated in an Oaxaca-Blinder Decomposition

    Hi all,

    I am trying to interpret the results of an Oaxaca-Blinder decomposition. I am using the popular oaxaca command. For my project I am only interested in a two-fold decomposition (i.e. I don't want the interaction terms), so I am using the pooled option. The results are based on a logistic regression. Here is my output:

    Code:
    svyset [pweight = weight], str(stratum_var) psu(cluster_var)
    
    tab race, gen(race)
    tab educ, gen(educ)
    tab hhinc, gen(hhinc)
    tab age_g5, gen(age_g5)
    
    oaxaca cohab_par age_g52 age_g53 age_g54 age_g55 race2 race3 race4 imm ///
             educ2 educ3 educ4 hhinc2 hhinc3 hhinc4, ///
             by(rural) svy logit pooled
    
    Blinder-Oaxaca decomposition
    
    Number of strata = 18                             Number of obs   =      3,268
    Number of PSUs   = 72                             Population size = 38,244,320
    Design df       =         54
    Model              =     logit
    Group 1: rural = 0                              N of obs 1         =      2116
    Group 2: rural = 1                              N of obs 2         =       468
    
    
                               Linearized
       cohab_par  Coefficient   std. err.       t     P>t     [95% conf. interval]

    overall
         group_1     .1405385    .0141326    9.94   0.000     .1122043    .1688727
         group_2     .2199063    .0242979    9.05   0.000     .1711919    .2686207
      difference    -.0793678    .0276199   -2.87   0.006    -.1347424   -.0239933
       explained    -.0197445    .0107814   -1.83   0.073    -.0413599    .0018708
     unexplained    -.0596233    .0273862   -2.18   0.034    -.1145294   -.0047172

    explained
         age_g52    -.0005991       .0012   -0.50   0.620    -.0030048    .0018067
         age_g53        -.001     .001295   -0.77   0.443    -.0035962    .0015963
         age_g54     .0000969    .0009533    0.10   0.919    -.0018143    .0020081
         age_g55    -.0004468    .0010476   -0.43   0.671    -.0025472    .0016535
           race2     -.001195    .0013982   -0.85   0.396    -.0039982    .0016081
           race3     .0009479      .00266    0.36   0.723    -.0043851    .0062809
           race4     .0005973    .0013599    0.44   0.662     -.002129    .0033237
             imm     .0003432    .0019318    0.18   0.860    -.0035298    .0042161
           educ2    -.0002301    .0009301   -0.25   0.806    -.0020949    .0016347
           educ3     .0008762    .0017989    0.49   0.628    -.0027304    .0044828
           educ4     -.012033     .005119   -2.35   0.022     -.022296     -.00177
          hhinc2    -.0003015    .0006636   -0.45   0.651    -.0016318    .0010289
          hhinc3    -.0001882    .0006291   -0.30   0.766    -.0014493     .001073
          hhinc4    -.0066125    .0039993   -1.65   0.104    -.0146306    .0014057

    unexplained
         age_g52     -.028689    .0165345   -1.74   0.088    -.0618386    .0044607
         age_g53    -.0297615    .0290941   -1.02   0.311    -.0880916    .0285686
         age_g54    -.0556364    .0411148   -1.35   0.182    -.1380666    .0267938
         age_g55    -.0026016    .0261534   -0.10   0.921     -.055036    .0498328
           race2     .0048568    .0077741    0.62   0.535    -.0107293    .0204428
           race3     .0176802    .0161501    1.09   0.278    -.0146989    .0500593
           race4     .0041202    .0058635    0.70   0.485    -.0076355    .0158759
             imm    -.0093887     .012094   -0.78   0.441    -.0336357    .0148583
           educ2     .0047812    .0141557    0.34   0.737    -.0235994    .0331617
           educ3    -.0291882     .026426   -1.10   0.274    -.0821691    .0237926
           educ4    -.0366389    .0235373   -1.56   0.125    -.0838284    .0105505
          hhinc2     .0017072    .0118416    0.14   0.886    -.0220339    .0254482
          hhinc3     .0011674    .0093851    0.12   0.901    -.0176486    .0199834
          hhinc4    -.0020982    .0118124   -0.18   0.860    -.0257806    .0215842
           _cons     .1000664     .098881    1.01   0.316    -.0981782    .2983109

    I am seeking help because these results show an overall significant difference between the two groups (urban and rural in this case), with roughly 24.9% of the difference coming from differences in composition (the explained part; p = 0.073, not statistically significant) and 75.1% coming from differences in coefficients (the unexplained part; p = 0.034, statistically significant). On the surface this makes sense, but when you look at the individual variables within the explained and unexplained portions, the findings don't line up.

    Specifically, the only significant variable is in the explained portion (educ4, p = 0.022), and no individual variable in the unexplained portion is significant, even though the unexplained component is significant overall.

    Am I misinterpreting the results? Or, on a more technical level, how are standard errors and p-values calculated within an Oaxaca-Blinder Decomposition? Does the calculation differ when trying to estimate the significance of the overall explained/unexplained components than when trying to calculate the effects of individual variables?

    Please let me know if I can clarify anything.

  • #2
    May I ask how you got 24.9% and 75.1%?



    • #3
      You are decomposing the difference: part of it is explained and the remainder is unexplained. Each share is therefore the ratio of the explained (or unexplained) coefficient to the difference coefficient.

      Code:
      di (-.0197445/-.0793678)*100
      di (-.0596233/-.0793678)*100
      Result:

      Code:
      . di (-.0197445/-.0793678)*100
      24.877217
      
      . 
      . di (-.0596233/-.0793678)*100
      75.122783
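
      The same kind of hand check can be applied to the detailed rows. As a rough sketch (the numbers are simply typed in from the explained block of the output in #1, and sum_expl is a scalar name I made up for illustration), adding the 14 per-variable contributions should reproduce the overall explained coefficient up to display rounding:

      Code:
      * Sum the detailed "explained" contributions copied from the output in #1
      * (sum_expl is a hypothetical scalar name used only for this check)
      scalar sum_expl = -.0005991 - .001 + .0000969 - .0004468 - .001195
      scalar sum_expl = sum_expl + .0009479 + .0005973 + .0003432 - .0002301 + .0008762
      scalar sum_expl = sum_expl - .012033 - .0003015 - .0001882 - .0066125
      di sum_expl
      * Gap to the reported overall explained coefficient; should be rounding error only
      di sum_expl - (-.0197445)

      Because each overall component is a sum of the detailed terms, its variance is the sum of the individual variances plus twice all the pairwise covariances, so its standard error cannot be read off the individual rows. That is one way an aggregate can be significant while none of its parts is, or the reverse.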
