  • Obtaining pooled OLS estimates

    I'm trying to compare Oaxaca-Blinder decomposition results with OLS estimates of the male wage gap between the work-limited disabled (DISTYPE == 1) and the non-disabled (DISTYPE == 4). Note that DISTYPE is a categorical variable, not a dummy.

    I want pooled, quarter 1 and quarter 5 OLS estimates for both DISTYPE == 1 and DISTYPE == 4. My dataset is longitudinal and covers five quarters, with a numeric suffix on each variable name indicating the quarter (e.g. a name ending in 5 is a quarter 5 variable), so I had to 'reshape long'.

    I have run the quarter 1 and quarter 5 regressions (shown below for DISTYPE == 1 only):
    Code:
    * Quarter 1, DISTYPE == 1 (/// continues a command across lines in a do-file)
    regress logGRSSWK WHITE i.AGE i.RESIDENCE i.INDUSTRY i.EDUCATION i.WORKREGION ///
        i.JOBTENURE if DISTYPE == 1 & quarter == 1

    * Quarter 5, DISTYPE == 1
    regress logGRSSWK WHITE i.AGE i.RESIDENCE i.INDUSTRY i.EDUCATION ///
        i.WORKREGION i.JOBTENURE if DISTYPE == 1 & quarter == 5
    However, I am struggling to understand what 'pooled' would refer to here. This may be a foolish question, but any help would be greatly appreciated.

  • #2
    Hi Will,
    The information you need is in the "oaxaca" help file:
    pooled[(model_opts)] computes the twofold decomposition by using the coefficients from a pooled model over both groups as the reference coefficients. groupvar is included in the pooled model as an additional control variable. Estimation details can be specified in parentheses; see the model1() option below.
    Look at this example:

    Code:
    use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta
    oaxaca lnwage educ exper tenure, by(female) pool noisily
    This is what it reports as the "pooled" regression:
    Code:
    Pooled model
    
          Source |       SS           df       MS      Number of obs   =     1,434
    -------------+----------------------------------   F(4, 1429)      =    107.19
           Model |  93.2691812         4  23.3172953   Prob > F        =    0.0000
        Residual |  310.850622     1,429  .217530177   R-squared       =    0.2308
    -------------+----------------------------------   Adj R-squared   =    0.2286
           Total |  404.119804     1,433  .282009633   Root MSE        =     .4664
    
    ------------------------------------------------------------------------------
          lnwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            educ |   .0847507   .0051878    16.34   0.000     .0745742    .0949272
           exper |   .0110983   .0015372     7.22   0.000     .0080828    .0141138
          tenure |   .0077084   .0018797     4.10   0.000     .0040211    .0113958
          female |  -.0841137   .0251455    -3.35   0.001    -.1334398   -.0347876
           _cons |   2.213327   .0683455    32.38   0.000     2.079259    2.347395
    ------------------------------------------------------------------------------
    You can replicate it by following the help file's own description, that is, by including the group variable as an additional regressor:
    Code:
    reg lnwage educ exper tenure female
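    To make the contrast with group-specific estimates concrete, here is a minimal sketch using the same example data (it assumes the file is still available at the URL above). The pooled model uses both groups and adds the group indicator as a control, while the group-specific models are estimated separately for each value of the by() variable:

    ```stata
    * Load the example data used above
    use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear

    * Pooled model: both groups together, group indicator included as a control
    regress lnwage educ exper tenure female

    * Group-specific models: one regression per value of the by() variable
    regress lnwage educ exper tenure if female == 0
    regress lnwage educ exper tenure if female == 1
    ```

    In this sense "pooled" means pooled over the two groups being compared, not over time periods.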
    HTH

    • #3
      Thank you for the help, Fernando. Could I just clarify one thing? I only have males in my dataset, and I want 'pooled', 'Q1' and 'Q5' OLS estimates for DISTYPE == 1 (and then the same for DISTYPE == 4). Would my pooled model then simply be the regression below, pooled over both quarters?

      Code:
       regress logGRSSWK WHITE i.AGE i.RESIDENCE... if DISTYPE == 1
      I ask because I initially thought it would be pooled over both DISTYPE == 1 and DISTYPE == 4. However, if I am also to carry out the same task for DISTYPE == 4, then I don't believe that could be the case. Many thanks, and apologies if my thinking is illogical.
      Last edited by Will Murphy; 06 Apr 2020, 15:35.

      • #4
        I don't understand.
        Can you show the cross-tabulation of DISTYPE and quarter?

        • #5
          Code:
           tab quarter DISTYPE
          
                     |         RECODE of DISEXT
             quarter |       WLD       DALD  Non-disab |     Total
          -----------+---------------------------------+----------
                   1 |       299        139        919 |     1,357
                   5 |       302         90        755 |     1,147
          -----------+---------------------------------+----------
               Total |       601        229      1,674 |     2,504
          My dependent variable, logGRSSWK, is only available for quarters 1 and 5. For clarity, I would like results that let me fill in the table below, using OLS regressions such as the two I posted in #1:
          |          DISTYPE == 1          |          DISTYPE == 4          |
          | Pooled | Quarter 1 | Quarter 5 | Pooled | Quarter 1 | Quarter 5 |
          Many thanks.
          Last edited by Will Murphy; 06 Apr 2020, 16:23.

          • #6
            This?
            Code:
            regress logGRSSWK WHITE i.AGE i.RESIDENCE... if DISTYPE == 1 & (quarter==1 | quarter==5)
            regress logGRSSWK WHITE i.AGE i.RESIDENCE... if DISTYPE == 1 & quarter==1
            regress logGRSSWK WHITE i.AGE i.RESIDENCE... if DISTYPE == 1 & quarter==5
            regress logGRSSWK WHITE i.AGE i.RESIDENCE... if DISTYPE == 4 & (quarter==1 | quarter==5)
            regress logGRSSWK WHITE i.AGE i.RESIDENCE... if DISTYPE == 4 & quarter==1
            regress logGRSSWK WHITE i.AGE i.RESIDENCE... if DISTYPE == 4 & quarter==5
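            The same six regressions could also be run in a loop and collected into one table. This is only a sketch: it assumes the full covariate list is the one given in #1, that the dependent variable is named logGRSSWK as in #1, and it must be run from a do-file so the local macro is defined. The stored-estimate names (d1_pooled and so on) are made up for illustration.

            ```stata
            * Covariate list copied from the regressions in #1 (assumed to be complete)
            local xvars WHITE i.AGE i.RESIDENCE i.INDUSTRY i.EDUCATION i.WORKREGION i.JOBTENURE

            foreach d in 1 4 {
                * Pooled over quarters 1 and 5 for this DISTYPE category
                regress logGRSSWK `xvars' if DISTYPE == `d' & inlist(quarter, 1, 5)
                estimates store d`d'_pooled

                * Each quarter separately
                foreach q in 1 5 {
                    regress logGRSSWK `xvars' if DISTYPE == `d' & quarter == `q'
                    estimates store d`d'_q`q'
                }
            }

            * Side-by-side coefficients and standard errors for all six models
            estimates table d1_pooled d1_q1 d1_q5 d4_pooled d4_q1 d4_q5, b(%9.4f) se(%9.4f) stats(N r2)
            ```

            The columns of the resulting table line up with the Pooled / Quarter 1 / Quarter 5 layout asked for in #5.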

            • #7
              Thank you, Fernando. Apologies, I should have clarified: because logGRSSWK (my dependent variable) is missing for quarters 2, 3 and 4, I reshaped only the quarter 1 and quarter 5 versions of every variable in my equations and dropped the rest. I therefore believe the regressions with (quarter==1 | quarter==5) in #6 are equivalent to the one in #3.

              If I may ask one more thing: I know that for both the OLS and the Oaxaca decomposition comparisons of Q1 vs Q5 (regarding legislation effectiveness), the results depend on strong assumptions about quarters 2, 3 and 4, e.g. wage rigidity.

              I am already using heckman for my Oaxaca decomposition of the Q1 vs Q5 comparison. Is it possible for me to use heckman on all observations for quarters 2, 3 and 4, and so compare all five quarters of my dataset? Or do I need some values observed in these quarters to assess the wage gap decomposition adequately?
              Last edited by Will Murphy; 07 Apr 2020, 01:47.
