  • time-specific events in a fixed effect model

    I am working on a model in which the between and within effects differ, and both are of interest. I am therefore using a 'hybrid' model, which explicitly includes both the unit means and the demeaned independent variables in a random-effects model with the unaltered dependent variable. In my panel data, some variables are unit-specific (e.g., the industry a firm is in) and thus have no within variation, while others are time-specific (e.g., recessionary quarters) and thus have no between-unit variation. In this context, I want to understand how these time-specific events moderate an independent variable's association with the dependent variable. Said differently: are the within effects of an independent variable different in these time periods than in other periods? A standard fixed-effects model with this type of interaction does not appear to answer this specific question, as I will highlight below.
    I will use Stata's nlswork.dta data for ease of exposition and reproduction of my question.
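    For readers unfamiliar with the hybrid setup, the specification described above can be sketched as follows for a single regressor. The notation is an assumption for illustration and does not come from the post: x-bar_i is the unit mean of x, beta_W is the within effect, beta_B is the between effect, and u_i is the random unit effect.

    ```latex
    y_{it} = \beta_0 + \beta_W \,(x_{it} - \bar{x}_i) + \beta_B \,\bar{x}_i + u_i + \varepsilon_{it},
    \qquad \bar{x}_i = \frac{1}{T_i} \sum_{t=1}^{T_i} x_{it}
    ```

    In the regressions in this thread, dm_hours plays the role of the demeaned term and m_hours the role of the unit mean.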

    First, let me flag two years of the data as having some condition, so that I can test whether the effect of an independent variable differs in these periods. Then I will run a fixed-effects regression looking at how the interaction of this indicator with hours worked is associated with the wage; i.e., is the relationship between hours and wage different in these periods than in the other periods? The first model includes only the main effects; the second adds the interaction.

    Code:
    .
    . use https://www.stata-press.com/data/r18/nlswork.dta
    (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
    
    . gen cond=0
    
    
    . replace cond=1 if  (year==82| year==83)
    (4,072 real changes made)
    
    
    . xtreg ln_wage hours cond, fe
    
    Fixed-effects (within) regression               Number of obs     =     28,467
    Group variable: idcode                          Number of groups  =      4,710
    
    R-squared:                                      Obs per group:
         Within  = 0.0029                                         min =          1
         Between = 0.0293                                         avg =        6.0
         Overall = 0.0073                                         max =         15
    
                                                    F(2,23755)        =      34.63
    corr(u_i, Xb) = 0.0630                          Prob > F          =     0.0000
    
    ------------------------------------------------------------------------------
         ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           hours |   .0005324   .0002524     2.11   0.035     .0000378    .0010271
            cond |   .0472563   .0058123     8.13   0.000     .0358638    .0586488
           _cons |   1.649113   .0094893   173.79   0.000     1.630513    1.667712
    -------------+----------------------------------------------------------------
         sigma_u |  .42210011
         sigma_e |  .31996525
             rho |  .63507708   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0: F(4709, 23755) = 8.28                 Prob > F = 0.0000
    
    
    
    
    . xtreg ln_wage hours cond c.hours#c.cond, fe
    
    Fixed-effects (within) regression               Number of obs     =     28,467
    Group variable: idcode                          Number of groups  =      4,710
    
    R-squared:                                      Obs per group:
         Within  = 0.0031                                         min =          1
         Between = 0.0191                                         avg =        6.0
         Overall = 0.0063                                         max =         15
    
                                                    F(3,23754)        =      24.90
    corr(u_i, Xb) = 0.0531                          Prob > F          =     0.0000
    
    --------------------------------------------------------------------------------
           ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    ---------------+----------------------------------------------------------------
             hours |   .0007287   .0002661     2.74   0.006     .0002072    .0012502
              cond |   .0971296   .0221872     4.38   0.000     .0536413    .1406178
                   |
    c.hours#c.cond |   -.001381   .0005929    -2.33   0.020    -.0025431   -.0002188
                   |
             _cons |   1.641887   .0099827   164.47   0.000      1.62232    1.661453
    ---------------+----------------------------------------------------------------
           sigma_u |  .42228407
           sigma_e |  .31993546
               rho |  .63532217   (fraction of variance due to u_i)
    --------------------------------------------------------------------------------
    F test that all u_i=0: F(4709, 23754) = 8.29                 Prob > F = 0.0000
    Following Schunck (2013), I will manually demean these variables, including the hours × cond product, to see if I can replicate the above results, thus confirming that they do not answer the initial question. I will run a demeaned regression as well as a hybrid regression to verify how the fixed-effects model accounted for the interaction effect.

    Code:
    . by idcode: egen m_hours =mean(hours)
    (1 missing value generated)
    
    . by idcode: egen m_cond =mean(cond)
    
    . by idcode: egen m_ln_wage =mean(ln_wage)
    
    . gen dm_hours = hours - m_hours
    (67 missing values generated)
    
    . gen dm_cond = cond - m_cond
    
    . gen dm_ln_wage = ln_wage - m_ln_wage
    
    . gen hoursXcond = hours * cond
    (67 missing values generated)
    
    . by idcode: egen m_hoursXcond =mean(hoursXcond)
    (1 missing value generated)
    
    . gen dm_hoursXcond = hoursXcond - m_hoursXcond
    (67 missing values generated)
    First, the main effects, which are consistent with the xtreg, fe results.

    Code:
    . reg dm_ln_wage dm_hours dm_cond, nocons
    
          Source |       SS           df       MS      Number of obs   =    28,467
    -------------+----------------------------------   F(2, 28465)     =     41.34
           Model |  7.06578076         2  3.53289038   Prob > F        =    0.0000
        Residual |  2432.78881    28,465  .085465969   R-squared       =    0.0029
    -------------+----------------------------------   Adj R-squared   =    0.0028
           Total |  2439.85459    28,467  .085708174   Root MSE        =    .29235
    
    ------------------------------------------------------------------------------
      dm_ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
        dm_hours |   .0005323   .0002306     2.31   0.021     .0000803    .0009842
         dm_cond |   .0471569   .0053088     8.88   0.000     .0367513    .0575625
    ------------------------------------------------------------------------------
    
    
    . xtreg ln_wage dm_hours dm_cond, re
    
    Random-effects GLS regression                   Number of obs     =     28,467
    Group variable: idcode                          Number of groups  =      4,710
    
    R-squared:                                      Obs per group:
         Within  = 0.0029                                         min =          1
         Between = 0.0001                                         avg =        6.0
         Overall = 0.0011                                         max =         15
    
                                                    Wald chi2(2)      =      69.45
    corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
         ln_wage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
        dm_hours |   .0005324   .0002518     2.11   0.035     .0000388     .001026
         dm_cond |   .0472209   .0057998     8.14   0.000     .0358534    .0585884
           _cons |   1.656057   .0060933   271.78   0.000     1.644115       1.668
    -------------+----------------------------------------------------------------
         sigma_u |  .38532981
         sigma_e |  .31996525
             rho |  .59188767   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    
    . xtreg ln_wage dm_hours dm_cond m_hours m_cond , re
    
    Random-effects GLS regression                   Number of obs     =     28,467
    Group variable: idcode                          Number of groups  =      4,710
    
    R-squared:                                      Obs per group:
         Within  = 0.0029                                         min =          1
         Between = 0.0420                                         avg =        6.0
         Overall = 0.0244                                         max =         15
    
                                                    Wald chi2(4)      =     281.59
    corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
         ln_wage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
        dm_hours |   .0005324   .0002518     2.11   0.034     .0000389    .0010259
         dm_cond |   .0472306   .0057988     8.14   0.000     .0358652    .0585959
         m_hours |   .0099165    .000795    12.47   0.000     .0083583    .0114746
          m_cond |   .2925363   .0376106     7.78   0.000     .2188209    .3662518
           _cons |   1.258369   .0299296    42.04   0.000     1.199708     1.31703
    -------------+----------------------------------------------------------------
         sigma_u |  .37552855
         sigma_e |  .31996525
             rho |  .57938376   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    Now with the interaction effects included. The results are again very similar to xtreg, fe, but they clearly do not show how the within effects differ in these two years compared to the rest.

    Code:
    . reg dm_ln_wage dm_hours dm_cond dm_hoursXcond , nocons
    
          Source |       SS           df       MS      Number of obs   =    28,467
    -------------+----------------------------------   F(3, 28464)     =     29.60
           Model |  7.58681931         3  2.52893977   Prob > F        =    0.0000
        Residual |  2432.26777    28,464  .085450666   R-squared       =    0.0031
    -------------+----------------------------------   Adj R-squared   =    0.0030
           Total |  2439.85459    28,467  .085708174   Root MSE        =    .29232
    
    -------------------------------------------------------------------------------
       dm_ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    --------------+----------------------------------------------------------------
         dm_hours |   .0007215    .000243     2.97   0.003     .0002452    .0011977
          dm_cond |   .0952231   .0201762     4.72   0.000     .0556768    .1347695
    dm_hoursXcond |  -.0013318   .0005393    -2.47   0.014     -.002389   -.0002747
    -------------------------------------------------------------------------------
    
    . xtreg ln_wage dm_hours dm_cond dm_hoursXcond m_hours m_cond m_hoursXcond , re
    
    Random-effects GLS regression                   Number of obs     =     28,467
    Group variable: idcode                          Number of groups  =      4,710
    
    R-squared:                                      Obs per group:
         Within  = 0.0031                                         min =          1
         Between = 0.0457                                         avg =        6.0
         Overall = 0.0256                                         max =         15
    
                                                    Wald chi2(6)      =     302.07
    corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
    
    -------------------------------------------------------------------------------
          ln_wage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    --------------+----------------------------------------------------------------
         dm_hours |   .0007253   .0002655     2.73   0.006     .0002051    .0012456
          dm_cond |   .0962418   .0221264     4.35   0.000     .0528749    .1396087
    dm_hoursXcond |  -.0013581   .0005913    -2.30   0.022     -.002517   -.0001992
          m_hours |   .0082031   .0009138     8.98   0.000      .006412    .0099941
           m_cond |  -.0703764   .1031531    -0.68   0.495    -.2725527    .1317999
     m_hoursXcond |   .0106746   .0028241     3.78   0.000     .0051394    .0162097
            _cons |   1.318755   .0338768    38.93   0.000     1.252358    1.385152
    --------------+----------------------------------------------------------------
          sigma_u |  .37463591
          sigma_e |  .31993546
              rho |  .57826882   (fraction of variance due to u_i)
    -------------------------------------------------------------------------------
    
    .
    How can I instead investigate the differential effect of an independent variable across time periods that are flagged with a binary indicator? In the above example, how might hours have affected wages differently in 1982 and 1983 (cond=1) than in all other years?

  • #2
    Thinking more about this, I should have kept only the observations used in the original xtreg, so that the demeaned variables are computed from the same sample. Once the samples are identical, the coefficients from the manually demeaned regression match those from the fixed-effects panel regression.
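    The post does not show how the sample was restricted. A minimal sketch of one way to do it, assuming a fresh copy of the data and using e(sample) from the xtreg run, might look like this (the inlist() line is just a compact equivalent of the gen/replace pair in post #1):

    ```stata
    * Sketch only: restrict to the estimation sample, then rebuild the demeaned variables.
    use https://www.stata-press.com/data/r18/nlswork.dta, clear
    generate byte cond = inlist(year, 82, 83)   // same flag as in post #1

    quietly xtreg ln_wage hours, fe
    keep if e(sample)                           // drop observations xtreg did not use

    * Unit means and deviations, now computed on the estimation sample only
    bysort idcode: egen m_hours   = mean(hours)
    bysort idcode: egen m_ln_wage = mean(ln_wage)
    generate dm_hours   = hours   - m_hours
    generate dm_ln_wage = ln_wage - m_ln_wage
    ```

    Computing the means after keep if e(sample) is the point of the exercise: rows with missing hours no longer contribute to m_ln_wage.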

    Code:
    . xtreg ln_wage hours , fe
    
    Fixed-effects (within) regression               Number of obs     =     28,467
    Group variable: idcode                          Number of groups  =      4,710
    
    R-squared:                                      Obs per group:
         Within  = 0.0001                                         min =          1
         Between = 0.0314                                         avg =        6.0
         Overall = 0.0074                                         max =         15
    
                                                    F(1,23756)        =       3.14
    corr(u_i, Xb) = 0.0976                          Prob > F          =     0.0764
    
    ------------------------------------------------------------------------------
         ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           hours |   .0004474   .0002525     1.77   0.076    -.0000475    .0009423
           _cons |   1.658941   .0094249   176.02   0.000     1.640468    1.677415
    -------------+----------------------------------------------------------------
         sigma_u |   .4229084
         sigma_e |  .32040339
             rho |  .63532952   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0: F(4709, 23756) = 8.30                 Prob > F = 0.0000
    
    . reg dm_ln_wage dm_hours  , nocons
    
          Source |       SS           df       MS      Number of obs   =    28,467
    -------------+----------------------------------   F(1, 28466)     =      3.76
           Model |  .322304231         1  .322304231   Prob > F        =    0.0524
        Residual |  2438.75131    28,466  .085672427   R-squared       =    0.0001
    -------------+----------------------------------   Adj R-squared   =    0.0001
           Total |  2439.07362    28,467   .08568074   Root MSE        =     .2927
    
    ------------------------------------------------------------------------------
      dm_ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
        dm_hours |   .0004474   .0002307     1.94   0.052    -4.72e-06    .0008995
    ------------------------------------------------------------------------------
    Then I can include the time-specific indicator in the demeaned regression.

    Code:
    
    . reg dm_ln_wage c.dm_hours##c.cond, nocons
    
          Source |       SS           df       MS      Number of obs   =    28,467
    -------------+----------------------------------   F(3, 28464)     =     25.59
           Model |  6.56150276         3  2.18716759   Prob > F        =    0.0000
        Residual |  2432.51211    28,464  .085459251   R-squared       =    0.0027
    -------------+----------------------------------   Adj R-squared   =    0.0026
           Total |  2439.07362    28,467   .08568074   Root MSE        =    .29233
    
    -----------------------------------------------------------------------------------
           dm_ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    ------------------+----------------------------------------------------------------
             dm_hours |   .0008318   .0002463     3.38   0.001      .000349    .0013147
                 cond |   .0338046   .0046185     7.32   0.000     .0247521    .0428571
                      |
    c.dm_hours#c.cond |  -.0025824   .0006991    -3.69   0.000    -.0039526   -.0012123
    -----------------------------------------------------------------------------------
    This indicates that during the time periods flagged by 'cond', wages were higher but the within effect of hours worked was reduced; i.e., working more hours than one's own average during these periods is associated with lower wages, and working fewer hours with higher wages. However, it is well documented that manually demeaning in this fashion yields incorrect standard errors compared to a true fixed-effects model.
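    To read the hours slope for the flagged years directly, rather than adding coefficients by hand, one option is lincom after the demeaned regression. This is only a sketch and assumes dm_ln_wage, dm_hours and cond are in memory as above:

    ```stata
    * Sketch: within slope of hours when cond==1 (main effect + interaction).
    quietly regress dm_ln_wage c.dm_hours##c.cond, nocons
    lincom dm_hours + c.dm_hours#c.cond
    ```

    From the coefficients reported above, the point estimate is .0008318 - .0025824 = -.0017506. The standard error that lincom reports here inherits the degrees-of-freedom problem of the manually demeaned regression.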

    Is there any way to correct the standard errors from a manually demeaned regression so that they are consistent with those from a fixed-effects panel model?



    • #3
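      Never mind, I figured it out. xtreg, fe has fewer residual degrees of freedom because it explicitly accounts for each entity's fixed effect. Corrected se = se * sqrt(n-1)/sqrt(n-1-i), where n = number of observations, i = number of individual entities for the fixed effect, and the 1 is the single regressor in the hours-only regression above (with k regressors, use n-k and n-k-i).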
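      The correction can be sketched in Stata by taking the residual degrees of freedom from each fit rather than typing them in. This assumes both models are run on the same sample and that dm_ln_wage and dm_hours exist as in post #2; the scalar names are illustrative:

      ```stata
      * Sketch: rescale a manually demeaned SE to the xtreg, fe degrees of freedom.
      quietly xtreg ln_wage hours, fe
      scalar df_fe = e(df_r)                 // N - (number of groups) - K

      quietly regress dm_ln_wage dm_hours, nocons
      scalar df_ols = e(df_r)                // N - K
      display "corrected se = " _se[dm_hours] * sqrt(df_ols / df_fe)
      ```

      With the output in post #2 (28,466 and 23,756 residual degrees of freedom), the factor is sqrt(28466/23756), about 1.0947, which takes the regress standard error of .0002307 to roughly .0002525, the value xtreg, fe reports.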
