
  • AGE variable omitted in fixed effects regression with reghdfe but not with xtreg, fe

    Hello Statalist,

    I have a question regarding the omission of a variable in my regression with panel data and year + firm fixed effects.

    It is an AGE variable that aims to capture how old a firm is. For example, if the firm is listed in 2019 for the first time, the AGE variable is set to 0 in 2019, to 1 in 2020, and so forth.

    There are several papers which include this age variable, and I found that it is left out of my analysis depending on which command I use.

    Code:
    reghdfe Y1 x1 x2 x3 AGE c2 c3 c4 c5 c6 c7 c8 c9, absorb(FIRM Year) cluster(BUSINESSECTOR)
      AGE |          0  (omitted)
    
    reghdfe Y1 x1 x2 x3 AGE_w c2 c3 c4 c5 c6 c7 c8 c9, absorb(FIRM Year) cluster(BUSINESSECTOR)
     AGE_w |     .10504   .0636969     1.65   0.112    -.0261462    .2362262
    
    xtreg  Y1 x1 x2 x3 AGE c2 c3 c4 c5 c6 c7 c8 c9 i.Year, fe vce(cluster BUSINESSECTOR)
     AGE |   .0701199   .0075331     9.31   0.000     .0546052    .0856346

    For the second regression I winsorized the variable, which changes 5 out of 1000 values, specifically those where AGE is zero (i.e. the first year a firm is listed).
    Why does xtreg keep the AGE variable while reghdfe omits it? I also tried adding 1 to the AGE variable, but reghdfe still omits it.

    Best regards

  • #2
    Luca:
    it depends on the way -age- is defined.
    If I tell Stata that my firm was founded in 1984 (and this value holds for all the observations the panel is composed of), this variable will be wiped out by the -fe- estimator.
    In addition to my guess-work, posting what you typed and what Stata gave you back, including the outcome tables (as per FAQ), would help enormously. Thanks.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Dear Carlo,

      I did not want to lengthen the post. Here are the full outcome tables:

      Code:
      . reghdfe Y1 x1 x2 x3 AGE c2 c3 c4 c5 c6 c7 c8 c9, absorb(FIRM Year) cluster(BUSINESSECTOR)
      (dropped 4 singleton observations)
      note: AGE is probably collinear with the fixed effects (all partialled-out values are close to zero; tol = 1.0e-09)
      (MWFE estimator converged in 6 iterations)
      note: AGE omitted because of collinearity
      
      HDFE Linear regression                            Number of obs   =        944
      Absorbing 2 HDFE groups                           F(  11,     25) =       8.94
      Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                        R-squared       =     0.8609
                                                        Adj R-squared   =     0.8173
                                                        Within R-sq.    =     0.0586
      Number of clusters (BUSINESSECTOR) =         26   Root MSE        =     0.1304
      
                               (Std. err. adjusted for 26 clusters in BUSINESSECTOR)
      ------------------------------------------------------------------------------
                   |               Robust
                Y1 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
                x1 |   .0263257   .0192146     1.37   0.183    -.0132475    .0658989
                x2 |  -.0210715    .014555    -1.45   0.160    -.0510481    .0089051
                x3 |  -.0019838   .0005888    -3.37   0.002    -.0031965   -.0007712
               AGE |          0  (omitted)
                c2 |  -.0177866   .0095819    -1.86   0.075    -.0375209    .0019477
                c3 |   .0004097   .0052188     0.08   0.938    -.0103386     .011158
                c4 |   .0055606   .0224549     0.25   0.806    -.0406862    .0518073
                c5 |   .0400744   .0306181     1.31   0.202    -.0229848    .1031335
                c6 |  -.0003865   .0011555    -0.33   0.741    -.0027663    .0019933
                c7 |  -.1230092   .0740836    -1.66   0.109    -.2755872    .0295688
                c8 |   .0000965   .0000797     1.21   0.237    -.0000676    .0002606
                c9 |   .0065437   .0030366     2.15   0.041     .0002896    .0127977
             _cons |  -1.405528   .5111448    -2.75   0.011     -2.45825   -.3528057
      ------------------------------------------------------------------------------
      
      . reghdfe Y1 x1 x2 x3 AGE_w c2 c3 c4 c5 c6 c7 c8 c9, absorb(FIRM Year) cluster(BUSINESSECTOR)
      (dropped 4 singleton observations)
      note: AGE_w is probably collinear with the fixed effects (all partialled-out values are close to zero; tol = 1.0e-09)
      (MWFE estimator converged in 6 iterations)
      note: AGE_w omitted because of collinearity
      
      HDFE Linear regression                            Number of obs   =        944
      Absorbing 2 HDFE groups                           F(  11,     25) =       8.94
      Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                        R-squared       =     0.8609
                                                        Adj R-squared   =     0.8173
                                                        Within R-sq.    =     0.0586
      Number of clusters (BUSINESSECTOR) =         26   Root MSE        =     0.1304
      
                               (Std. err. adjusted for 26 clusters in BUSINESSECTOR)
      ------------------------------------------------------------------------------
                   |               Robust
                Y1 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
                x1 |   .0263257   .0192146     1.37   0.183    -.0132475    .0658989
                x2 |  -.0210715    .014555    -1.45   0.160    -.0510481    .0089051
                x3 |  -.0019838   .0005888    -3.37   0.002    -.0031965   -.0007712
             AGE_w |          0  (omitted)
                c2 |  -.0177866   .0095819    -1.86   0.075    -.0375209    .0019477
                c3 |   .0004097   .0052188     0.08   0.938    -.0103386     .011158
                c4 |   .0055606   .0224549     0.25   0.806    -.0406862    .0518073
                c5 |   .0400744   .0306181     1.31   0.202    -.0229848    .1031335
                c6 |  -.0003865   .0011555    -0.33   0.741    -.0027663    .0019933
                c7 |  -.1230092   .0740836    -1.66   0.109    -.2755872    .0295688
                c8 |   .0000965   .0000797     1.21   0.237    -.0000676    .0002606
                c9 |   .0065437   .0030366     2.15   0.041     .0002896    .0127977
             _cons |  -1.405528   .5111448    -2.75   0.011     -2.45825   -.3528057
      ------------------------------------------------------------------------------
      
      . xtreg  Y1 x1 x2 x3 AGE c2 c3 c4 c5 c6 c7 c8 c9 i.Year, fe vce(cluster BUSINESSECTOR)
      note: 2021.Year omitted because of collinearity
      
      Fixed-effects (within) regression               Number of obs      =       948
      Group variable: FIRM                            Number of groups   =       214
      
      R-sq:  Within  = 0.4600                         Obs per group: min =         1
             Between = 0.0493                                        avg =       4.4
             Overall = 0.0165                                        max =         5
      
                                                      F(15,25)           =    135.23
      corr(u_i, Xb)  = -0.9009                        Prob > F           =    0.0000
      
                               (Std. err. adjusted for 26 clusters in BUSINESSECTOR)
      ------------------------------------------------------------------------------
                   |               Robust
                Y1 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
                x1 |   .0263257   .0192139     1.37   0.183    -.0132462    .0658976
                x2 |  -.0210715   .0145545    -1.45   0.160    -.0510471     .008904
                x3 |  -.0019838   .0005888    -3.37   0.002    -.0031964   -.0007713
               AGE |   .0702716   .0075243     9.34   0.000      .054775    .0857683
                c2 |  -.0177866   .0095816    -1.86   0.075    -.0375202    .0019471
                c3 |   .0004097   .0052186     0.08   0.938    -.0103382    .0111577
                c4 |   .0055606   .0224541     0.25   0.806    -.0406846    .0518057
                c5 |   .0400744    .030617     1.31   0.202    -.0229826    .1031314
                c6 |  -.0003865   .0011555    -0.33   0.741    -.0027663    .0019932
                c7 |  -.1230092   .0740811    -1.66   0.109     -.275582    .0295636
                c8 |   .0000965   .0000797     1.21   0.237    -.0000676    .0002606
                c9 |   .0065437   .0030365     2.15   0.041     .0002898    .0127975
                   |
              Year |
             2018  |  -.0288623   .0130428    -2.21   0.036    -.0557245   -.0020001
             2019  |  -.0349374   .0110424    -3.16   0.004    -.0576797   -.0121951
             2020  |   .0417026   .0108298     3.85   0.001     .0193982     .064007
             2021  |          0  (omitted)
                   |
             _cons |  -2.863559   .4736202    -6.05   0.000    -3.838998    -1.88812
      -------------+----------------------------------------------------------------
           sigma_u |  .67287709
           sigma_e |  .13030864
               rho |  .96385192   (fraction of variance due to u_i)
      ------------------------------------------------------------------------------
      Regarding the definition:
      I guess it is as you defined it; is there any other way? A firm's inception date stays the same every year, and AGE increases by 1 each year.

      What I mean is, every paper defines it as "the number of years since the security is covered by the database".
      Last edited by Luca Haseney; 12 Jan 2023, 06:47.



      • #4
        Luca:
        a possible explanation is that, in -reghdfe-, -age- is perfectly collinear with the -year- fixed effects that you explicitly want to include.
        Kind regards,
        Carlo
        (Stata 19.0)
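
        The collinearity can be written out in a short sketch. The notation is assumed here for illustration and does not come from the thread: s_i is the first year firm i is listed, \alpha_i is the firm effect, \lambda_t is the year effect, and the other regressors are left out for brevity.

        ```latex
        \begin{align*}
        \mathrm{AGE}_{it} &= \mathrm{Year}_t - s_i, \\
        Y_{it} &= \beta\,\mathrm{AGE}_{it} + \alpha_i + \lambda_t + \varepsilon_{it} \\
               &= (\alpha_i - \beta s_i) + (\lambda_t + \beta\,\mathrm{Year}_t) + \varepsilon_{it}.
        \end{align*}
        ```

        The first bracket is just another firm effect and the second is just another year effect, so \beta cannot be separated from them. This relies on AGE rising by exactly one per calendar year for every firm; a nonlinear transform such as ln(1 + AGE) would not be an exact linear combination of the two sets of effects.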



        • #5
          Dear Carlo,

          I am pretty sure it is. But I wonder why it happens only with reghdfe and not with xtreg. And further, how can all the papers include it together with time and firm fixed effects?

          Cross posted here:
          https://stats.stackexchange.com/ques...1115307_601741
          Last edited by Luca Haseney; 12 Jan 2023, 10:56. Reason: added cross posting link



          • #6
            Hi Luca,
            Both reghdfe and xtreg, fe actually have the same problem, but they handle it differently.
            xtreg drops one more of the year fixed effects but keeps age.

            reghdfe, in contrast, drops age because it "keeps" all year fixed effects.

            In fact, I believe that if you list the year fixed effects first and then age (with xtreg), you will get the same result as with reghdfe: age being dropped.
            HTH
            F
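
            A minimal sketch of how this could be checked on simulated data. Everything here is made up for illustration (50 hypothetical firms, the variable first_year, the coefficients in the data-generating line); it is not the original poster's data, and which term Stata ends up omitting in each xtreg call should be read off the output rather than assumed.

            ```stata
            * Simulated balanced panel: 50 hypothetical firms observed 2017-2021
            clear
            set seed 12345
            set obs 50
            generate FIRM = _n
            generate first_year = 2012 + mod(FIRM, 5)   // hypothetical first listing year, constant within firm
            expand 5
            bysort FIRM: generate Year = 2016 + _n      // 2017, ..., 2021 for every firm
            generate AGE = Year - first_year            // rises by exactly 1 per year
            generate x1 = rnormal()
            generate Y1 = 0.5*x1 + 0.07*AGE + rnormal() // made-up data-generating process

            xtset FIRM Year

            * AGE listed before the year dummies
            xtreg Y1 x1 AGE i.Year, fe

            * Year dummies listed before AGE
            xtreg Y1 x1 i.Year AGE, fe

            * reghdfe is community-contributed (ssc install reghdfe; it may also need ftools)
            * Both sets of fixed effects are absorbed before AGE is considered
            reghdfe Y1 x1 AGE, absorb(FIRM Year)
            ```

            Comparing the notes about omitted terms across the three outputs shows which variable each command sacrifices to break the same collinearity. The coefficient on x1 should be the same in all three, since the fitted model is identical.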
