
  • xtreg, fe vs. factor variable inclusion: (Factor) Variables omitted because of collinearity (dummy variable trap?)

    Dear Stata-Listers,

    INFO:
    I conduct research on the part of CEO compensation that is unexplained by firm performance (variable UCOMP). To analyse whether these unexplained components of compensation are informative about future firm performance, I want to run a regression with industry and year fixed effects.

    The dataset has been trimmed to the fiscal years (fyear) 2000-2005, as there is a regulatory change in 2003 that I want to use to introduce exogeneity and a difference-in-differences perspective into the analysis (which, in a second step, should be extended further). The indicator variable POST equals 1 for the time frame 2003-2005 and 0 for 2000-2002. Moreover, as I need to establish a certain level of CEO tenure to wipe out effects related to the first year in office and to establish a first difference, observations are only included in the sample if tenure is >= 3 years. This means that for each firm only some fiscal years are included, which results in an unbalanced panel with gaps.
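
    For concreteness, here is a minimal sketch of how this sample construction looks in principle (the variable name tenure is only a placeholder for my CEO-tenure measure; POST is built from fyear):

    Code:
    * illustrative sketch only: "tenure" stands in for the actual CEO-tenure variable
    keep if inrange(fyear, 2000, 2005)
    gen byte POST = inrange(fyear, 2003, 2005)
    keep if tenure >= 3 & !missing(tenure)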

    Code:
    xtset
    HTML Code:
    . xtset
           panel variable:  gvkey (unbalanced)
            time variable:  fyear, 2000 to 2005, but with gaps
                    delta:  1 year
    PROBLEM:
    According to what I have read about running regressions in Stata, xtreg, fe should produce the same results as reg with the corresponding factor variables included. This is, however, not the case here, which I suspect is due to the fact that xtreg, fe uses gvkey (the firm id) as the panel variable, i.e. it produces firm fixed effects instead of industry fixed effects. Am I right? Note that I cannot xtset the industry variable (sic_Comp_2d contains 2-digit SIC codes classifying each firm's industry) as it is not a unique identifier, I suspect.

    Code:
    xtset sic_Comp_2d fyear
    HTML Code:
    . xtset sic_Comp_2d fyear
    repeated time values within panel
    r(451);
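
    As a quick check of why sic_Comp_2d cannot serve as a panel identifier (a minimal sketch; many industry-year cells contain more than one firm, whereas firm-year should be unique):

    Code:
    * sic_Comp_2d/fyear pairs are heavily duplicated, so they cannot identify observations
    duplicates report sic_Comp_2d fyear
    * the firm-year combination, by contrast, should be unique
    isid gvkey fyear
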
    When running the regression with xtreg, fe, all industry dummies are omitted.
    Code:
    xtreg D_ROE_lead1_win c.UCOMP##i.POST D_RET_win D_ROE_win D_logSALES_by2002_win i.sic_Comp_2d, fe vce(r)
    Note: the prefix "D_" indicates that a variable is in first-difference form. I generated these differences before applying the tenure restriction (which cuts the sample size), so that Stata would not compute first differences relative to the last available observation rather than the immediately preceding fiscal year (the tenure precondition creates gaps in the data, as described above).
    HTML Code:
    . xtreg D_ROE_lead1_win c.UCOMP##i.POST D_RET_win D_ROE_win D_logSALES_by2002_win i.sic_Comp_2d, fe vce(r)
    note: 10.sic_Comp_2d omitted because of collinearity
    note: 13.sic_Comp_2d omitted because of collinearity
    note: 14.sic_Comp_2d omitted because of collinearity
    note: 15.sic_Comp_2d omitted because of collinearity
    note: 16.sic_Comp_2d omitted because of collinearity
    note: 20.sic_Comp_2d omitted because of collinearity
    note: 21.sic_Comp_2d omitted because of collinearity
    note: 22.sic_Comp_2d omitted because of collinearity
    note: 23.sic_Comp_2d omitted because of collinearity
    note: 24.sic_Comp_2d omitted because of collinearity
    note: 25.sic_Comp_2d omitted because of collinearity
    note: 26.sic_Comp_2d omitted because of collinearity
    note: 27.sic_Comp_2d omitted because of collinearity
    note: 28.sic_Comp_2d omitted because of collinearity
    note: 29.sic_Comp_2d omitted because of collinearity
    note: 30.sic_Comp_2d omitted because of collinearity
    note: 31.sic_Comp_2d omitted because of collinearity
    note: 32.sic_Comp_2d omitted because of collinearity
    note: 33.sic_Comp_2d omitted because of collinearity
    note: 34.sic_Comp_2d omitted because of collinearity
    note: 35.sic_Comp_2d omitted because of collinearity
    note: 36.sic_Comp_2d omitted because of collinearity
    note: 37.sic_Comp_2d omitted because of collinearity
    note: 38.sic_Comp_2d omitted because of collinearity
    note: 39.sic_Comp_2d omitted because of collinearity
    note: 40.sic_Comp_2d omitted because of collinearity
    note: 42.sic_Comp_2d omitted because of collinearity
    note: 44.sic_Comp_2d omitted because of collinearity
    note: 45.sic_Comp_2d omitted because of collinearity
    note: 47.sic_Comp_2d omitted because of collinearity
    note: 48.sic_Comp_2d omitted because of collinearity
    note: 49.sic_Comp_2d omitted because of collinearity
    note: 50.sic_Comp_2d omitted because of collinearity
    note: 51.sic_Comp_2d omitted because of collinearity
    note: 52.sic_Comp_2d omitted because of collinearity
    note: 53.sic_Comp_2d omitted because of collinearity
    note: 54.sic_Comp_2d omitted because of collinearity
    note: 55.sic_Comp_2d omitted because of collinearity
    note: 56.sic_Comp_2d omitted because of collinearity
    note: 57.sic_Comp_2d omitted because of collinearity
    note: 58.sic_Comp_2d omitted because of collinearity
    note: 59.sic_Comp_2d omitted because of collinearity
    note: 60.sic_Comp_2d omitted because of collinearity
    note: 61.sic_Comp_2d omitted because of collinearity
    note: 62.sic_Comp_2d omitted because of collinearity
    note: 63.sic_Comp_2d omitted because of collinearity
    note: 64.sic_Comp_2d omitted because of collinearity
    note: 67.sic_Comp_2d omitted because of collinearity
    note: 70.sic_Comp_2d omitted because of collinearity
    note: 72.sic_Comp_2d omitted because of collinearity
    note: 73.sic_Comp_2d omitted because of collinearity
    note: 75.sic_Comp_2d omitted because of collinearity
    note: 78.sic_Comp_2d omitted because of collinearity
    note: 79.sic_Comp_2d omitted because of collinearity
    note: 80.sic_Comp_2d omitted because of collinearity
    note: 82.sic_Comp_2d omitted because of collinearity
    note: 83.sic_Comp_2d omitted because of collinearity
    note: 87.sic_Comp_2d omitted because of collinearity
    note: 99.sic_Comp_2d omitted because of collinearity
    
    Fixed-effects (within) regression               Number of obs     =      4,387
    Group variable: gvkey                           Number of groups  =        946
    
    R-sq:                                           Obs per group:
         within  = 0.1543                                         min =          1
         between = 0.0841                                         avg =        4.6
         overall = 0.1266                                         max =          6
    
                                                    F(6,945)          =      25.44
    corr(u_i, Xb)  = -0.1115                        Prob > F          =     0.0000
    
                                             (Std. Err. adjusted for 946 clusters in gvkey)
    ---------------------------------------------------------------------------------------
                          |               Robust
          D_ROE_lead1_win |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ----------------------+----------------------------------------------------------------
                    UCOMP |   .0384001   .0151404     2.54   0.011     .0086874    .0681127
                   1.POST |   .0384032   .0066631     5.76   0.000      .025327    .0514795
                          |
             POST#c.UCOMP |
                       1  |  -.0362169   .0188553    -1.92   0.055    -.0732199    .0007862
                          |
                D_RET_win |   .0318201   .0052504     6.06   0.000     .0215164    .0421238
                D_ROE_win |  -.3665595   .0403363    -9.09   0.000    -.4457185   -.2874004
    D_logSALES_by2002_win |  -.0649946   .0282786    -2.30   0.022    -.1204908   -.0094984
                          |
              sic_Comp_2d |
                      10  |          0  (omitted)
                      13  |          0  (omitted)
                      14  |          0  (omitted)
                      15  |          0  (omitted)
                      16  |          0  (omitted)
                      20  |          0  (omitted)
                      21  |          0  (omitted)
                      22  |          0  (omitted)
                      23  |          0  (omitted)
                      24  |          0  (omitted)
                      25  |          0  (omitted)
                      26  |          0  (omitted)
                      27  |          0  (omitted)
                      28  |          0  (omitted)
                      29  |          0  (omitted)
                      30  |          0  (omitted)
                      31  |          0  (omitted)
                      32  |          0  (omitted)
                      33  |          0  (omitted)
                      34  |          0  (omitted)
                      35  |          0  (omitted)
                      36  |          0  (omitted)
                      37  |          0  (omitted)
                      38  |          0  (omitted)
                      39  |          0  (omitted)
                      40  |          0  (omitted)
                      42  |          0  (omitted)
                      44  |          0  (omitted)
                      45  |          0  (omitted)
                      47  |          0  (omitted)
                      48  |          0  (omitted)
                      49  |          0  (omitted)
                      50  |          0  (omitted)
                      51  |          0  (omitted)
                      52  |          0  (omitted)
                      53  |          0  (omitted)
                      54  |          0  (omitted)
                      55  |          0  (omitted)
                      56  |          0  (omitted)
                      57  |          0  (omitted)
                      58  |          0  (omitted)
                      59  |          0  (omitted)
                      60  |          0  (omitted)
                      61  |          0  (omitted)
                      62  |          0  (omitted)
                      63  |          0  (omitted)
                      64  |          0  (omitted)
                      67  |          0  (omitted)
                      70  |          0  (omitted)
                      72  |          0  (omitted)
                      73  |          0  (omitted)
                      75  |          0  (omitted)
                      78  |          0  (omitted)
                      79  |          0  (omitted)
                      80  |          0  (omitted)
                      82  |          0  (omitted)
                      83  |          0  (omitted)
                      87  |          0  (omitted)
                      99  |          0  (omitted)
                          |
                    _cons |  -.0191775   .0036551    -5.25   0.000    -.0263505   -.0120045
    ----------------------+----------------------------------------------------------------
                  sigma_u |   .0848593
                  sigma_e |  .19574748
                      rho |  .15820274   (fraction of variance due to u_i)
    ---------------------------------------------------------------------------------------
    
    . 
    I do understand, if anything, that this result is only logical: each industry comprises several firms, and from the firm-level (within) perspective the industry dummies are constants that get absorbed and are therefore omitted. The problem now is that the factor variable 2005.fyear is omitted if I use the reg command with factor variables instead, as shown in the regression further below.
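
    (As a quick sanity check of this absorption logic – a minimal sketch, assuming each gvkey keeps a single 2-digit SIC code over the sample window:)

    Code:
    * the industry code should be constant within each firm,
    * which is why the firm fixed effects absorb the industry dummies
    bysort gvkey (sic_Comp_2d): assert sic_Comp_2d[1] == sic_Comp_2d[_N]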

    Code:
    reg D_ROE_lead1_win c.UCOMP##i.POST D_RET_win D_ROE_win D_logSALES_by2002_win i.sic_Comp_2d i.fyear, vce(r)
    HTML Code:
    . reg D_ROE_lead1_win c.UCOMP##i.POST D_RET_win D_ROE_win D_logSALES_by2002_win i.sic_Comp_2d i.fyear, vce(r)
    note: 2005.fyear omitted because of collinearity
    
    Linear regression                               Number of obs     =      4,387
                                                    F(69, 4317)       =       6.76
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.1559
                                                    Root MSE          =     .18604
    
    ---------------------------------------------------------------------------------------
                          |               Robust
          D_ROE_lead1_win |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ----------------------+----------------------------------------------------------------
                    UCOMP |   .0392188   .0124657     3.15   0.002     .0147797    .0636579
                   1.POST |   .0662009   .0094289     7.02   0.000     .0477155    .0846863
                          |
             POST#c.UCOMP |
                       1  |   -.028908   .0162057    -1.78   0.075    -.0606794    .0028634
                          |
                D_RET_win |   .0384284    .006121     6.28   0.000     .0264281    .0504287
                D_ROE_win |  -.3370546   .0424834    -7.93   0.000     -.420344   -.2537652
    D_logSALES_by2002_win |  -.0193767   .0216582    -0.89   0.371    -.0618379    .0230845
                          |
              sic_Comp_2d |
                      10  |   .2436253   .1580442     1.54   0.123    -.0662224     .553473
                      13  |   .1139722   .0238269     4.78   0.000     .0672592    .1606853
                      14  |   .1044493   .0233721     4.47   0.000     .0586279    .1502707
                      15  |   .0847084   .0245002     3.46   0.001     .0366754    .1327414
                      16  |   .0743151   .0263457     2.82   0.005     .0226641    .1259662
                      20  |   .0680565   .0274059     2.48   0.013     .0143269    .1217862
                      21  |   .0943653   .2404107     0.39   0.695    -.3769632    .5656938
                      22  |   .1015676   .0380867     2.67   0.008      .026898    .1762371
                      23  |   .0848705   .0238377     3.56   0.000     .0381364    .1316046
                      24  |   .0723909   .0450312     1.61   0.108    -.0158935    .1606752
                      25  |   .0419734    .040647     1.03   0.302    -.0377156    .1216624
                      26  |   .0652542   .0317417     2.06   0.040     .0030242    .1274841
                      27  |   .0824918   .0384155     2.15   0.032     .0071777    .1578059
                      28  |   .0929376   .0261299     3.56   0.000     .0417096    .1441655
                      29  |   .1157699   .0265928     4.35   0.000     .0636343    .1679054
                      30  |   .0748228   .0489669     1.53   0.127    -.0211775     .170823
                      31  |   .0885334   .0245154     3.61   0.000     .0404707    .1365961
                      32  |   .0510022   .0456162     1.12   0.264    -.0384291    .1404334
                      33  |   .1298256   .0300498     4.32   0.000     .0709125    .1887387
                      34  |   .0865543    .024173     3.58   0.000     .0391627    .1339458
                      35  |    .093948   .0247596     3.79   0.000     .0454064    .1424896
                      36  |   .0741927    .024938     2.98   0.003     .0253014     .123084
                      37  |   .0903146     .02693     3.35   0.001      .037518    .1431112
                      38  |    .082673   .0245323     3.37   0.001      .034577     .130769
                      39  |   .1010992   .0340923     2.97   0.003     .0342608    .1679377
                      40  |    .084477   .0286971     2.94   0.003      .028216    .1407379
                      42  |   .0976769   .0334222     2.92   0.003     .0321523    .1632015
                      44  |   .1074021   .0249236     4.31   0.000      .058539    .1562651
                      45  |   .1135235   .0319811     3.55   0.000     .0508241     .176223
                      47  |   .0975938   .0288536     3.38   0.001      .041026    .1541616
                      48  |   .0645261    .042979     1.50   0.133    -.0197348    .1487869
                      49  |   .0887147   .0233845     3.79   0.000     .0428691    .1345602
                      50  |   .1040087    .023811     4.37   0.000     .0573268    .1506906
                      51  |   .0643401    .041928     1.53   0.125    -.0178603    .1465406
                      52  |    .105319   .0247658     4.25   0.000     .0567653    .1538728
                      53  |   .0968133   .0240522     4.03   0.000     .0496587    .1439679
                      54  |   .1008545   .0456592     2.21   0.027      .011339      .19037
                      55  |   .1098666   .0266982     4.12   0.000     .0575245    .1622087
                      56  |   .0860991    .023642     3.64   0.000     .0397487    .1324496
                      57  |   .0505561   .0381876     1.32   0.186    -.0243112    .1254234
                      58  |   .0882977   .0239088     3.69   0.000      .041424    .1351713
                      59  |   .0826475   .0255252     3.24   0.001     .0326051    .1326899
                      60  |   .0884345   .0222855     3.97   0.000     .0447434    .1321257
                      61  |   .1044897   .0305666     3.42   0.001     .0445634     .164416
                      62  |   .0802998    .024233     3.31   0.001     .0327906     .127809
                      63  |     .11781   .0232785     5.06   0.000     .0721722    .1634478
                      64  |   .0675812   .0288246     2.34   0.019     .0110701    .1240922
                      67  |   .1105102     .02418     4.57   0.000     .0631051    .1579153
                      70  |   .0997014   .0240329     4.15   0.000     .0525846    .1468182
                      72  |    .009463   .0574141     0.16   0.869    -.1030981    .1220241
                      73  |   .1141819   .0250713     4.55   0.000     .0650292    .1633345
                      75  |   .0222996   .0664288     0.34   0.737     -.107935    .1525343
                      78  |   .0991467   .0343885     2.88   0.004     .0317275    .1665658
                      79  |   .1238005   .0435254     2.84   0.004     .0384684    .2091326
                      80  |   .0619407   .0309974     2.00   0.046       .00117    .1227115
                      82  |   .1071629   .0286018     3.75   0.000     .0510887     .163237
                      83  |   .0866183   .0333891     2.59   0.010     .0211584    .1520781
                      87  |   .0763192    .027526     2.77   0.006      .022354    .1302843
                      99  |   .1104999   .0311492     3.55   0.000     .0494316    .1715682
                          |
                    fyear |
                    2001  |   .0381889   .0109164     3.50   0.000      .016787    .0595907
                    2002  |   .0922355   .0113675     8.11   0.000     .0699493    .1145216
                    2003  |   .0217092   .0085836     2.53   0.011     .0048809    .0385374
                    2004  |   .0221656   .0085377     2.60   0.009     .0054272     .038904
                    2005  |          0  (omitted)
                          |
                    _cons |  -.1553799   .0222875    -6.97   0.000    -.1990748   -.1116849
    ---------------------------------------------------------------------------------------
    
    . 
    I assume this is due to the collinearity of POST and fyear:

    Code:
    pwcorr POST fyear, star(.01)
    HTML Code:
    . pwcorr POST fyear, star(.01)
    
                 |     POST    fyear
    -------------+------------------
            POST |   1.0000 
           fyear |   0.8764*  1.0000 
    
    . 
    Code:
    reg POST UCOMP D_RET_win D_ROE_win D_logSALES_by2002_win i.sic_Comp_2d i.fyear, vce(r)
    HTML Code:
    . reg POST UCOMP D_RET_win D_ROE_win D_logSALES_by2002_win i.sic_Comp_2d i.fyear, vce(r)
    
    Linear regression                               Number of obs     =      4,387
                                                    F(0, 4318)        =          .
                                                    Prob > F          =          .
                                                    R-squared         =     1.0000
                                                    Root MSE          =          0
    
    ---------------------------------------------------------------------------------------
                          |               Robust
                     POST |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ----------------------+----------------------------------------------------------------
                    UCOMP |   1.85e-16   2.73e-16     0.68   0.498    -3.51e-16    7.21e-16
                D_RET_win |  -6.49e-16   1.76e-16    -3.69   0.000    -9.94e-16   -3.04e-16
                D_ROE_win |  -7.83e-15   8.26e-16    -9.47   0.000    -9.45e-15   -6.21e-15
    D_logSALES_by2002_win |   2.34e-15   5.26e-16     4.45   0.000     1.31e-15    3.37e-15
                          |
              sic_Comp_2d |
                      10  |  -2.79e-13   1.94e-13    -1.44   0.151    -6.60e-13    1.02e-13
                      13  |  -2.80e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.01e-13
                      14  |  -2.80e-13   1.94e-13    -1.44   0.149    -6.61e-13    1.01e-13
                      15  |  -2.80e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.01e-13
                      16  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      20  |  -2.80e-13   1.94e-13    -1.44   0.149    -6.61e-13    1.01e-13
                      21  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      22  |  -2.79e-13   1.94e-13    -1.43   0.152    -6.60e-13    1.02e-13
                      23  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      24  |  -2.81e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.00e-13
                      25  |  -2.80e-13   1.94e-13    -1.44   0.149    -6.61e-13    1.01e-13
                      26  |  -2.81e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.01e-13
                      27  |  -2.80e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.01e-13
                      28  |  -2.81e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.00e-13
                      29  |  -2.81e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.01e-13
                      30  |  -2.81e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.01e-13
                      31  |  -2.82e-13   1.94e-13    -1.45   0.147    -6.63e-13    9.94e-14
                      32  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      33  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      34  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      35  |  -2.81e-13   1.94e-13    -1.45   0.148    -6.63e-13    9.97e-14
                      36  |  -2.81e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.01e-13
                      37  |  -2.81e-13   1.94e-13    -1.45   0.148    -6.62e-13    1.00e-13
                      38  |  -2.82e-13   1.94e-13    -1.45   0.147    -6.63e-13    9.95e-14
                      39  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      40  |  -2.79e-13   1.94e-13    -1.44   0.151    -6.60e-13    1.02e-13
                      42  |  -2.80e-13   1.94e-13    -1.44   0.149    -6.61e-13    1.01e-13
                      44  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      45  |  -2.80e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.01e-13
                      47  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      48  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      49  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      50  |  -2.80e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.01e-13
                      51  |  -2.81e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.00e-13
                      52  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      53  |  -2.81e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.01e-13
                      54  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      55  |  -2.79e-13   1.94e-13    -1.44   0.151    -6.60e-13    1.02e-13
                      56  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      57  |  -2.81e-13   1.94e-13    -1.45   0.148    -6.62e-13    1.00e-13
                      58  |  -2.80e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.01e-13
                      59  |  -2.80e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.01e-13
                      60  |  -2.82e-13   1.94e-13    -1.45   0.147    -6.63e-13    9.95e-14
                      61  |  -2.81e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.00e-13
                      62  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      63  |  -2.81e-13   1.94e-13    -1.45   0.148    -6.63e-13    9.96e-14
                      64  |  -2.81e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.01e-13
                      67  |  -2.81e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.00e-13
                      70  |  -2.81e-13   1.94e-13    -1.45   0.148    -6.62e-13    1.00e-13
                      72  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      73  |  -2.81e-13   1.94e-13    -1.45   0.148    -6.62e-13    1.00e-13
                      75  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      78  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      79  |  -2.78e-13   1.94e-13    -1.43   0.152    -6.59e-13    1.03e-13
                      80  |  -2.80e-13   1.94e-13    -1.44   0.150    -6.61e-13    1.01e-13
                      82  |  -2.81e-13   1.94e-13    -1.45   0.148    -6.62e-13    1.00e-13
                      83  |  -2.81e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.00e-13
                      87  |  -2.81e-13   1.94e-13    -1.44   0.149    -6.62e-13    1.00e-13
                      99  |  -2.79e-13   1.94e-13    -1.43   0.152    -6.60e-13    1.02e-13
                          |
                    fyear |
                    2001  |   1.38e-14   5.47e-16    25.19   0.000     1.27e-14    1.48e-14
                    2002  |   1.48e-14   5.63e-16    26.35   0.000     1.37e-14    1.59e-14
                    2003  |          1   5.73e-16  1.7e+15   0.000            1           1
                    2004  |          1   5.54e-16  1.8e+15   0.000            1           1
                    2005  |          1   7.22e-16  1.4e+15   0.000            1           1
                          |
                    _cons |   2.68e-13   1.94e-13     1.38   0.168    -1.13e-13    6.49e-13
    ---------------------------------------------------------------------------------------
    
    . 
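
    Indeed, since POST was defined directly from fyear (1 for 2003-2005, 0 for 2000-2002), it is an exact linear combination of the 2003-2005 year dummies, so one of them has to be dropped. A minimal sketch of a direct check:

    Code:
    * POST should equal 1 exactly for fiscal years 2003-2005 and 0 otherwise
    assert POST == inlist(fyear, 2003, 2004, 2005)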


    I am afraid I am now revealing an extraordinary lack of Stata and general statistical literacy, but: is there anything I can do about this? If yes, what? Please guide me through the steps. If no – and this is obviously key to me – are the current results of any use, or just elaborate waste?

    For your help I thank you very much in advance!



    Best regards,
    Roman


  • #2
    Roman:
    I'm not clear about the kind of help you're seeking.
    Anyway, some remarks about your models follow below:
    - in -xtreg, fe- the omission of -i.sic_Comp_2d- is due to collinearity with the fixed effect;
    - in your first -regress- model, the omission of -2005.fyear- due to collinearity is of no concern. However, you should have used clustered standard errors, as you have non-independent observations (i.e. multiple observations for the same id). Please note that, unlike with -xtreg-, with -regress- robustified and clustered standard errors accomplish different jobs (a brief sketch contrasting the two follows below);
    - I find it difficult to follow what you're after in your last -regress- model.
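
    A minimal, purely illustrative sketch of the contrast, using your own specification:

    Code:
    * robustified SEs: heteroskedasticity-consistent, but treat observations as independent
    reg D_ROE_lead1_win c.UCOMP##i.POST D_RET_win D_ROE_win D_logSALES_by2002_win i.sic_Comp_2d i.fyear, vce(robust)
    * clustered SEs: additionally allow arbitrary correlation across years within the same firm (gvkey)
    reg D_ROE_lead1_win c.UCOMP##i.POST D_RET_win D_ROE_win D_logSALES_by2002_win i.sic_Comp_2d i.fyear, vce(cluster gvkey)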
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Dear Carlo,

      Thank you very much indeed for your quick and comprehensive reply; it is much appreciated! Moreover, I hope you will excuse my late response. I was off-site immediately after sending my request and received your reply on the go. I was able to log in to Statalist via iOS (Safari) but unfortunately not able to comment or reply, hence the delay.

      Regarding my cry for help and your reply: you got it exactly right. I was seeking confirmation that
      (A) -xtreg, fe- causes a conflict/collinearity between i.sic_Comp_2d and the fixed effect (of the firm id, i.e. gvkey), and that
      (B) the second attempt – regress with factor variables included – does essentially the same thing but avoids the firm-id fixed effect and thus the collinearity.
      So your comment helped me a lot! Thank you for that!

      Moreover, you did point out that the robust option indeed differs for -xtreg,fe- and -regress-. This is very valuable information, as I didn't know that. VERY APPRECIATED!

      Here's the output again with clustered standard errors:
      Code:
      reg D_ROE_lead1_win c.UCOMP##i.POST D_RET_win D_ROE_win D_logSALES_by2002_win i.sic_Comp_2d i.fyear, vce(cl gvkey)
      HTML Code:
      . reg D_ROE_lead1_win c.UCOMP##i.POST D_RET_win D_ROE_win D_logSALES_by2002_win i.sic_Comp_2d i.fyear, vce(cl gvkey)
      note: 2005.fyear omitted because of collinearity
      
      Linear regression                               Number of obs     =      4,387
                                                      F(65, 945)        =          .
                                                      Prob > F          =          .
                                                      R-squared         =     0.1559
                                                      Root MSE          =     .18604
      
                                               (Std. Err. adjusted for 946 clusters in gvkey)
      ---------------------------------------------------------------------------------------
                            |               Robust
            D_ROE_lead1_win |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      ----------------------+----------------------------------------------------------------
                      UCOMP |   .0392188   .0132057     2.97   0.003      .013303    .0651347
                     1.POST |   .0662009   .0093619     7.07   0.000     .0478284    .0845734
                            |
               POST#c.UCOMP |
                         1  |   -.028908   .0152292    -1.90   0.058     -.058795    .0009789
                            |
                  D_RET_win |   .0384284   .0050571     7.60   0.000      .028504    .0483527
                  D_ROE_win |  -.3370546   .0359816    -9.37   0.000    -.4076677   -.2664415
      D_logSALES_by2002_win |  -.0193767   .0229734    -0.84   0.399    -.0644615    .0257081
                            |
                sic_Comp_2d |
                        10  |   .2436253   .0904851     2.69   0.007     .0660503    .4212003
                        13  |   .1139722   .0145393     7.84   0.000     .0854392    .1425053
                        14  |   .1044493   .0118118     8.84   0.000      .081269    .1276296
                        15  |   .0847084   .0136328     6.21   0.000     .0579543    .1114624
                        16  |   .0743151   .0172541     4.31   0.000     .0404543     .108176
                        20  |   .0680565     .01836     3.71   0.000     .0320254    .1040876
                        21  |   .0943653   .0118575     7.96   0.000     .0710952    .1176354
                        22  |   .1015676   .0142468     7.13   0.000     .0736085    .1295266
                        23  |   .0848705   .0135691     6.25   0.000     .0582414    .1114995
                        24  |   .0723909   .0323239     2.24   0.025     .0089559    .1358258
                        25  |   .0419734   .0383512     1.09   0.274    -.0332901    .1172368
                        26  |   .0652542   .0210145     3.11   0.002     .0240136    .1064948
                        27  |   .0824918   .0335675     2.46   0.014     .0166164    .1483672
                        28  |   .0929376   .0182045     5.11   0.000     .0572117    .1286634
                        29  |   .1157699   .0229797     5.04   0.000     .0706727    .1608671
                        30  |   .0748228   .0172444     4.34   0.000      .040981    .1086645
                        31  |   .0885334   .0139456     6.35   0.000     .0611655    .1159013
                        32  |   .0510022   .0312224     1.63   0.103    -.0102711    .1122755
                        33  |   .1298256   .0275608     4.71   0.000     .0757382    .1839131
                        34  |   .0865543   .0137955     6.27   0.000     .0594809    .1136276
                        35  |    .093948   .0138331     6.79   0.000     .0668008    .1210952
                        36  |   .0741927   .0144286     5.14   0.000     .0458768    .1025086
                        37  |   .0903146   .0152661     5.92   0.000     .0603553    .1202739
                        38  |    .082673   .0152364     5.43   0.000     .0527718    .1125742
                        39  |   .1010992   .0302843     3.34   0.001     .0416669    .1605316
                        40  |    .084477    .021442     3.94   0.000     .0423976    .1265564
                        42  |   .0976769   .0181016     5.40   0.000      .062153    .1332008
                        44  |   .1074021   .0169073     6.35   0.000     .0742219    .1405822
                        45  |   .1135235   .0228374     4.97   0.000     .0687055    .1583415
                        47  |   .0975938   .0190232     5.13   0.000     .0602611    .1349265
                        48  |   .0645261   .0184886     3.49   0.001     .0282426    .1008095
                        49  |   .0887147   .0138179     6.42   0.000     .0615973    .1158321
                        50  |   .1040087   .0135199     7.69   0.000     .0774763    .1305412
                        51  |   .0643401   .0198308     3.24   0.001     .0254227    .1032576
                        52  |    .105319    .013054     8.07   0.000     .0797009    .1309372
                        53  |   .0968133    .016446     5.89   0.000     .0645384    .1290883
                        54  |   .1008545    .015222     6.63   0.000     .0709816    .1307274
                        55  |   .1098666    .011989     9.16   0.000     .0863384    .1333948
                        56  |   .0860991   .0135458     6.36   0.000     .0595159    .1126824
                        57  |   .0505561   .0346045     1.46   0.144    -.0173544    .1184666
                        58  |   .0882977   .0149712     5.90   0.000      .058917    .1176783
                        59  |   .0826475      .0183     4.52   0.000     .0467342    .1185608
                        60  |   .0884345   .0122246     7.23   0.000      .064444    .1124251
                        61  |   .1044897   .0134743     7.75   0.000     .0780467    .1309327
                        62  |   .0802998   .0158915     5.05   0.000     .0491131    .1114865
                        63  |     .11781   .0155007     7.60   0.000     .0873903    .1482297
                        64  |   .0675812   .0179879     3.76   0.000     .0322804     .102882
                        67  |   .1105102   .0126424     8.74   0.000     .0856998    .1353206
                        70  |   .0997014    .013076     7.62   0.000     .0740401    .1253627
                        72  |    .009463   .0490606     0.19   0.847    -.0868173    .1057433
                        73  |   .1141819   .0153503     7.44   0.000     .0840572    .1443065
                        75  |   .0222996   .0447652     0.50   0.618    -.0655512    .1101504
                        78  |   .0991467     .01176     8.43   0.000     .0760679    .1222254
                        79  |   .1238005   .0169255     7.31   0.000     .0905845    .1570165
                        80  |   .0619407   .0282015     2.20   0.028     .0065959    .1172856
                        82  |   .1071629   .0215927     4.96   0.000     .0647877     .149538
                        83  |   .0866183   .0139365     6.22   0.000     .0592682    .1139684
                        87  |   .0763192   .0156051     4.89   0.000     .0456944    .1069439
                        99  |   .1104999   .0131672     8.39   0.000     .0846595    .1363403
                            |
                      fyear |
                      2001  |   .0381889   .0107217     3.56   0.000     .0171477      .05923
                      2002  |   .0922355     .01145     8.06   0.000     .0697651    .1147059
                      2003  |   .0217092    .008761     2.48   0.013     .0045158    .0389025
                      2004  |   .0221656   .0088331     2.51   0.012     .0048308    .0395004
                      2005  |          0  (omitted)
                            |
                      _cons |  -.1553799    .012632   -12.30   0.000    -.1801699   -.1305898
      ---------------------------------------------------------------------------------------
      
      . 
      Please allow me to ask some follow-up questions:
      • You pointed out that, when using -regress-, robustified standard errors do a different job than clustered ones. I do get that there is a difference between the -robust- option and the -cluster()- option and that, if I understood you correctly, -xtreg- applies these automatically. When using the cluster option of -regress-, as done above, Stata does however report "Robust" standard errors, right? At least it says so in the output. Or, since I want the SEs to be heteroskedasticity-consistent, should I combine the two options -cluster()- and -robust-, if that is actually possible?
      • In the Stata output of -regress- with clustered SEs above, no F statistic is reported. Is there any way to obtain a reliable F statistic for the model? (One idea I had is sketched after this list.)
      • You wrote that the omission of -2005.fyear- due to collinearity in the first -regress- model of my first post is of no concern. Besides that being really good news (I need the results for my Master's thesis and would not know where to start over if they turned out to be useless), the practical question arises whether the omission and/or the underlying collinearity should be reported alongside the regression results – if it really is of no concern at all?
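
      (Regarding the missing F statistic: a minimal sketch of one workaround I was considering – joint Wald tests on subsets of coefficients after the clustered regression; I am not sure whether this is the right approach:)

      Code:
      * run after the clustered regression above
      testparm i.sic_Comp_2d
      testparm i.fyear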
      Thank you very much in advance!


      Kind regards,
      Roman




      PS: Please ignore the last -regress- model of my first post. Shouldn't have been posted.









      • #4
        Roman:
        - first off, one general remark: whenever you're dealing with a panel dataset with a continuous dependent variable and you intend to go -fe- (by the way, does the -hausman- test confirm that choice vs -re-?), it is far better to use -xtreg, fe- than -regress-. However, if the F-test appearing as a footnote of the -xtreg, fe- outcome table (run with default standard errors) lacks statistical significance, pooled OLS (clustered standard errors mandatory) outperforms -xtreg, fe-;
        - robustified and clustered standard errors in -xtreg- do the same job, but you should invoke one option or the other explicitly; that is, -xtreg- does not automatically correct for heteroskedasticity and/or autocorrelation (please note that for a large N, small T panel dataset, such as the ones usually analyzed via -xtreg-, autocorrelation is a minor concern);
        - in -regress- there's no way to correct for both heteroskedasticity and autocorrelation. However, if you have panel data and go -regress-, you should -cluster- your standard errors, since you do not have independent observations;
        - if the F-test value after -regress- with clustered standard errors does not appear, you can click on the hyperlink that appears in blue;
        - for your last query there's usually little you can do other than look for a different specification.
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Dear Carlo,

          Again, thank you very much for your reply and help!

          I reply and follow-up on your remarks one after another:

          - first off, one general remark: whenever you're dealing with a panel dataset with a continuous dependent variable and you intend to go -fe- (by the way, does the -hausman- test confirm that choice vs -re-?), it is far better to use -xtreg, fe- than -regress-. However, if the F-test appearing as a footnote of the -xtreg, fe- outcome table (run with default standard errors) lacks statistical significance, pooled OLS (clustered standard errors mandatory) outperforms -xtreg, fe-;
          Good to know! That wasn't clear to me either. I want to introduce industry and year fixed effects, as this is important for my analysis. The reason for going -fe- was a Hausman test (I hope I did it correctly: I only included the industry factor variables (-i.sic_Comp_2d-), since -i.fyear- should be covered by the time variable of -xtreg-), which suggests -fe- in my case. Furthermore, the F-test appearing as a footnote of -xtreg, fe- very much lacks statistical significance indeed. According to the knowledge you have been kind enough to share, this then suggests pooled OLS with clustered standard errors.

          HTML Code:
          . xtreg D_ROE_lead1_win c.UCOMP##i.POST D_RET_win D_ROE_win D_logSALES_by2002_win i.sic_Comp_2d, fe
          note: 10.sic_Comp_2d omitted because of collinearity
          note: 13.sic_Comp_2d omitted because of collinearity
          note: 14.sic_Comp_2d omitted because of collinearity
          note: 15.sic_Comp_2d omitted because of collinearity
          note: 16.sic_Comp_2d omitted because of collinearity
          note: 20.sic_Comp_2d omitted because of collinearity
          note: 21.sic_Comp_2d omitted because of collinearity
          note: 22.sic_Comp_2d omitted because of collinearity
          note: 23.sic_Comp_2d omitted because of collinearity
          note: 24.sic_Comp_2d omitted because of collinearity
          note: 25.sic_Comp_2d omitted because of collinearity
          note: 26.sic_Comp_2d omitted because of collinearity
          note: 27.sic_Comp_2d omitted because of collinearity
          note: 28.sic_Comp_2d omitted because of collinearity
          note: 29.sic_Comp_2d omitted because of collinearity
          note: 30.sic_Comp_2d omitted because of collinearity
          note: 31.sic_Comp_2d omitted because of collinearity
          note: 32.sic_Comp_2d omitted because of collinearity
          note: 33.sic_Comp_2d omitted because of collinearity
          note: 34.sic_Comp_2d omitted because of collinearity
          note: 35.sic_Comp_2d omitted because of collinearity
          note: 36.sic_Comp_2d omitted because of collinearity
          note: 37.sic_Comp_2d omitted because of collinearity
          note: 38.sic_Comp_2d omitted because of collinearity
          note: 39.sic_Comp_2d omitted because of collinearity
          note: 40.sic_Comp_2d omitted because of collinearity
          note: 42.sic_Comp_2d omitted because of collinearity
          note: 44.sic_Comp_2d omitted because of collinearity
          note: 45.sic_Comp_2d omitted because of collinearity
          note: 47.sic_Comp_2d omitted because of collinearity
          note: 48.sic_Comp_2d omitted because of collinearity
          note: 49.sic_Comp_2d omitted because of collinearity
          note: 50.sic_Comp_2d omitted because of collinearity
          note: 51.sic_Comp_2d omitted because of collinearity
          note: 52.sic_Comp_2d omitted because of collinearity
          note: 53.sic_Comp_2d omitted because of collinearity
          note: 54.sic_Comp_2d omitted because of collinearity
          note: 55.sic_Comp_2d omitted because of collinearity
          note: 56.sic_Comp_2d omitted because of collinearity
          note: 57.sic_Comp_2d omitted because of collinearity
          note: 58.sic_Comp_2d omitted because of collinearity
          note: 59.sic_Comp_2d omitted because of collinearity
          note: 60.sic_Comp_2d omitted because of collinearity
          note: 61.sic_Comp_2d omitted because of collinearity
          note: 62.sic_Comp_2d omitted because of collinearity
          note: 63.sic_Comp_2d omitted because of collinearity
          note: 64.sic_Comp_2d omitted because of collinearity
          note: 67.sic_Comp_2d omitted because of collinearity
          note: 70.sic_Comp_2d omitted because of collinearity
          note: 72.sic_Comp_2d omitted because of collinearity
          note: 73.sic_Comp_2d omitted because of collinearity
          note: 75.sic_Comp_2d omitted because of collinearity
          note: 78.sic_Comp_2d omitted because of collinearity
          note: 79.sic_Comp_2d omitted because of collinearity
          note: 80.sic_Comp_2d omitted because of collinearity
          note: 82.sic_Comp_2d omitted because of collinearity
          note: 83.sic_Comp_2d omitted because of collinearity
          note: 87.sic_Comp_2d omitted because of collinearity
          note: 99.sic_Comp_2d omitted because of collinearity
          
          Fixed-effects (within) regression               Number of obs     =      4,387
          Group variable: gvkey                           Number of groups  =        946
          
          R-sq:                                           Obs per group:
               within  = 0.1543                                         min =          1
               between = 0.0841                                         avg =        4.6
               overall = 0.1266                                         max =          6
          
                                                          F(6,3435)         =     104.43
          corr(u_i, Xb)  = -0.1115                        Prob > F          =     0.0000
          
          ---------------------------------------------------------------------------------------
                D_ROE_lead1_win |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          ----------------------+----------------------------------------------------------------
                          UCOMP |   .0384001   .0106273     3.61   0.000     .0175636    .0592365
                         1.POST |   .0384032   .0063347     6.06   0.000      .025983    .0508234
                                |
                   POST#c.UCOMP |
                             1  |  -.0362169   .0165603    -2.19   0.029     -.068686   -.0037478
                                |
                      D_RET_win |   .0318201   .0048044     6.62   0.000     .0224003      .04124
                      D_ROE_win |  -.3665595   .0160888   -22.78   0.000     -.398104   -.3350149
          D_logSALES_by2002_win |  -.0649946   .0168073    -3.87   0.000     -.097948   -.0320412
                                |
                    sic_Comp_2d |
                            10  |          0  (omitted)
                            13  |          0  (omitted)
                            14  |          0  (omitted)
                            15  |          0  (omitted)
                            16  |          0  (omitted)
                            20  |          0  (omitted)
                            21  |          0  (omitted)
                            22  |          0  (omitted)
                            23  |          0  (omitted)
                            24  |          0  (omitted)
                            25  |          0  (omitted)
                            26  |          0  (omitted)
                            27  |          0  (omitted)
                            28  |          0  (omitted)
                            29  |          0  (omitted)
                            30  |          0  (omitted)
                            31  |          0  (omitted)
                            32  |          0  (omitted)
                            33  |          0  (omitted)
                            34  |          0  (omitted)
                            35  |          0  (omitted)
                            36  |          0  (omitted)
                            37  |          0  (omitted)
                            38  |          0  (omitted)
                            39  |          0  (omitted)
                            40  |          0  (omitted)
                            42  |          0  (omitted)
                            44  |          0  (omitted)
                            45  |          0  (omitted)
                            47  |          0  (omitted)
                            48  |          0  (omitted)
                            49  |          0  (omitted)
                            50  |          0  (omitted)
                            51  |          0  (omitted)
                            52  |          0  (omitted)
                            53  |          0  (omitted)
                            54  |          0  (omitted)
                            55  |          0  (omitted)
                            56  |          0  (omitted)
                            57  |          0  (omitted)
                            58  |          0  (omitted)
                            59  |          0  (omitted)
                            60  |          0  (omitted)
                            61  |          0  (omitted)
                            62  |          0  (omitted)
                            63  |          0  (omitted)
                            64  |          0  (omitted)
                            67  |          0  (omitted)
                            70  |          0  (omitted)
                            72  |          0  (omitted)
                            73  |          0  (omitted)
                            75  |          0  (omitted)
                            78  |          0  (omitted)
                            79  |          0  (omitted)
                            80  |          0  (omitted)
                            82  |          0  (omitted)
                            83  |          0  (omitted)
                            87  |          0  (omitted)
                            99  |          0  (omitted)
                                |
                          _cons |  -.0191775    .004438    -4.32   0.000    -.0278789    -.010476
          ----------------------+----------------------------------------------------------------
                        sigma_u |   .0848593
                        sigma_e |  .19574748
                            rho |  .15820274   (fraction of variance due to u_i)
          ---------------------------------------------------------------------------------------
          F test that all u_i=0: F(945, 3435) = 0.63                   Prob > F = 1.0000
          
          . estimates store fixed
          
          . xtreg D_ROE_lead1_win c.UCOMP##i.POST D_RET_win D_ROE_win D_logSALES_by2002_win i.sic_Comp_2d, re
          
          Random-effects GLS regression                   Number of obs     =      4,387
          Group variable: gvkey                           Number of groups  =        946
          
          R-sq:                                           Obs per group:
               within  = 0.1534                                         min =          1
               between = 0.1589                                         avg =        4.6
               overall = 0.1381                                         max =          6
          
                                                          Wald chi2(65)     =     692.24
          corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
          
          ---------------------------------------------------------------------------------------
                D_ROE_lead1_win |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          ----------------------+----------------------------------------------------------------
                          UCOMP |   .0391397   .0095312     4.11   0.000     .0204588    .0578205
                         1.POST |   .0387399   .0057981     6.68   0.000     .0273758     .050104
                                |
                   POST#c.UCOMP |
                             1  |  -.0287319   .0142915    -2.01   0.044    -.0567427   -.0007211
                                |
                      D_RET_win |   .0341826   .0044496     7.68   0.000     .0254615    .0429036
                      D_ROE_win |  -.3303528   .0143712   -22.99   0.000    -.3585199   -.3021857
          D_logSALES_by2002_win |  -.0488908   .0142585    -3.43   0.001     -.076837   -.0209447
                                |
                    sic_Comp_2d |
                            10  |   .2721005   .1421165     1.91   0.056    -.0064428    .5506438
                            13  |   .1461873   .1339151     1.09   0.275    -.1162815     .408656
                            14  |   .1331461      .1436     0.93   0.354    -.1483047     .414597
                            15  |   .1184753   .1358495     0.87   0.383    -.1477848    .3847353
                            16  |   .1054241   .1421642     0.74   0.458    -.1732126    .3840608
                            20  |    .096901   .1338726     0.72   0.469    -.1654845    .3592865
                            21  |   .1171754   .1456377     0.80   0.421    -.1682693    .4026201
                            22  |   .1193198   .1436028     0.83   0.406    -.1621366    .4007762
                            23  |   .1130113   .1361809     0.83   0.407    -.1538983    .3799209
                            24  |   .1015534   .1368475     0.74   0.458    -.1666627    .3697695
                            25  |   .0691632   .1362035     0.51   0.612    -.1977907    .3361171
                            26  |   .0950923   .1346076     0.71   0.480    -.1687338    .3589183
                            27  |   .1104639   .1349296     0.82   0.413    -.1539932    .3749211
                            28  |   .1217864   .1333804     0.91   0.361    -.1396343    .3832071
                            29  |   .1467443    .137611     1.07   0.286    -.1229683    .4164569
                            30  |   .1035668   .1359101     0.76   0.446    -.1628122    .3699458
                            31  |   .1227387   .1381892     0.89   0.374    -.1481071    .3935846
                            32  |   .0797544   .1376528     0.58   0.562    -.1900401    .3495488
                            33  |   .1599763   .1346461     1.19   0.235    -.1039252    .4238778
                            34  |   .1142864   .1349216     0.85   0.397     -.150155    .3787277
                            35  |   .1243864   .1334265     0.93   0.351    -.1371247    .3858976
                            36  |   .1020661   .1333878     0.77   0.444    -.1593692    .3635013
                            37  |   .1207218   .1339696     0.90   0.368    -.1418538    .3832974
                            38  |   .1149412    .133723     0.86   0.390     -.147151    .3770334
                            39  |    .127896   .1378308     0.93   0.353    -.1422475    .3980395
                            40  |   .1078956   .1383854     0.78   0.436    -.1633347     .379126
                            42  |   .1286032   .1361025     0.94   0.345    -.1381528    .3953593
                            44  |   .1374025   .1386262     0.99   0.322    -.1342999    .4091049
                            45  |   .1448689   .1384023     1.05   0.295    -.1263946    .4161324
                            47  |   .1238148   .1415603     0.87   0.382    -.1536384    .4012679
                            48  |   .0927969   .1347112     0.69   0.491    -.1712322    .3568261
                            49  |    .118587   .1333276     0.89   0.374    -.1427304    .3799043
                            50  |   .1348244   .1343344     1.00   0.316    -.1284663     .398115
                            51  |   .0960727   .1367427     0.70   0.482     -.171938    .3640835
                            52  |    .134633   .1397755     0.96   0.335     -.139322     .408588
                            53  |   .1279287   .1353409     0.95   0.345    -.1373346     .393192
                            54  |   .1291194   .1365165     0.95   0.344     -.138448    .3966869
                            55  |   .1350973   .1428312     0.95   0.344    -.1448468    .4150413
                            56  |   .1145981   .1347339     0.85   0.395    -.1494755    .3786717
                            57  |   .0809662   .1386333     0.58   0.559    -.1907501    .3526826
                            58  |   .1205281   .1344315     0.90   0.370    -.1429529     .384009
                            59  |   .1122717   .1350097     0.83   0.406    -.1523424    .3768858
                            60  |   .1165473   .1334945     0.87   0.383    -.1450971    .3781918
                            61  |   .1364377     .13797     0.99   0.323    -.1339784    .4068538
                            62  |   .1088646   .1346962     0.81   0.419    -.1551351    .3728643
                            63  |   .1499514   .1337062     1.12   0.262    -.1121079    .4120107
                            64  |   .1007712   .1384277     0.73   0.467    -.1705421    .3720846
                            67  |   .1357495   .1469927     0.92   0.356     -.152351    .4238499
                            70  |   .1323214   .1410163     0.94   0.348    -.1440656    .4087083
                            72  |   .0378711    .139473     0.27   0.786    -.2354909     .311233
                            73  |   .1430759   .1333791     1.07   0.283    -.1183424    .4044942
                            75  |   .0493117   .1415494     0.35   0.728    -.2281201    .3267435
                            78  |   .1249982   .1535171     0.81   0.416    -.1758898    .4258862
                            79  |    .141444   .1486306     0.95   0.341    -.1498666    .4327547
                            80  |   .0896478   .1353204     0.66   0.508    -.1755752    .3548709
                            82  |   .1409596   .1421393     0.99   0.321    -.1376283    .4195476
                            83  |   .1234516   .1535728     0.80   0.421    -.1775454    .4244487
                            87  |   .1095456   .1359661     0.81   0.420    -.1569431    .3760344
                            99  |   .1229992   .1628695     0.76   0.450     -.196219    .4422175
                                |
                          _cons |  -.1400406   .1329942    -1.05   0.292    -.4007044    .1206232
          ----------------------+----------------------------------------------------------------
                        sigma_u |          0
                        sigma_e |  .19574748
                            rho |          0   (fraction of variance due to u_i)
          ---------------------------------------------------------------------------------------
          
          . estimates store random
          
          . hausman fixed random
          
                           ---- Coefficients ----
                       |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
                       |     fixed        random       Difference          S.E.
          -------------+----------------------------------------------------------------
                 UCOMP |    .0384001     .0391397       -.0007396        .0047006
                1.POST |    .0384032     .0387399       -.0003366        .0025516
          POST#c.UCOMP |
                    1  |   -.0362169    -.0287319       -.0074849        .0083665
             D_RET_win |    .0318201     .0341826       -.0023624        .0018121
             D_ROE_win |   -.3665595    -.3303528       -.0362066         .007233
          D_logSALES.. |   -.0649946    -.0488908       -.0161038        .0088984
          ------------------------------------------------------------------------------
                                     b = consistent under Ho and Ha; obtained from xtreg
                      B = inconsistent under Ha, efficient under Ho; obtained from xtreg
          
              Test:  Ho:  difference in coefficients not systematic
          
                            chi2(6) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                                    =       38.04
                          Prob>chi2 =      0.0000
          
          . 


           - robustified or clustered standard errors in -xtreg- do the same job, but you should invoke one option or the other explicitly (that is, -xtreg- does not automatically correct for heteroskedasticity and/or autocorrelation); please note that for a large-N, small-T panel dataset, as usually analysed via -xtreg-, autocorrelation is a minor concern;
           - in -regress- there's no way to correct for both heteroskedasticity and autocorrelation. However, if you have panel data and go -regress-, you should -cluster- your standard errors, since you do not have independent observations;
           This leaves me with a mixed bag, doesn't it? You point out that heteroskedasticity is generally a bigger concern than autocorrelation when N is large and T is small (or does this only apply when using -xtreg-?), which should be the case for my model. -xtreg, fe- is hardly workable for my analysis because the industry fixed effects are omitted. Moreover, as I have just learned from the first part of your last reply, -xtreg- seems suboptimal compared to -regress-, since the F-test at the end of -xtreg, fe- lacks statistical significance. Using -regress- then, as you point out, calls for the -cluster- option, which would mean that I do not control for heteroskedasticity, right? Is this still optimal? Or can I run a test to sort out the threat of heteroskedasticity?
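           For instance, would checks along the following lines be sensible? This is only a sketch of my specification; -estat hettest- is the Breusch-Pagan/Cook-Weisberg test, -estat imtest, white- is White's test, and both need a plain -regress- run (without robust or clustered standard errors) first.
           Code:
           * sketch only: heteroskedasticity checks after the pooled OLS specification
           * (run -regress- without vce() first, since these tests use the conventional residuals)
           regress D_ROE_lead1_win c.UCOMP##i.POST D_RET_win D_ROE_win D_logSALES_by2002_win ///
               i.sic_Comp_2d i.fyear
           estat hettest         // Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
           estat imtest, white   // White's general test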


          - if the F-test value after -regress- with clustered standard errors does not appear, you can click on the hyperlink that appears in blue;
           I did that and identified one cluster holding two observations, one of which was nonzero. I dropped the observations relating to this cluster and ran the regression again; there is still no F-test value reported. Did I miss something? Is there anything else I can do to identify observations that might cause the missing F statistic? The reported degrees of freedom suggest there is no problem, because the number of clusters minus 1 is far higher than the number of constraints -F(65, 944)-.
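           For reference, this is roughly how I went looking for tiny clusters; it is only a sketch, and clustervar is a placeholder for whatever variable is used in -vce(cluster ...)-.
           Code:
           * sketch: tabulate cluster sizes within the estimation sample to spot tiny clusters
           * (clustervar is a placeholder for the actual cluster variable)
           gen byte insample = e(sample)
           bysort clustervar: egen cluster_size = total(insample)
           tab cluster_size if insample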

           - for your last query, there's usually little you can do but look for a different specification.
           Unfortunately, I do not really follow what you are implying. Did I understand you correctly before, that my first -regress- model produces useful results despite -2005.fyear- being omitted due to collinearity? I thought this meant that I could use the results (with the -cl- option instead of -r-), as you stated that the omitted -2005.fyear- is of no concern. In your last comment, however, you wrote that there is little I can do but look for a different specification. What does this mean?

          THANK YOU VERY MUCH IN ADVANCE!


          Kind regards,
          Roman







          Comment


          • #6
            Excuse me if this is redundant, but the question regarding the exclusion of industry fixed effects while firm fixed effects are present raises basic issues about regression models. Perhaps this is obvious, but my reading of the thread suggests that it isn't.

            Recall that for any variable to be present and estimated in a model, it needs to be distinct from the other variables. In short, it needs to contain information that is not already contained in any of the other variables (or combination thereof, but we'll disregard that for now). Essentially, you need to make sure that no variable is completely determined by any other set of variables. Assume, for example, that you have a survey of individuals. Each individual is either married or single (married = 1 for married) and either has kids or doesn't (kids = 1 for has kids). Further assume that in your sample all married individuals have kids and all single individuals do not. So, while you would have loved to estimate both the effect of kids and the effect of marital status on your outcome variable (for example, income), you cannot. Why? Because if you know that an individual is married, you know that he has kids. Thus "kids" contains no additional information not already present in the "marital status" variable, and the same logic works in the opposite direction: "marital status" contains no information not already present in the "kids" variable.
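            You can see this mechanically in Stata with fabricated data; the following is only a toy sketch (not your dataset), and one of the two perfectly overlapping dummies is dropped automatically:
            Code:
            * toy sketch with fabricated data: married and kids coincide perfectly
            clear
            set obs 100
            set seed 12345
            gen byte married = _n > 50
            gen byte kids    = married                   // every married person has kids, and vice versa
            gen income = 1000 + 200*married + rnormal(0, 50)
            regress income i.married i.kids              // one of the two dummies is omitted because of collinearity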

            Back to your specific model and question: firms are nested within industries. Once you know which firm we're talking about (e.g., StataCorp), you also know which industry it's in (e.g., software). Thus the "industry" variable contains no information not already present in the "firm" variable. This means you cannot estimate or "account for" both; you must choose to account either for industry or for firm.
            Also note that the situation here is different from my married-kids example, where the relation is two-way. In your case, "industry" does not contain any information not already present in "firm", but "firm" does contain information not already present in "industry". This is also why a firm-FE model is, a priori, arguably better than an industry-FE model.
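            The same mechanism applies to nested firm and industry effects. Here is a minimal sketch using the built-in Grunfeld data with a made-up industry grouping, purely for illustration:
            Code:
            * sketch: industry dummies are collinear with firm fixed effects when firms never switch industry
            * (the industry variable below is fabricated for illustration)
            webuse grunfeld, clear
            gen byte industry = mod(company, 3)
            xtset company year
            xtreg invest mvalue kstock i.industry, fe    // every i.industry term is omitted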

            On a side note, since your data are structured as firm-year, firm FE probably give a "better model" than industry FE, but that also depends on what exactly you're modelling. If, for example, you also wish to estimate the effects of variables that are time-invariant within firms (for example, whether they are public or private), this cannot be done with firm FE. You might then use a random-effects panel model with industry dummies, along the lines of the sketch below.
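            Using the variable names from your posts, such a model might look roughly like this; treat it only as an illustration of the syntax, not as a recommended specification:
            Code:
            * sketch: random effects at the firm level with industry and year dummies
            xtset gvkey fyear
            xtreg D_ROE_lead1_win c.UCOMP##i.POST D_RET_win D_ROE_win D_logSALES_by2002_win ///
                i.sic_Comp_2d i.fyear, re vce(cluster gvkey)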

            Hope this was clear.

            P.S. I always recommend this neat little presentation on panel models in Stata by Torres:
            https://www.princeton.edu/~otorres/Panel101.pdf
            Last edited by Ariel Karlinsky; 19 Mar 2017, 04:04.

            Comment


            • #7
              Roman:
              all the points made by Ariel do deserve attention.
              The results in your last post point towards pooled OLS (POLS): in both -xtreg- models there is no evidence of individual effects (see the sketch at the end of this post).
              You have quite a large number of observations, so heteroskedasticity won't be a real concern.
              You should apply -cluster-ed standard errors in -regress- as you're dealing with non-independent observations.
              The last point in my previous reply was about omission due to collinearity: whenever it occurs, there's no choice other than sniffing out the culprits and looking for another model specification.
              However, in your example, the omission of -2005.fyear- because of collinearity is of no concern: hence, in my opinion, you can present those results.
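              For clarity, the pooled OLS with clustered standard errors that I have in mind would look something like the sketch below (built on the variable names in your posts; adjust as needed):
              Code:
              * sketch: pooled OLS with industry and year dummies, standard errors clustered on the panel id
              regress D_ROE_lead1_win c.UCOMP##i.POST D_RET_win D_ROE_win D_logSALES_by2002_win ///
                  i.sic_Comp_2d i.fyear, vce(cluster gvkey)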
              Kind regards,
              Carlo
              (Stata 19.0)

              Comment


              • #8
                Dear Ariel, dear Carlo,

                Thank you very much for your comments!

                As Ariel keenly observed, I lack a solid fundamental statistics/econometrics education. While I have attended statistics courses that imparted a basic theoretical understanding, I was never introduced to practical applications, let alone to statistical software packages like Stata. So basically I am self-taught regarding what I am doing in my current research. I try to read as much as possible and to solve on my own the little obstacles that typically pop up when attempting to find academically valid solutions to a problem. This, however, sometimes leaves me confused, at least temporarily, which leads to me asking, well, basic questions. I am therefore very, very thankful when these get addressed in replies, as this ensures a steep learning curve on my side. So thank you for your extended comment, including what I found to be a very illustrative example, even if it may seem redundant to more tenured Statalisters and statisticians. At this point, let me also emphasise how amazed I am by the great support and sheer interest and courtesy of the people in this forum!!! In this spirit, Ariel, thank you very much for your example and the link to Torres' slides.

                In a general modelling sense, my problem is basically that I am interested in how CEO actions that are unobservable to company outsiders affect their compensation, i.e. in detecting the informational advantage that the board of directors has, and uses to compensate the company's executives, before it becomes observable in (then future) firm performance. More precisely, I am interested in the effect that the implementation of the SOX regulation might have on this relationship, which has been described in previous studies. So I am conducting research as an add-on, if you will, to existing findings. As I suspect, and as was assumed in previous studies, that industry-specific circumstances might affect the relationship in question, industry fixed effects are of primary importance, and I favour them over firm-specific effects for the sake of more generally valid results.

                So once year and industry fixed effects are required, practical problems arise. As pointed out in my last post, I ran a Hausman test to establish whether I need a fixed-effects model. Conducting a Hausman test, however, is not possible without the industry fixed effects being omitted due to collinearity (as Carlo already pointed out and as Ariel's example further explains: several firms are active in each industry). This leaves me with the more technical question of whether I can actually trust the results of the Hausman test, which suggests -fe-, at all. Anyhow, since I need fixed effects (for industry and year) in my model anyway, and as the slides by Princeton's Torres (which Ariel was kind enough to link) also suggest, one can use -reg- with factor variables instead of -xtreg- to introduce fixed effects into a regression. Carlo kindly pointed out that I need to use the -cluster- option instead of the -robust- one because of the non-independence of my observations (Carlo, thank you again very much for that information!!!) and mentioned that this would account for autocorrelation but not for heteroskedasticity, as the -robust- option would. That left me rather puzzled, as I did mean to control for heteroskedasticity, as has been done in the literature I build my research on, which is based on a far more extensive sample.

                I further looked into the notion that one cannot correct for both heteroskedasticity and autocorrelation when using -regress- and came across some sources that suggest otherwise, which leaves me confused (hopefully only temporarily again) about which information is actually correct. As Stata is continuously being developed, what I found might no longer be valid. Does anyone have any clue about that?

                Here is what I found:
                Stata's User's Guide (http://www.stata.com/manuals13/u20.pdf), in section "20.21.2 Correlated errors: cluster–robust standard errors" starting on page 52, states that
                The robust estimator of variance has one feature that the conventional estimator does not have: the ability to relax the assumption of independence of the observations.
                With my self-taught smattering of knowledge, I read that as: the -cluster- option is a variant of the -robust- option that relaxes the independence assumption.
                This interpretation is further fuelled by some older Statalist posts (http://www.statalist.org/forums/foru...nel-data-model), where a draft paper is also linked (http://fmwww.bc.edu/repec/bocode/x/xtscc_paper.pdf) suggesting (in the table on page 4) that the -cluster()- option used with -regress- as well as -xtreg- reports
                SE estimates [that] are robust to disturbances being heteroscedastic and autocorrelated
                So which is it? Is it possible to report heteroskedasticity-consistent standard errors when opting for -cluster()- in -reg-, or not? Or did I miss something here (changes across version updates, etc.)?
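                For context, if I read the linked draft correctly, its user-written -xtscc- command would be applied roughly as in the sketch below. This is purely a sketch: it assumes -xtscc- has been installed (ssc install xtscc), the lag choice is arbitrary, and I have left out the interaction and dummies in case factor-variable support differs across versions of the command.
                Code:
                * sketch: Driscoll-Kraay standard errors via the user-written -xtscc-
                * (install once with: ssc install xtscc; the lag() choice here is arbitrary)
                xtset gvkey fyear
                xtscc D_ROE_lead1_win UCOMP D_RET_win D_ROE_win D_logSALES_by2002_win, lag(2)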



                @ Carlo: Thank you very much for the clarification regarding the usefulness of my results!!! MUCH APPRECIATED!
                Further, thank you again for the new insights! I had a quick look at pooled OLS. It sounds as interesting as it is (practically) complicated compared to "ordinary" OLS. What is the reasoning for using POLS, and what exactly should be pooled? By POST, by years, or by industry?


                FOR YOUR HELP I THANK YOU VERY MUCH IN ADVANCE!


                Kind regards,
                Roman


                Comment


                • #9
                  Roman:
                  POLS is indeed pretty simple.
                  You should perform an OLS with standard errors clustered on -panelid- (i.e., the first variable included in -xtset-), since you do not have independent observations in a panel dataset.
                  As a closing remark, you seem to be overlooking that -robust- and -cluster-ed standard errors do the same job under -xtreg-, whereas that equivalence does not hold under -regress-, as you can see from the following example:
                  Code:
                  . use "http://www.stata-press.com/data/r14/nlswork.dta", clear
                  (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
                  
                  . reg ln_wage i.race, vce(robust)
                  
                  Linear regression                               Number of obs     =     28,534
                                                                  F(2, 28531)       =     276.49
                                                                  Prob > F          =     0.0000
                                                                  R-squared         =     0.0187
                                                                  Root MSE          =     .47363
                  
                  ------------------------------------------------------------------------------
                               |               Robust
                       ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                  -------------+----------------------------------------------------------------
                          race |
                        black  |  -.1427862   .0061721   -23.13   0.000    -.1548837   -.1306887
                        other  |    .080671   .0291848     2.76   0.006     .0234674    .1378747
                               |
                         _cons |   1.714338   .0033551   510.97   0.000     1.707762    1.720914
                  ------------------------------------------------------------------------------
                  
                  . reg ln_wage i.race, vce(cluster idcode )
                  
                  Linear regression                               Number of obs     =     28,534
                                                                  F(2, 4710)        =      58.69
                                                                  Prob > F          =     0.0000
                                                                  R-squared         =     0.0187
                                                                  Root MSE          =     .47363
                  
                                               (Std. Err. adjusted for 4,711 clusters in idcode)
                  ------------------------------------------------------------------------------
                               |               Robust
                       ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                  -------------+----------------------------------------------------------------
                          race |
                        black  |  -.1427862   .0133808   -10.67   0.000    -.1690188   -.1165536
                        other  |    .080671   .0647742     1.25   0.213    -.0463166    .2076587
                               |
                         _cons |   1.714338   .0071195   240.80   0.000     1.700381    1.728296
                  ------------------------------------------------------------------------------
                  
                  . xtreg ln_wage i.race, vce(cluster idcode )
                  
                  Random-effects GLS regression                   Number of obs     =     28,534
                  Group variable: idcode                          Number of groups  =      4,711
                  
                  R-sq:                                           Obs per group:
                       within  = 0.0000                                         min =          1
                       between = 0.0198                                         avg =        6.1
                       overall = 0.0186                                         max =         15
                  
                                                                  Wald chi2(2)      =     102.23
                  corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
                  
                                               (Std. Err. adjusted for 4,711 clusters in idcode)
                  ------------------------------------------------------------------------------
                               |               Robust
                       ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                  -------------+----------------------------------------------------------------
                          race |
                        black  |  -.1300382   .0131411    -9.90   0.000    -.1557943   -.1042821
                        other  |   .1011474   .0665033     1.52   0.128    -.0291967    .2314915
                               |
                         _cons |   1.691756   .0069814   242.32   0.000     1.678073    1.705439
                  -------------+----------------------------------------------------------------
                       sigma_u |  .38195681
                       sigma_e |  .32028665
                           rho |  .58714668   (fraction of variance due to u_i)
                  ------------------------------------------------------------------------------
                  
                  . xtreg ln_wage i.race, vce(robust)
                  
                  Random-effects GLS regression                   Number of obs     =     28,534
                  Group variable: idcode                          Number of groups  =      4,711
                  
                  R-sq:                                           Obs per group:
                       within  = 0.0000                                         min =          1
                       between = 0.0198                                         avg =        6.1
                       overall = 0.0186                                         max =         15
                  
                                                                  Wald chi2(2)      =     102.23
                  corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
                  
                                               (Std. Err. adjusted for 4,711 clusters in idcode)
                  ------------------------------------------------------------------------------
                               |               Robust
                       ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                  -------------+----------------------------------------------------------------
                          race |
                        black  |  -.1300382   .0131411    -9.90   0.000    -.1557943   -.1042821
                        other  |   .1011474   .0665033     1.52   0.128    -.0291967    .2314915
                               |
                         _cons |   1.691756   .0069814   242.32   0.000     1.678073    1.705439
                  -------------+----------------------------------------------------------------
                       sigma_u |  .38195681
                       sigma_e |  .32028665
                           rho |  .58714668   (fraction of variance due to u_i)
                  ------------------------------------------------------------------------------
                  
                  .
                  Kind regards,
                  Carlo
                  (Stata 19.0)

                  Comment


                  • #10
                    Dear Carlo,

                    Thank you very much again for your comprehensive help and the many insights!!!

                    I consulted some pertinent readings and realised that I may have mistaken the term "pooled OLS" for OLS based on data pooled beforehand.

                    Doing as you suggested, I'll base further testing on the first -regress- model posted above (the pooled OLS).

                    Again: thank you very much!


                    Kind regards,
                    Roman

                    Comment
