Industry/Year Fixed Effects (Panel Data)

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17711
#16

13 Oct 2020, 03:48

Wojciech:
as per Authors' explanation, it seems that they stick with pooled OLS, no matter the set of regressors they used/added.
Now it's up to you to go -xtreg,fe- (as I would do) or pooled -regress- (to mimic Authors' approach).

Kind regards,
Carlo
(Stata 19.0)
Comment
Wojciech Gulkowski

Join Date: Sep 2020

Posts: 22
#17

13 Oct 2020, 05:49

Thank you Carlo.

I plan to use the -xtreg,fe- specification wherever I can, that is, in specifications where I don't include industry fixed effects. However, in order to mimic the researchers' approach, which takes industry fixed effects into consideration, I should use the pooled OLS -regress- command and include i.industry_numeric, correct (and not -xtreg,re- as I mentioned in my previous post)? I cannot use both -xtreg,fe- and i.industry_numeric due to collinearity issues.
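
The collinearity point can be sketched on the -nlswork- dataset rather than on the data discussed here. As an assumption made only for illustration, -race- (which never changes within -idcode-) stands in for an industry code that is constant within a firm.

Code:

webuse nlswork, clear
xtset idcode year
* race is constant within idcode, so the within transformation removes it
xtreg ln_wage c.age##c.age i.race, fe vce(cluster idcode)
* the same dummies are estimable in pooled OLS
regress ln_wage c.age##c.age i.race, vce(cluster idcode)

In the -xtreg,fe- output the race levels should be flagged as omitted because of collinearity, whereas -regress- reports a coefficient for each of them.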
Comment
Wojciech Gulkowski

Join Date: Sep 2020

Posts: 22
#18

13 Oct 2020, 05:57

By the way, would the following two be equivalent to each other?

Code:

xtreg ln_cash lit_risk size lev mtb nwc rd growth cf cf_vol_5y industry_sigma acq capex ndi nei div, fe vce(cluster id)

and

Code:

regress ln_cash lit_risk_L1 size lev mtb nwc rd growth cf cf_vol_5y industry_sigma acq capex ndi nei div i.id, vce(cluster id)

The first is a fixed effects panel regression (-id- variable set to a firm level) and the second one pooled OLS with firm fixed effects. So are they equivalent?

Thank you.
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17711
#19

13 Oct 2020, 05:57

Wojciech:
if one of the aims of your research is to mimic what Authors did in their paper, you may want to consider pooled OLS, too (as they actually did).
That said, I recommend that you discuss each and every methodological choice with your mentor/teacher/professor/supervisor, just to avoid disappointing/embarrassing situations during the last mile of your research.

Kind regards,
Carlo
(Stata 19.0)
Comment

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17711

#20

13 Oct 2020, 06:03

Wojciech:
the two codes will yield equivalent results as far as the sample estimates of the shared regressors are concerned. Differences are expected for standard errors (and related stuff) and for the constant (but see: https://www.stata.com/support/faqs/s...fects-model/):

Code:

. use "https://www.stata-press.com/data/r16/nlswork.dta"
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. xtreg ln_wage c.age##c.age if idcode<=3, fe vce(cluster id)

Fixed-effects (within) regression               Number of obs     =         39
Group variable: idcode                          Number of groups  =          3

R-sq:                                           Obs per group:
     within  = 0.6382                                         min =         12
     between = 0.8744                                         avg =       13.0
     overall = 0.2765                                         max =         15

                                                F(2,2)            =       3.83
corr(u_i, Xb)  = -0.2473                        Prob > F          =     0.2070

                                 (Std. Err. adjusted for 3 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .2512762   .1007559     2.49   0.130    -.1822416     .684794
             |
 c.age#c.age |  -.0037603   .0015163    -2.48   0.131    -.0102844    .0027638
             |
       _cons |  -2.189815   1.575348    -1.39   0.299    -8.967992    4.588361
-------------+----------------------------------------------------------------
     sigma_u |  .31366066
     sigma_e |  .19867104
         rho |  .71367959   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. reg ln_wage c.age##c.age i.idcode if idcode<=3,vce(cluster id)

Linear regression                               Number of obs     =         39
                                                F(1, 2)           =          .
                                                Prob > F          =          .
                                                R-squared         =     0.7407
                                                Root MSE          =     .19867

                                 (Std. Err. adjusted for 3 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .2512762    .103677     2.42   0.136    -.1948099    .6973623
             |
 c.age#c.age |  -.0037603   .0015603    -2.41   0.138    -.0104736     .002953
             |
      idcode |
          2  |  -.4231615   .0288023   -14.69   0.005    -.5470877   -.2992353
          3  |  -.6126416   .0625166    -9.80   0.010    -.8816288   -.3436544
             |
       _cons |   -1.82398   1.588179    -1.15   0.370    -8.657361      5.0094
------------------------------------------------------------------------------

.
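
A third route, not used in the two commands above, is -areg-, which absorbs the panel dummies instead of listing them. The sketch below reuses the same dataset and subsample; it is offered only as an illustration, not as part of the original comparison.

Code:

use "https://www.stata-press.com/data/r16/nlswork.dta", clear
* absorb() sweeps out the idcode dummies without reporting them
areg ln_wage c.age##c.age if idcode<=3, absorb(idcode) vce(cluster idcode)

The slope coefficients should match those of -xtreg,fe- and of -regress- with i.idcode; standard errors can still differ because of the degrees-of-freedom adjustment each command applies.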

Kind regards,
Carlo
(Stata 19.0)

Comment

Wojciech Gulkowski

Join Date: Sep 2020

Posts: 22
#21

13 Oct 2020, 06:11

Ok so if they yield the same results in terms of coefficients I will go with both:

a) -regress- Pooled OLS with i.industry_numeric (where I look to include industry fixed effects)

and

b) -xtreg,fe- Fixed effects regression (where I look to include firm level fixed effects imposed by fe specification)

Thank you for all help Carlo.
Comment

Wojciech Gulkowski

Join Date: Sep 2020
Posts: 22

#22

13 Oct 2020, 06:24

Actually, I just checked the two specifications and for some reason they have different coefficients.

Code:

. xtreg ln_cash lit_risk size lev mtb nwc rd growth cf cf_vol_5y industry_sigma acq capex ndi nei div i.year, fe vce(cluster id)

Fixed-effects (within) regression               Number of obs     =      3,083
Group variable: id                              Number of groups  =        351

R-sq:                                           Obs per group:
     within  = 0.2696                                         min =          1
     between = 0.5484                                         avg =        8.8
     overall = 0.5164                                         max =          9

                                                F(23,350)         =      23.08
corr(u_i, Xb)  = 0.1569                         Prob > F          =     0.0000

                                     (Std. Err. adjusted for 351 clusters in id)
--------------------------------------------------------------------------------
               |               Robust
       ln_cash |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
      lit_risk |    .001018   .0005239     1.94   0.053    -.0000123    .0020484
          size |   .7397585   .0579766    12.76   0.000     .6257322    .8537848
           lev |  -.6847657   .2475307    -2.77   0.006      -1.1716    -.197931
           mtb |   .0549462   .0131649     4.17   0.000     .0290541    .0808384
           nwc |  -1.357777   .2877874    -4.72   0.000    -1.923787   -.7917672
            rd |  -.2639219   .1092408    -2.42   0.016    -.4787728   -.0490709
        growth |  -.2644579   .1528391    -1.73   0.084    -.5650565    .0361408
            cf |  -.0113674    .286946    -0.04   0.968    -.5757228    .5529879
     cf_vol_5y |   1.987609   .7245799     2.74   0.006     .5625311    3.412688
industry_sigma |   6.153141   2.241524     2.75   0.006     1.744591    10.56169
           acq |  -2.433709   .2453183    -9.92   0.000    -2.916193   -1.951226
         capex |  -3.205605   .8809773    -3.64   0.000     -4.93828    -1.47293
           ndi |   1.518168   .2765364     5.49   0.000     .9742856     2.06205
           nei |   .7939962   .2483272     3.20   0.002     .3055949    1.282398
           div |   .0180821   .0712251     0.25   0.800     -.122001    .1581652
               |
          year |
         2011  |  -.0442112   .0304748    -1.45   0.148     -.104148    .0157256
         2012  |   .0400003   .0355404     1.13   0.261    -.0298994       .1099
         2013  |   .0454666   .0405875     1.12   0.263    -.0343595    .1252927
         2014  |   .0194234   .0475414     0.41   0.683    -.0740794    .1129263
         2015  |  -.0739048   .0506218    -1.46   0.145    -.1734661    .0256564
         2016  |  -.0041762    .051068    -0.08   0.935     -.104615    .0962627
         2017  |  -.0587983   .0542472    -1.08   0.279    -.1654898    .0478933
         2018  |  -.2163473   .0594791    -3.64   0.000    -.3333286    -.099366
               |
         _cons |   .0836588   .5586551     0.15   0.881    -1.015084    1.182402
---------------+----------------------------------------------------------------
       sigma_u |   .9933031
       sigma_e |  .48230923
           rho |  .80921242   (fraction of variance due to u_i)
--------------------------------------------------------------------------------

and the pooled OLS:

Code:

. regress ln_cash lit_risk_L1 size lev mtb nwc rd growth cf cf_vol_5y industry_sigma acq capex ndi nei div i.id i.year,vce(cluster id)

Linear regression                               Number of obs     =      3,105
                                                F(22, 349)        =          .
                                                Prob > F          =          .
                                                R-squared         =     0.9091
                                                Root MSE          =     .49522

                                     (Std. Err. adjusted for 350 clusters in id)
--------------------------------------------------------------------------------
               |               Robust
       ln_cash |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
   lit_risk_L1 |    .001719   .0004113     4.18   0.000       .00091    .0025279
          size |   .7280128   .0733743     9.92   0.000     .5837013    .8723244
           lev |   -.756417    .290493    -2.60   0.010    -1.327754   -.1850799
           mtb |   .0481119    .013184     3.65   0.000     .0221819     .074042
           nwc |  -1.159551   .3119168    -3.72   0.000    -1.773025   -.5460783
            rd |  -.0306303   .1361729    -0.22   0.822     -.298453    .2371925
        growth |  -.3528492   .1621491    -2.18   0.030    -.6717616   -.0339367
            cf |   .1878638   .3069827     0.61   0.541     -.415905    .7916326
     cf_vol_5y |   1.735107   .7032533     2.47   0.014     .3519591    3.118255
industry_sigma |   5.907335     2.5528     2.31   0.021     .8865268    10.92814
           acq |  -2.463465   .2526048    -9.75   0.000    -2.960284   -1.966646
         capex |   -3.41898   .9644317    -3.55   0.000    -5.315809   -1.522151
           ndi |   1.593964   .3013606     5.29   0.000     1.001253    2.186676
           nei |   .7494894   .2514164     2.98   0.003     .2550076    1.243971
           div |   .0149903   .0917401     0.16   0.870    -.1654428    .1954233
               |
            id |
            2  |  -.0697347   .1051029    -0.66   0.507    -.2764495    .1369801
            3  |   .0404617   .2148498     0.19   0.851    -.3821017     .463025
            4  |  -1.171091   .1103501   -10.61   0.000    -1.388126   -.9540561
            5  |  -.2754307   .1346967    -2.04   0.042      -.54035   -.0105113
            6  |  -.1607045   .1041401    -1.54   0.124    -.3655257    .0441167
            7  |  -.0600145   .2251348    -0.27   0.790    -.5028061    .3827771
            8  |  -1.149854   .1657165    -6.94   0.000    -1.475782    -.823925
            9  |  -1.417087   .1810527    -7.83   0.000    -1.773179   -1.060996
           10  |    1.19624   .1058324    11.30   0.000     .9880902    1.404389
           11  |   .0839063   .1631276     0.51   0.607    -.2369306    .4047431
           12  |  -2.146383   .2553015    -8.41   0.000    -2.648506    -1.64426

(The output continues with the remaining id dummies, up to id 350.)

Did I overlook something?

Last edited by Wojciech Gulkowski; 13 Oct 2020, 06:30.

Comment

Eric de Souza

Join Date: Mar 2014

Posts: 587
#23

13 Oct 2020, 06:41

Your variables are not the same:
xtreg ln_cash lit_risk ....
regress ln_cash lit_risk_L1 ...
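
One way to rule out this kind of mismatch is to force both commands onto the same regressors and the same estimation sample, then tabulate the slopes side by side. The sketch below uses the -nlswork- dataset and an arbitrary subsample (idcode<=100) purely as an illustration; the stored-estimate names -fe- and -lsdv- are made up for the example.

Code:

webuse nlswork, clear
xtset idcode year
xtreg ln_wage age ttl_exp tenure i.year if idcode<=100, fe vce(cluster idcode)
estimates store fe
* restrict the second model to the exact estimation sample of the first
regress ln_wage age ttl_exp tenure i.year i.idcode if e(sample), vce(cluster idcode)
estimates store lsdv
* slopes should agree when regressors and sample are identical
estimates table fe lsdv, keep(age ttl_exp tenure) b(%9.6f) se(%9.6f)

A different number of observations in the two headers is a quick sign that the regressors or the sample are not the same.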
Comment
Wojciech Gulkowski

Join Date: Sep 2020

Posts: 22
#24

13 Oct 2020, 06:58

You are right Eric, simple copy and paste mistake on my side. I checked again and indeed the coefficients in both cases are the same. Thanks!
Comment
Wojciech Gulkowski

Join Date: Sep 2020

Posts: 22
#25

13 Oct 2020, 07:11

I'm a little bit concerned by the high R^2 of my -regress- model (0.91) as compared with the fixed effects specification (0.52). It seems suspiciously high. Also, I don't quite understand why these two are reported as dots in the -regress- output. Does it mean 0?

Code:

F(22, 349) = . Prob > F = .
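
The gap between the two R^2 figures can be reproduced on the -nlswork- dataset (an illustrative stand-in, with an arbitrary subsample). The R^2 from -regress- with panel dummies also counts the variation absorbed by those dummies, while -xtreg,fe- reports within, between and overall R^2 separately.

Code:

webuse nlswork, clear
xtset idcode year
quietly xtreg ln_wage age ttl_exp tenure if idcode<=100, fe
display "within R2  = " %6.4f e(r2_w)
display "overall R2 = " %6.4f e(r2_o)
quietly regress ln_wage age ttl_exp tenure i.idcode if idcode<=100
* this R2 also credits the idcode dummies with the variation they absorb
display "LSDV R2    = " %6.4f e(r2)

The dummy-variable R^2 is therefore not directly comparable with the within or overall R^2 from -xtreg,fe-.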
Comment
Wojciech Gulkowski

Join Date: Sep 2020

Posts: 22
#26

13 Oct 2020, 07:31

Actually when I use pooled OLS -regress- command with any other specification than i.id my main variable of interest loses all significance. Would it be possible and make any sense to run -xtreg,re- regression where I want to account for industry fixed effects (as I cannot include them in -xtreg,fe- specification)? That way my litigation risk variable would stay relevant. I know that this sounds desperate, but would that have any logical explanation to use such specification? In the end I'm adding industry fixed effects to a random effects regression so is it still random effects or can be considered fixed effects regression?

And for other regressions where I consider firm-level and time fixed effects only I would use -xtreg,fe- in line with my -xtoverid- recommendation.

Does that make sense? Sorry for writing so many posts but I still cannot make up my mind as to what regression output to include in my paper (and don't have time to wait for my tutor's comments which would say to do what I think is right anyway..)

Last edited by Wojciech Gulkowski; 13 Oct 2020, 07:49.
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17711
#27

13 Oct 2020, 07:54

Wojciech:
#22: the difference in the number of observations gives you a first warning concerning the difference in the two codes (as Eric helpfully pointed out);
#25: setting aside the above, you should look at R2-between when dealing with -xtreg,re-;
For the dots in the F test, see -help j_robustsingular-.
#26: if -xtoverid- outcome points you toward -fe- specification, -xtreg,re- is less efficient and any statistical significance you can get may well be fictitious.

Kind regards,
Carlo
(Stata 19.0)
Comment
Wojciech Gulkowski

Join Date: Sep 2020

Posts: 22
#28

13 Oct 2020, 09:55

Thank you. I checked -help j_robustsingular- and the problem is probably due to as many predictors (i.id) as clusters that I have. But that said, can I still use the results of this model or are they somehow not significant? Looking at variables themselves they are significant..
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17711
#29

13 Oct 2020, 10:20

Wojciech:
keep the model as it is and explain why the F-test is missing (if you want to).
All in all, the role of the F-test is to check whether your predictors are jointly statistically significant (not a big deal, indeed).
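
If a joint test of the substantive regressors is still wanted, one option is -testparm- on that subset only, leaving the panel dummies out of the test. This is a sketch on the -nlswork- dataset with an arbitrary subsample, not a rerun of your model.

Code:

webuse nlswork, clear
regress ln_wage age ttl_exp tenure i.idcode if idcode<=100, vce(cluster idcode)
* joint Wald test of the three substantive regressors only
testparm age ttl_exp tenure

The overall F in the header stays missing, but the test on the three listed regressors involves far fewer constraints than there are clusters.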

Kind regards,
Carlo
(Stata 19.0)
Comment
Wojciech Gulkowski

Join Date: Sep 2020

Posts: 22
#30

15 Oct 2020, 06:43

Hi Carlo,

Just to let you know, I decided to keep four specifications for my model and compare results of each of them:

(1) Fixed effects regression with firm fixed effects:

Code:

xtreg ln_cash lit_risk size lev mtb nwc rd growth cf cf_vol_5y industry_sigma acq capex ndi nei div,fe vce(cluster id)

(2) Fixed effects regression with year and firm fixed effects:

Code:

xtreg ln_cash lit_risk size lev mtb nwc rd growth cf cf_vol_5y industry_sigma acq capex ndi nei div i.year, fe vce(cluster id)

(3) Random effects regression with year and industry fixed effects:

Code:

xtreg ln_cash lit_risk size lev mtb nwc rd growth cf cf_vol_5y industry_sigma acq capex ndi nei div i.year i.industry_numeric, re vce(cluster id)

(4) Pooled OLS regression with year and industry fixed effects

Code:

regress ln_cash lit_risk size lev mtb nwc rd growth cf cf_vol_5y industry_sigma acq capex ndi nei div i.year i.industry_numeric, vce(cluster id)

The -xtoverid- command suggests I should be using fixed effects, so the first two of these specifications are of this kind. However, I do have some doubts about whether it makes any sense to use specification (3). It uses random effects together with dummies that create fixed effects. In the end, I'm not sure whether such a regression is more of a fixed effects model or a random effects model. And what are the implications of using dummies to create fixed effects in a random effects specification? I asked this before, but perhaps I didn't stress the issue enough in my previous posts.
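
The four specifications can be laid side by side with -estimates store- and -estimates table-. The sketch below uses the -nlswork- dataset, with -race- as a hypothetical stand-in for a time-invariant industry code; the stored-estimate names are made up for the example.

Code:

webuse nlswork, clear
xtset idcode year
quietly xtreg ln_wage age ttl_exp tenure, fe vce(cluster idcode)
estimates store fe_firm
quietly xtreg ln_wage age ttl_exp tenure i.year, fe vce(cluster idcode)
estimates store fe_firm_yr
* group dummies enter as ordinary regressors; the idcode effect is still random
quietly xtreg ln_wage age ttl_exp tenure i.year i.race, re vce(cluster idcode)
estimates store re_yr_grp
quietly regress ln_wage age ttl_exp tenure i.year i.race, vce(cluster idcode)
estimates store ols_yr_grp
estimates table fe_firm fe_firm_yr re_yr_grp ols_yr_grp, keep(age ttl_exp tenure) b(%9.4f) se(%9.4f) stats(N)

In the third model the dummies give each group its own intercept, but the panel-level effect is still treated as random, so it remains a random effects regression with respect to -idcode-.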

Thank you.
Comment
