Industry Fixed Effects

Nikos Tsileponis

Join Date: Jul 2014

Posts: 72
#1

29 Apr 2015, 06:07

Dear Statalisters,

I am reading a published paper in European Accounting Review (Vol.24 Issue 1 p.63-93) and I can't understand something. Let's assume for simplicity that y is the dependent variable, X is the vector of independent variables, id is the company identifier and IND is a set of industry dummies.

There is no need to mention all the details. Very briefly: In one regression table the authors say that they use Industry Fixed Effects and then in the notes of the table they mention that "All standard errors are clustered at a company level".

Is this possible? Is there any chance this is a mistake?

My second question is a bit more general: These authors present a few more tables. In one of them (Linear regression of y on X), they say that they use Industry dummies and cluster standard errors by industry. This makes more sense to me. Does this mean that they have run the following regression:

Code:

regress y X IND, cluster(industry_id)

In another table they say that they use Industry Fixed Effects (not industry dummies) and they cluster by industry again. Is this the regression they might have run?

Code:

xtreg y X, i(IND) fe

I have read several econometrics books. Still, it would be highly appreciated if someone could summarize 1-2 key points about the differences between using Industry dummies and Industry Fixed Effects (which I guess are similar to the differences between using Firm dummies and Firm Fixed Effects).

Thank you all in advance.

Best regards,
Nikos
Charlie Joyez

Join Date: Dec 2014

Posts: 421
#2

29 Apr 2015, 06:42

To answer the first question: if they have multiple observations per firm (and a correct firm_ID), they could have clustered their standard errors at the company level using -, cluster(firm_ID)-, even though they include industry fixed effects.

For your second question:

"they use Industry dummies and cluster standard errors by industry."

This seems more like

Code:

reg y X i.industry, cluster(industry)

Using a factor variable (i.var) adds a set of industry dummies, whereas your suggested code enters "IND" in the regression as a single variable (and not as dummies).
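A minimal sketch of that distinction, on simulated data. The variable names (y, x, firm, industry), the sample sizes, and the data-generating process are all assumptions made for illustration; none of them come from the paper under discussion.

```stata
* Simulated firm-year panel; every name and number here is an illustrative assumption
clear
set seed 12345
set obs 200                              // 200 hypothetical firms
gen firm     = _n
gen industry = mod(firm, 10) + 1         // 10 industry codes, fixed within firm
gen u        = rnormal()                 // unobserved firm effect
expand 5                                 // 5 yearly observations per firm
bysort firm: gen year = 2000 + _n
gen x = rnormal() + 0.5*u
gen y = 1 + 0.5*x + 0.2*industry + u + rnormal()

* Factor-variable notation: one dummy per industry (the base level is omitted)
regress y x i.industry, vce(cluster industry)
scalar b_dummy10 = _b[10.industry]       // coefficient on the industry-10 dummy

* Plain variable: the numeric industry code enters as one continuous regressor
regress y x industry, vce(cluster industry)
scalar b_slope = _b[industry]            // a single slope on the code
```

The first regression reports a separate coefficient for each non-base industry; the second reports one slope, which is rarely meaningful for an arbitrary industry code.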

Concerning the last command, it depends on whether they have panel-like data or not. I doubt they ran it: if they have firm-level data (as question 1 suggests), they could not easily declare a panel over the industry dimension, since all firms in a given industry would appear as repeated observations within the same panel.

So if the data are not declared as a panel over industry, they did not use the -xtreg, fe- command.

Hope this helps,
Charlie
Nikos Tsileponis

Join Date: Jul 2014

Posts: 72
#3

29 Apr 2015, 06:59

Hi Charlie,

Thanks a lot for your response.

Regarding my first question, how can we add industry fixed effects and cluster standard errors at the company level? (since this is what the authors say they do).

You mentioned -, cluster(firm_ID)-, which is the clustering at the company level. How do we add industry fixed effects?

For example:

Code:

reg y X i.industry, cluster(firm_ID)

This clusters at the company level and also includes industry dummies. It is not industry fixed effects. Correct?

Thank you once again.
FernandoRios

Join Date: Apr 2014

Posts: 2473
#4

29 Apr 2015, 07:00

Hi Nikos, in addition to what Charlie said, it is technically possible to declare the panel without a time dimension in Stata, although there are no intuitive examples of why you would want to do so.
What you will find when estimating both models is that they give the same point estimates, but the clustered standard errors are slightly different. This is because the assumptions about the "panel" id differ between -xtreg, fe- and plain -reg- or even -areg-. Basically, when you include the effects as dummies (-reg- or -areg-) you assume that the number of distinct groups stays fixed as the total number of observations increases, while in -xtreg- the number of distinct groups is assumed to increase as the sample grows.
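A rough way to see this is to run both commands on the same data and store the results. The sketch below uses simulated data; the variable names and the data-generating process are assumptions for illustration only. The coefficients on x should coincide, while the clustered standard errors may differ because the two commands apply different degrees-of-freedom adjustments.

```stata
* Simulated firm-year panel; every name and number here is an illustrative assumption
clear
set seed 12345
set obs 200
gen firm     = _n
gen industry = mod(firm, 10) + 1         // 10 industries, fixed within firm
gen u        = rnormal()
expand 5
bysort firm: gen year = 2000 + _n
gen x = rnormal() + 0.5*u
gen y = 1 + 0.5*x + 0.2*industry + u + rnormal()

* Dummy-variable style: absorb the industry indicators
areg y x, absorb(industry) vce(cluster industry)
scalar b_areg  = _b[x]
scalar se_areg = _se[x]

* Panel style: declare industry as the panel id, with no time variable
xtset industry
xtreg y x, fe vce(cluster industry)
scalar b_xt  = _b[x]
scalar se_xt = _se[x]

display "coef: areg = " scalar(b_areg)  "   xtreg = " scalar(b_xt)
display "se:   areg = " scalar(se_areg) "   xtreg = " scalar(se_xt)
```

With only 10 groups and 1,000 observations, any gap between the two standard errors will be small; it grows as the number of groups becomes large relative to the sample.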
Hope this helps.
Fernando
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2179
#5

29 Apr 2015, 07:03

There's no difference between including industry dummy variables and using industry fixed effects. They produce numerically identical results. Your final command does include industry fixed effects and clusters at the firm level (because, I trust, it is firm-level panel data).
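A minimal check of this equivalence on simulated data, clustering at the firm level as in the command above. The variable names and the data-generating process are assumptions for illustration; -areg- is used here as one convenient way to absorb the industry effects.

```stata
* Simulated firm-year panel; every name and number here is an illustrative assumption
clear
set seed 12345
set obs 200
gen firm     = _n
gen industry = mod(firm, 10) + 1         // 10 industries, fixed within firm
gen u        = rnormal()
expand 5
bysort firm: gen year = 2000 + _n
gen x = rnormal() + 0.5*u
gen y = 1 + 0.5*x + 0.2*industry + u + rnormal()

* Industry dummies entered explicitly, standard errors clustered by firm
regress y x i.industry, vce(cluster firm)
scalar b_reg = _b[x]

* Industry effects absorbed instead of listed, same clustering
areg y x, absorb(industry) vce(cluster firm)
scalar b_abs = _b[x]

display "dummies: " scalar(b_reg) "   absorbed: " scalar(b_abs)
```

The two coefficients on x should agree up to numerical precision, which is the sense in which "industry dummies" and "industry fixed effects" describe the same model.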
2 likes
Charlie Joyez

Join Date: Dec 2014

Posts: 421
#6

29 Apr 2015, 07:32

As Jeff Wooldridge said, you'll reach the same results whether you add fixed effects or dummy variables. This is why authors often talk about fixed effects when they actually add dummies, and why the terminology might change within the same article while the model remains the same.

However, I'd like to add that a small difference remains (in Stata) between using dummies and using the panel-declared fixed effects (the -, fe- option). The first adds some (and sometimes many) dummy variables, which affects the degrees of freedom (and might raise an issue, especially if you have a small sample and a very detailed industry classification). The latter demeans the data within each group and doesn't add new explanatory variables.

So they will both reach the same results (coefficients and SEs for the remaining variables), but this slight difference is always good to know.

Charlie
1 like
Nikos Tsileponis

Join Date: Jul 2014

Posts: 72
#7

29 Apr 2015, 07:45

Dear all, thank you so much for your help. Charlie, I appreciate your help a lot.

Prof. Wooldridge, thank you for your clarification. I am a big fan of your books. Your book "Introductory Econometrics: A Modern Approach" was the main reason why I started to enjoy econometrics.

Best regards,
Nikos

Sinem Ates

Join Date: Mar 2018

Posts: 83
#8

22 Feb 2020, 08:21

Dear All,

I have a problem regarding industry dummies. To estimate my regression model I have used pooled OLS with year and industry dummies, random effects with year and industry dummies, and fixed effects with year dummies. However, the coefficient of one variable (ESGSCORE, which measures the sustainability performance of companies) is negative in pooled OLS when I include industry dummies (i.ICBIC), while it is positive in the random- and fixed-effects models. When I do not add industry dummies to the pooled OLS model, the coefficient of ESGSCORE is positive, as in the random- and fixed-effects models. What could be the reason for this?

Code:

 regress TOBINSQ_w ESGSCORE SIZE_w LEV_w ROA_w i.YEAR i.ICBIC

      Source |       SS           df       MS      Number of obs   =     3,986
-------------+----------------------------------   F(23, 3962)     =    271.98
       Model |  2548.47506        23  110.803264   Prob > F        =    0.0000
    Residual |  1614.10965     3,962   .40739769   R-squared       =    0.6122
-------------+----------------------------------   Adj R-squared   =    0.6100
       Total |  4162.58471     3,985  1.04456329   Root MSE        =    .63828

------------------------------------------------------------------------------
   TOBINSQ_w |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    ESGSCORE |  -.0005098   .0006432    -0.79   0.428    -.0017709    .0007513
      SIZE_w |  -.0552093   .0087163    -6.33   0.000    -.0722981   -.0381205
       LEV_w |   .3086804   .0651737     4.74   0.000     .1809033    .4364575
       ROA_w |     .09807   .0021317    46.01   0.000     .0938907    .1022493
             |
        YEAR |
       2010  |  -.0284937   .0618738    -0.46   0.645    -.1498011    .0928137
       2011  |  -.2637422   .0602011    -4.38   0.000    -.3817702   -.1457141
       2012  |  -.2413311   .0588244    -4.10   0.000      -.35666   -.1260021
       2013  |  -.2664816   .0587351    -4.54   0.000    -.3816355   -.1513277
       2014  |  -.2207826   .0585351    -3.77   0.000    -.3355444   -.1060208
       2015  |    -.21203   .0586619    -3.61   0.000    -.3270404   -.0970196
       2016  |  -.2403001   .0587904    -4.09   0.000    -.3555623   -.1250378
       2017  |  -.0814085    .056275    -1.45   0.148    -.1917391    .0289221
       2018  |  -.2206457   .0570185    -3.87   0.000     -.332434   -.1088574
             |
       ICBIC |
         15  |  -.7636768    .073798   -10.35   0.000    -.9083625   -.6189911
         20  |  -.1219604    .071948    -1.70   0.090     -.263019    .0190983
         30  |  -.7655233   .0673943   -11.36   0.000     -.897654   -.6333925
         35  |    -1.1565   .0721007   -16.04   0.000    -1.297858   -1.015142
         40  |  -.4806975   .0656346    -7.32   0.000    -.6093783   -.3520166
         45  |   .1279281    .067607     1.89   0.059    -.0046197     .260476
         50  |  -.7534711   .0668297   -11.27   0.000    -.8844949   -.6224473
         55  |  -.8230724   .0650258   -12.66   0.000    -.9505595   -.6955853
         60  |  -.9553066   .0707333   -13.51   0.000    -1.093984   -.8166296
         65  |  -1.026664   .0717044   -14.32   0.000    -1.167245   -.8860829
             |
       _cons |   2.468769   .1543965    15.99   0.000     2.166065    2.771473
------------------------------------------------------------------------------

Code:

 xtreg TOBINSQ_w ESGSCORE SIZE_w LEV_w ROA_w i.YEAR i.ICBIC, re

Random-effects GLS regression                   Number of obs     =      3,986
Group variable: ID                              Number of groups  =        707

R-sq:                                           Obs per group:
     within  = 0.2455                                         min =          1
     between = 0.5636                                         avg =        5.6
     overall = 0.5645                                         max =         10

                                                Wald chi2(23)     =    2107.62
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
   TOBINSQ_w |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    ESGSCORE |   .0011054   .0007831     1.41   0.158    -.0004294    .0026402
      SIZE_w |  -.1438746    .014823    -9.71   0.000    -.1729271   -.1148221
       LEV_w |    .041574   .0773312     0.54   0.591    -.1099923    .1931403
       ROA_w |   .0528969   .0019502    27.12   0.000     .0490746    .0567191
             |
        YEAR |
       2010  |   .0307574   .0371511     0.83   0.408    -.0420574    .1035722
       2011  |  -.1953875     .03642    -5.36   0.000    -.2667694   -.1240056
       2012  |  -.1723873   .0358601    -4.81   0.000    -.2426717   -.1021028
       2013  |  -.2002458   .0357924    -5.59   0.000    -.2703976   -.1300939
       2014  |  -.1815934   .0359651    -5.05   0.000    -.2520836   -.1111031
       2015  |   -.179665   .0363321    -4.95   0.000    -.2508746   -.1084554
       2016  |  -.2180321   .0370304    -5.89   0.000    -.2906103   -.1454538
       2017  |   -.124682   .0365058    -3.42   0.001     -.196232    -.053132
       2018  |  -.3060961   .0378728    -8.08   0.000    -.3803255   -.2318667
             |
       ICBIC |
         15  |  -.7782788   .1533132    -5.08   0.000    -1.078767   -.4777904
         20  |  -.1648175   .1412395    -1.17   0.243    -.4416417    .1120068
         30  |  -.8442655   .1292618    -6.53   0.000    -1.097614   -.5909171
         35  |  -1.307719   .1440673    -9.08   0.000    -1.590086   -1.025353
         40  |   -.537858   .1277383    -4.21   0.000    -.7882205   -.2874955
         45  |   .0668956   .1360108     0.49   0.623    -.1996807    .3334719
         50  |  -.9667212   .1308506    -7.39   0.000    -1.223184   -.7102587
         55  |  -1.016719   .1303984    -7.80   0.000    -1.272295   -.7611431
         60  |  -1.000838   .1510048    -6.63   0.000    -1.296802   -.7048744
         65  |  -1.214742   .1459771    -8.32   0.000    -1.500852   -.9286325
             |
       _cons |    4.37558   .2390133    18.31   0.000     3.907123    4.844038
-------------+----------------------------------------------------------------
     sigma_u |    .556966
     sigma_e |  .36229183
         rho |  .70268328   (fraction of variance due to u_i)
------------------------------------------------------------------------------

Code:

. xtreg TOBINSQ_w ESGSCORE SIZE_w LEV_w ROA_w i.YEAR, fe

Fixed-effects (within) regression               Number of obs     =      3,986
Group variable: ID                              Number of groups  =        707

R-sq:                                           Obs per group:
     within  = 0.2539                                         min =          1
     between = 0.3739                                         avg =        5.6
     overall = 0.3972                                         max =         10

                                                F(13,3266)        =      85.48
corr(u_i, Xb)  = 0.1955                         Prob > F          =     0.0000

------------------------------------------------------------------------------
   TOBINSQ_w |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    ESGSCORE |   .0036197   .0009008     4.02   0.000     .0018536    .0053859
      SIZE_w |  -.2211096   .0230067    -9.61   0.000    -.2662187   -.1760005
       LEV_w |   .1910869   .0868249     2.20   0.028       .02085    .3613237
       ROA_w |   .0428304    .002014    21.27   0.000     .0388815    .0467792
             |
        YEAR |
       2010  |   .0472331   .0363817     1.30   0.194    -.0241002    .1185663
       2011  |  -.1778433   .0359247    -4.95   0.000    -.2482806    -.107406
       2012  |  -.1565383   .0356851    -4.39   0.000    -.2265057   -.0865709
       2013  |  -.1935435   .0354278    -5.46   0.000    -.2630065   -.1240806
       2014  |  -.1776648   .0359571    -4.94   0.000    -.2481655    -.107164
       2015  |  -.1816331   .0365136    -4.97   0.000    -.2532251   -.1100412
       2016  |  -.2189484   .0378556    -5.78   0.000    -.2931716   -.1447252
       2017  |  -.1461956   .0377401    -3.87   0.000    -.2201923   -.0721989
       2018  |   -.341009   .0398129    -8.57   0.000    -.4190699   -.2629482
             |
       _cons |   4.647151   .3455167    13.45   0.000       3.9697    5.324602
-------------+----------------------------------------------------------------
     sigma_u |  .82895969
     sigma_e |  .36229183
         rho |  .83962533   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(706, 3266) = 17.33                  Prob > F = 0.0000

.

Code:

regress TOBINSQ_w ESGSCORE SIZE_w LEV_w ROA_w i.YEAR

      Source |       SS           df       MS      Number of obs   =     3,986
-------------+----------------------------------   F(13, 3972)     =    319.56
       Model |  2127.96432        13  163.689563   Prob > F        =    0.0000
    Residual |  2034.62039     3,972  .512240784   R-squared       =    0.5112
-------------+----------------------------------   Adj R-squared   =    0.5096
       Total |  4162.58471     3,985  1.04456329   Root MSE        =    .71571

------------------------------------------------------------------------------
   TOBINSQ_w |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    ESGSCORE |    .000663   .0006957     0.95   0.341    -.0007011     .002027
      SIZE_w |  -.1196757   .0085584   -13.98   0.000     -.136455   -.1028963
       LEV_w |   .4939516   .0650886     7.59   0.000     .3663414    .6215618
       ROA_w |   .1137805   .0022739    50.04   0.000     .1093224    .1182387
             |
        YEAR |
       2010  |  -.0427501   .0693039    -0.62   0.537    -.1786247    .0931245
       2011  |  -.2900114   .0673885    -4.30   0.000    -.4221307   -.1578922
       2012  |  -.2604797   .0658164    -3.96   0.000    -.3895169   -.1314425
       2013  |  -.2825508   .0657052    -4.30   0.000    -.4113698   -.1537317
       2014  |  -.2310313   .0654817    -3.53   0.000    -.3594122   -.1026504
       2015  |   -.216109   .0656193    -3.29   0.001    -.3447596   -.0874583
       2016  |  -.2370597   .0657278    -3.61   0.000    -.3659232   -.1081963
       2017  |  -.0397674   .0628123    -0.63   0.527    -.1629148    .0833801
       2018  |  -.1591023   .0635577    -2.50   0.012     -.283711   -.0344936
             |
       _cons |   2.532379   .1455359    17.40   0.000     2.247047    2.817712
------------------------------------------------------------------------------


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17718
#9

22 Feb 2020, 15:27

Sinem:
you actually used different estimators, so it is no wonder that you obtained different results.
That said, you should check via -hausman- whether the fixed- or the random-effects specification fits your data better.
Since you're dealing with a large N, small T panel dataset, I would consider -xtreg- the first choice and switch to pooled OLS only if there is no evidence of panel-wise effects.
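A minimal sketch of the -hausman- workflow on simulated data. The variable names and the data-generating process are assumptions for illustration, not the ESG dataset above; the regressor is deliberately built to correlate with the firm effect, and default (non-robust) standard errors are used because that is what -hausman- expects.

```stata
* Simulated firm-year panel; every name and number here is an illustrative assumption
clear
set seed 12345
set obs 200
gen firm = _n
gen u    = rnormal()                     // unobserved firm effect
expand 5
bysort firm: gen year = 2000 + _n
gen x = rnormal() + 0.5*u                // x is correlated with the firm effect
gen y = 1 + 0.5*x + u + rnormal()

xtset firm year

* Fit both estimators with default standard errors and store the results
xtreg y x, fe
estimates store fe
xtreg y x, re
estimates store re

* Consistent estimator first, efficient estimator second
hausman fe re
scalar p_hausman = r(p)
display "Hausman p-value: " scalar(p_hausman)
```

A small p-value is usually read as evidence against the random-effects assumption that the firm effect is uncorrelated with the regressors.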

Kind regards,
Carlo
(Stata 19.0)
Jon Hoefer

Join Date: Feb 2020

Posts: 47
#10

22 Feb 2020, 16:07

Originally posted by Jeff Wooldridge:

There's no difference between including industry dummy variables and using industry fixed effects. They produce numerically identical results. Your final command does include industry fixed effects and clusters at the firm level (because, I trust, it is firm-level panel data).

Dear Jeff, would you still make this statement today? I am asking because on a similar issue (https://www.statalist.org/forums/for...an-issue/page2 #20), when I asked whether pooled OLS with dummies for industry and fiscal year would lead to the same results as FE, you replied as follows:

1. Your ability to keep a time-invariant variable while adding fixed effects "by hand" is an illusion. You should not be doing this. There is only one true fixed effects estimator. xtreg, fe does it properly, as Carlo emphasized. By putting in the dummies "by hand" you are deluding yourself. Stata is simply dropping variables until there is no collinearity left. From an identification perspective, you cannot estimate coefficients on the time-constant variables.

Am I overlooking something (most definitely yes, but I would greatly appreciate knowing what it is), or are the two statements at odds? Thank you in advance for your clarification.
Minh Nhat

Join Date: Jul 2021

Posts: 1
#11

09 Jul 2021, 03:33

I have several questions. I have searched everywhere but am still confused. Thank you in advance.
1. I have panel data at the firm level, with some time-invariant variables. After running the Hausman test, RE is suggested. Because of autocorrelation and heteroskedasticity, I use vce(cluster). But my prof said that the latter defeats the former. In many posts, I see comments suggesting vce(cluster, robust) to account for this problem, so I don't know how to deal with this issue. Could anyone clarify the contradiction between the two?
2. Moreover, I was advised to add industry fixed effects. As far as I understand from this thread, there is no need because my regression has firm-level variables.
3. What would be the proper model in this case?
My previous version is: xtreg roa firmfundamentals L.CSR y20 L.CSR_y20 (interaction term), re vce(cluster sector)
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17718
#12

09 Jul 2021, 03:51

Minh:
1) As expected, because of demeaning, the -fe- estimator wipes out time-invariant variables. I am not sure I understood your prof.'s comment correctly here. If you detect heteroskedasticity and/or autocorrelation, you should invoke non-default standard errors (the -robust- and -vce(cluster)- options will do the very same job). The issue then is that -hausman- does not support non-default standard errors (and you cannot use default standard errors for -hausman- and then impose non-default standard errors after the -hausman- verdict); hence, you should rely on the community-contributed module -xtoverid-, which, being glorious but a bit old-fashioned, does not support -fvvarlist- notation for categorical variables and interactions (you can try prefixing your code with -xi:- and see what happens).
Hence, I do not understand where the contradiction lies: if you do have heteroskedasticity and/or autocorrelation and you use default standard errors, the standard errors and everything derived from them will be unreliable.
2) Industry fixed effects will be wiped out by the -fe- estimator if firms remain in the same industry throughout the T dimension of your panel dataset; conversely, if you go -re-, a coefficient for this predictor will be returned even though it is time-invariant.
3) The idea of the right model is, unfortunately, only an idea. The best approach is to give a fair and true view of the data-generating process under investigation; the literature in your research field can be a great support in this respect.
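Point 2 can be sketched on simulated data in which each firm stays in one industry for the whole panel. The variable names and the data-generating process are assumptions for illustration only.

```stata
* Simulated firm-year panel; every name and number here is an illustrative assumption
clear
set seed 12345
set obs 200
gen firm     = _n
gen industry = mod(firm, 10) + 1         // industry never changes within a firm
gen u        = rnormal()
expand 5
bysort firm: gen year = 2000 + _n
gen x = rnormal()
gen y = 1 + 0.5*x + 0.2*industry + u + rnormal()

xtset firm year

* -fe-: the industry dummies are collinear with the firm effects,
* so Stata should flag them as omitted in the output
xtreg y x i.industry, fe

* -re-: the same time-invariant dummies receive coefficients
xtreg y x i.industry, re
scalar b_re_ind10 = _b[10.industry]
display "RE coefficient on the industry-10 dummy: " scalar(b_re_ind10)
```

This is also why adding industry dummies on top of firm fixed effects changes nothing when industry membership is constant within firm.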

Kind regards,
Carlo
(Stata 19.0)
