
  • Industry Fixed Effects

    Dear Statalisters,

    I am reading a published paper in European Accounting Review (Vol.24 Issue 1 p.63-93) and I can't understand something. Let's assume for simplicity that y is the dependent variable, X is the vector of independent variables, id is the company identifier and IND is a set of industry dummies.

    There is no need to mention all the details. Very briefly: In one regression table the authors say that they use Industry Fixed Effects and then in the notes of the table they mention that "All standard errors are clustered at a company level".

    Is this possible? Is there any chance this is a mistake?

    My second question is a bit more general: These authors present a few more tables. In one of them (Linear regression of y on X), they say that they use Industry dummies and cluster standard errors by industry. This makes more sense to me. Does this mean that they have run the following regression:
    Code:
    regress y X IND, cluster(industry_id)
    In another table they say that they use Industry Fixed Effects (not industry dummies) and they cluster by industry again. Is this the regression they might have run?
    Code:
    xtreg y X, i(IND) fe
    I have read several econometrics books. Still, it would be highly appreciated if someone could summarize 1-2 key points about the differences between using Industry dummies and Industry Fixed Effects (which I guess are similar to the differences between using Firm dummies and Firm Fixed Effects).

    Thank you all in advance.

    Best regards,
    Nikos

  • #2
    To answer the first question: if they have multiple observations per firm (and a correct firm_ID), they could have clustered their standard errors at the company level, using -,cluster(firm_ID)-, even though they add an industry fixed effect.
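
    As a minimal sketch of that combination, the lines below use one of Stata's example datasets rather than the paper's data. The stand-ins are assumptions made only for illustration: idcode plays the role of firm_ID and ind_code plays the role of the industry variable.

    ```stata
    * Illustrative only: nlswork is a person-year panel, not firm data.
    * Assumed stand-ins: idcode ~ firm_ID, ind_code ~ industry.
    webuse nlswork, clear

    * Industry dummies in the mean equation, standard errors clustered
    * on the unit that is observed repeatedly (here idcode).
    regress ln_wage tenure age i.ind_code, vce(cluster idcode)
    ```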

    For your second question:
    they use Industry dummies and cluster standard errors by industry.
    This looks more like
    Code:
    reg y X i.industry, cluster(industry)
    The factor-variable notation (i.var) adds industry dummies, while your suggested code adds the value of "IND" itself to the regression (and not a dummy).
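
    The contrast can be sketched on an example dataset. The variable names are stand-ins, not those of the paper: ind_code is assumed to be a single numeric industry code. If IND is already a list of dummy variables, as defined in the first post, typing the list should be equivalent to the i. notation.

    ```stata
    * Illustrative only; ind_code is a numeric industry code in nlswork.
    webuse nlswork, clear

    * Industry code entered as-is: a single slope on the numeric code.
    regress ln_wage tenure ind_code, vce(cluster ind_code)

    * Factor-variable notation: one dummy per industry, base level omitted.
    regress ln_wage tenure i.ind_code, vce(cluster ind_code)
    ```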

    Concerning the last code, it depends on whether they have panel-like data. I doubt it: if they have firm-level data (as question 1 suggests), they could not declare a panel over the industry dimension, since all firms in a given industry would appear as repeated observations within the panel.

    So if the data is not industry-panel declared, they didn't use the -xtreg ,fe- command.
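
    The repeated-observations problem can be reproduced on an example dataset. This is a sketch under assumptions: ind_code stands in for the industry identifier and year for the time variable. -capture noisily- is used so that the expected error does not stop the do-file.

    ```stata
    * Illustrative only: ind_code ~ industry, year ~ time variable.
    webuse nlswork, clear
    xtset, clear

    * Industry as panel variable plus a time variable: many units share
    * the same industry-year pair, so this is expected to fail with a
    * "repeated time values within panel" error.
    capture noisily xtset ind_code year
    display "return code: " _rc

    * Declaring only the panel variable, without a time variable.
    xtset ind_code
    ```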


    Hope this helps,
    Charlie



    • #3
      Hi Charlie,

      Thanks a lot for your response.

      Regarding my first question, how can we add industry fixed effects and cluster standard errors at the company level? (since this is what the authors say they do).

      You mentioned the
      Code:
      -,cluster(firm_ID)
      which is the clustering at the company level. How do we add industry fixed effects?

      For example:
      Code:
       
       reg y X i.industry, cluster(firm_ID)
      This is clustering at a company level using also industry dummies. This is not industry fixed effects. Correct?

      Thank you once again.



      • #4
        Hi Nikos, in addition to what Charlie said, it is technically possible to declare the panel without a time dimension in Stata, although there are no intuitive examples of why you would want to do so.
        What you will find when estimating both models is that they give the same point estimates, but the clustered standard errors are slightly different. This is because -xtreg, fe- makes different assumptions about the "panel" id than -reg- or even -areg-. Basically, when you enter the effects as dummies (reg or areg), you assume that the number of distinct groups stays fixed as the total number of observations increases, while in xtreg the number of distinct groups is assumed to increase with the sample.
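
        A sketch of that comparison on an example dataset follows. The stand-in ind_code for the industry identifier and the stored-estimate names are assumptions for illustration. The coefficients on tenure and age should agree across the three columns, while the clustered standard errors from -xtreg, fe- may differ slightly from the other two.

        ```stata
        * Illustrative only: ind_code ~ industry identifier.
        webuse nlswork, clear

        * Industry effects as explicit dummies.
        regress ln_wage tenure age i.ind_code, vce(cluster ind_code)
        estimates store m_dummies

        * Industry effects absorbed.
        areg ln_wage tenure age, absorb(ind_code) vce(cluster ind_code)
        estimates store m_areg

        * Industry declared as the panel variable, no time variable.
        xtset ind_code
        xtreg ln_wage tenure age, fe vce(cluster ind_code)
        estimates store m_xtreg

        * Side-by-side coefficients and standard errors.
        estimates table m_dummies m_areg m_xtreg, keep(tenure age) b(%9.6f) se(%9.6f)
        ```
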
        Hope this helps.
        Fernando



        • #5
          There's no difference between including industry dummy variables and using industry fixed effects. They produce numerically identical results. Your final command does include industry fixed effects and clusters at the firm level (because, I trust, it is firm-level panel data).
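
          The numerical equivalence can be checked directly. The sketch below uses an example dataset with assumed stand-ins (idcode for the firm identifier, ind_code for industry) and compares explicit dummies with -areg-, which absorbs the same effects. -areg- is used for the comparison because, as far as I know, -xtreg, fe- expects the panels to be nested within the clusters, which is not the case when the panel is the industry and the cluster is the firm.

          ```stata
          * Illustrative only: idcode ~ firm, ind_code ~ industry.
          webuse nlswork, clear

          * Industry dummies, clustering at the "firm" level.
          regress ln_wage tenure age i.ind_code, vce(cluster idcode)
          scalar b_dummies = _b[tenure]

          * Same model with the industry effects absorbed.
          areg ln_wage tenure age, absorb(ind_code) vce(cluster idcode)

          * The slope should match up to numerical precision.
          assert reldif(b_dummies, _b[tenure]) < 1e-6
          ```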



          • #6
            As Jeff Wooldridge said, you'll reach the same results whether you add fixed effects or dummy variables. This is why authors often talk about fixed effects when they actually add dummies, and why the terminology might change within the same article while the model remains the same.

            However, I'd like to add that a small difference remains (in Stata) between using dummies and using panel-declared fixed effects (the -,fe- option). The former adds some (and sometimes many) dummy variables, which affect the degrees of freedom (and might raise an issue, especially if you have a small sample and a very detailed industry classification). The latter computes deviations from group means and does not add new explanatory variables.

            So they will both reach the same results (coefficients and SE for remaining variables), but this slight difference is always good to know.
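
            One way to see how each command counts degrees of freedom is to compare e(df_r) after both. The sketch below uses an example dataset, with ind_code as an assumed stand-in for the industry identifier. With default standard errors the two values are expected to coincide, because -xtreg, fe- also subtracts the number of groups; the adjustment differs mainly once cluster-robust standard errors are requested.

            ```stata
            * Illustrative only: ind_code ~ industry identifier.
            webuse nlswork, clear

            * Explicit dummies: each dummy uses one degree of freedom.
            regress ln_wage tenure age i.ind_code
            display "regress residual df:   " e(df_r)

            * Within estimator with industry as the panel variable.
            xtset ind_code
            xtreg ln_wage tenure age, fe
            display "xtreg, fe residual df: " e(df_r)
            ```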

            Charlie



            • #7
              Dear all, thank you so much for your help. Charlie I appreciate your help a lot.

              Prof. Wooldridge, thank you for your clarification. I am a big fan of your books. Your book "Introductory Econometrics: A Modern Approach" was the main reason why I started to enjoy econometrics.

              Best regards,
              Nikos



              • #8
                Dear All,

                I have a problem with industry dummies. I estimated my regression model with pooled OLS with year and industry dummies, random effects with year and industry dummies, and fixed effects with year dummies. However, the coefficient of one variable (ESGSCORE, which measures the sustainability performance of companies) is negative in the pooled OLS results if I include industry dummies (i.ICBIC), while it is positive in the random- and fixed-effects results. When I do not add industry dummies to the pooled OLS model, the coefficient of ESGSCORE is positive, as in the random- and fixed-effects models. What could be the reason for this?

                Code:
                 regress TOBINSQ_w ESGSCORE SIZE_w LEV_w ROA_w i.YEAR i.ICBIC
                
                      Source |       SS           df       MS      Number of obs   =     3,986
                -------------+----------------------------------   F(23, 3962)     =    271.98
                       Model |  2548.47506        23  110.803264   Prob > F        =    0.0000
                    Residual |  1614.10965     3,962   .40739769   R-squared       =    0.6122
                -------------+----------------------------------   Adj R-squared   =    0.6100
                       Total |  4162.58471     3,985  1.04456329   Root MSE        =    .63828
                
                ------------------------------------------------------------------------------
                   TOBINSQ_w |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                    ESGSCORE |  -.0005098   .0006432    -0.79   0.428    -.0017709    .0007513
                      SIZE_w |  -.0552093   .0087163    -6.33   0.000    -.0722981   -.0381205
                       LEV_w |   .3086804   .0651737     4.74   0.000     .1809033    .4364575
                       ROA_w |     .09807   .0021317    46.01   0.000     .0938907    .1022493
                             |
                        YEAR |
                       2010  |  -.0284937   .0618738    -0.46   0.645    -.1498011    .0928137
                       2011  |  -.2637422   .0602011    -4.38   0.000    -.3817702   -.1457141
                       2012  |  -.2413311   .0588244    -4.10   0.000      -.35666   -.1260021
                       2013  |  -.2664816   .0587351    -4.54   0.000    -.3816355   -.1513277
                       2014  |  -.2207826   .0585351    -3.77   0.000    -.3355444   -.1060208
                       2015  |    -.21203   .0586619    -3.61   0.000    -.3270404   -.0970196
                       2016  |  -.2403001   .0587904    -4.09   0.000    -.3555623   -.1250378
                       2017  |  -.0814085    .056275    -1.45   0.148    -.1917391    .0289221
                       2018  |  -.2206457   .0570185    -3.87   0.000     -.332434   -.1088574
                             |
                       ICBIC |
                         15  |  -.7636768    .073798   -10.35   0.000    -.9083625   -.6189911
                         20  |  -.1219604    .071948    -1.70   0.090     -.263019    .0190983
                         30  |  -.7655233   .0673943   -11.36   0.000     -.897654   -.6333925
                         35  |    -1.1565   .0721007   -16.04   0.000    -1.297858   -1.015142
                         40  |  -.4806975   .0656346    -7.32   0.000    -.6093783   -.3520166
                         45  |   .1279281    .067607     1.89   0.059    -.0046197     .260476
                         50  |  -.7534711   .0668297   -11.27   0.000    -.8844949   -.6224473
                         55  |  -.8230724   .0650258   -12.66   0.000    -.9505595   -.6955853
                         60  |  -.9553066   .0707333   -13.51   0.000    -1.093984   -.8166296
                         65  |  -1.026664   .0717044   -14.32   0.000    -1.167245   -.8860829
                             |
                       _cons |   2.468769   .1543965    15.99   0.000     2.166065    2.771473
                ------------------------------------------------------------------------------
                Code:
                 xtreg TOBINSQ_w ESGSCORE SIZE_w LEV_w ROA_w i.YEAR i.ICBIC, re
                
                Random-effects GLS regression                   Number of obs     =      3,986
                Group variable: ID                              Number of groups  =        707
                
                R-sq:                                           Obs per group:
                     within  = 0.2455                                         min =          1
                     between = 0.5636                                         avg =        5.6
                     overall = 0.5645                                         max =         10
                
                                                                Wald chi2(23)     =    2107.62
                corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
                
                ------------------------------------------------------------------------------
                   TOBINSQ_w |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                    ESGSCORE |   .0011054   .0007831     1.41   0.158    -.0004294    .0026402
                      SIZE_w |  -.1438746    .014823    -9.71   0.000    -.1729271   -.1148221
                       LEV_w |    .041574   .0773312     0.54   0.591    -.1099923    .1931403
                       ROA_w |   .0528969   .0019502    27.12   0.000     .0490746    .0567191
                             |
                        YEAR |
                       2010  |   .0307574   .0371511     0.83   0.408    -.0420574    .1035722
                       2011  |  -.1953875     .03642    -5.36   0.000    -.2667694   -.1240056
                       2012  |  -.1723873   .0358601    -4.81   0.000    -.2426717   -.1021028
                       2013  |  -.2002458   .0357924    -5.59   0.000    -.2703976   -.1300939
                       2014  |  -.1815934   .0359651    -5.05   0.000    -.2520836   -.1111031
                       2015  |   -.179665   .0363321    -4.95   0.000    -.2508746   -.1084554
                       2016  |  -.2180321   .0370304    -5.89   0.000    -.2906103   -.1454538
                       2017  |   -.124682   .0365058    -3.42   0.001     -.196232    -.053132
                       2018  |  -.3060961   .0378728    -8.08   0.000    -.3803255   -.2318667
                             |
                       ICBIC |
                         15  |  -.7782788   .1533132    -5.08   0.000    -1.078767   -.4777904
                         20  |  -.1648175   .1412395    -1.17   0.243    -.4416417    .1120068
                         30  |  -.8442655   .1292618    -6.53   0.000    -1.097614   -.5909171
                         35  |  -1.307719   .1440673    -9.08   0.000    -1.590086   -1.025353
                         40  |   -.537858   .1277383    -4.21   0.000    -.7882205   -.2874955
                         45  |   .0668956   .1360108     0.49   0.623    -.1996807    .3334719
                         50  |  -.9667212   .1308506    -7.39   0.000    -1.223184   -.7102587
                         55  |  -1.016719   .1303984    -7.80   0.000    -1.272295   -.7611431
                         60  |  -1.000838   .1510048    -6.63   0.000    -1.296802   -.7048744
                         65  |  -1.214742   .1459771    -8.32   0.000    -1.500852   -.9286325
                             |
                       _cons |    4.37558   .2390133    18.31   0.000     3.907123    4.844038
                -------------+----------------------------------------------------------------
                     sigma_u |    .556966
                     sigma_e |  .36229183
                         rho |  .70268328   (fraction of variance due to u_i)
                ------------------------------------------------------------------------------
                Code:
                 xtreg TOBINSQ_w ESGSCORE SIZE_w LEV_w ROA_w i.YEAR, fe
                
                Fixed-effects (within) regression               Number of obs     =      3,986
                Group variable: ID                              Number of groups  =        707
                
                R-sq:                                           Obs per group:
                     within  = 0.2539                                         min =          1
                     between = 0.3739                                         avg =        5.6
                     overall = 0.3972                                         max =         10
                
                                                                F(13,3266)        =      85.48
                corr(u_i, Xb)  = 0.1955                         Prob > F          =     0.0000
                
                ------------------------------------------------------------------------------
                   TOBINSQ_w |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                    ESGSCORE |   .0036197   .0009008     4.02   0.000     .0018536    .0053859
                      SIZE_w |  -.2211096   .0230067    -9.61   0.000    -.2662187   -.1760005
                       LEV_w |   .1910869   .0868249     2.20   0.028       .02085    .3613237
                       ROA_w |   .0428304    .002014    21.27   0.000     .0388815    .0467792
                             |
                        YEAR |
                       2010  |   .0472331   .0363817     1.30   0.194    -.0241002    .1185663
                       2011  |  -.1778433   .0359247    -4.95   0.000    -.2482806    -.107406
                       2012  |  -.1565383   .0356851    -4.39   0.000    -.2265057   -.0865709
                       2013  |  -.1935435   .0354278    -5.46   0.000    -.2630065   -.1240806
                       2014  |  -.1776648   .0359571    -4.94   0.000    -.2481655    -.107164
                       2015  |  -.1816331   .0365136    -4.97   0.000    -.2532251   -.1100412
                       2016  |  -.2189484   .0378556    -5.78   0.000    -.2931716   -.1447252
                       2017  |  -.1461956   .0377401    -3.87   0.000    -.2201923   -.0721989
                       2018  |   -.341009   .0398129    -8.57   0.000    -.4190699   -.2629482
                             |
                       _cons |   4.647151   .3455167    13.45   0.000       3.9697    5.324602
                -------------+----------------------------------------------------------------
                     sigma_u |  .82895969
                     sigma_e |  .36229183
                         rho |  .83962533   (fraction of variance due to u_i)
                ------------------------------------------------------------------------------
                F test that all u_i=0: F(706, 3266) = 17.33                  Prob > F = 0.0000
                
                Code:
                regress TOBINSQ_w ESGSCORE SIZE_w LEV_w ROA_w i.YEAR
                
                      Source |       SS           df       MS      Number of obs   =     3,986
                -------------+----------------------------------   F(13, 3972)     =    319.56
                       Model |  2127.96432        13  163.689563   Prob > F        =    0.0000
                    Residual |  2034.62039     3,972  .512240784   R-squared       =    0.5112
                -------------+----------------------------------   Adj R-squared   =    0.5096
                       Total |  4162.58471     3,985  1.04456329   Root MSE        =    .71571
                
                ------------------------------------------------------------------------------
                   TOBINSQ_w |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                    ESGSCORE |    .000663   .0006957     0.95   0.341    -.0007011     .002027
                      SIZE_w |  -.1196757   .0085584   -13.98   0.000     -.136455   -.1028963
                       LEV_w |   .4939516   .0650886     7.59   0.000     .3663414    .6215618
                       ROA_w |   .1137805   .0022739    50.04   0.000     .1093224    .1182387
                             |
                        YEAR |
                       2010  |  -.0427501   .0693039    -0.62   0.537    -.1786247    .0931245
                       2011  |  -.2900114   .0673885    -4.30   0.000    -.4221307   -.1578922
                       2012  |  -.2604797   .0658164    -3.96   0.000    -.3895169   -.1314425
                       2013  |  -.2825508   .0657052    -4.30   0.000    -.4113698   -.1537317
                       2014  |  -.2310313   .0654817    -3.53   0.000    -.3594122   -.1026504
                       2015  |   -.216109   .0656193    -3.29   0.001    -.3447596   -.0874583
                       2016  |  -.2370597   .0657278    -3.61   0.000    -.3659232   -.1081963
                       2017  |  -.0397674   .0628123    -0.63   0.527    -.1629148    .0833801
                       2018  |  -.1591023   .0635577    -2.50   0.012     -.283711   -.0344936
                             |
                       _cons |   2.532379   .1455359    17.40   0.000     2.247047    2.817712
                ------------------------------------------------------------------------------



                • #9
                  Sinem:
                  you actually used different estimators, so it is no wonder that you obtained different results.
                  That said, you should check via -hausman- whether the fixed- or random-effects specification fits your data better.
                  Since you're dealing with a large-N, small-T panel dataset, I would consider -xtreg- as the first choice and switch to pooled OLS only if there is no evidence of a panel-wise effect.
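
                  A minimal sketch of that workflow on an example dataset follows. The variables and the stored-estimate names (fe_est, re_est) are stand-ins chosen for illustration, not the ones from the post above. -xttest0- is the Breusch-Pagan LM test for a panel-wise effect after -xtreg, re-.

                  ```stata
                  * Illustrative only; not the poster's data.
                  webuse nlswork, clear
                  xtset idcode year

                  * Fixed effects with default standard errors.
                  xtreg ln_wage tenure age i.year, fe
                  estimates store fe_est

                  * Random effects with default standard errors,
                  * followed by the LM test of RE against pooled OLS.
                  xtreg ln_wage tenure age i.year, re
                  xttest0
                  estimates store re_est

                  * Hausman test: consistent estimator first, efficient one second.
                  hausman fe_est re_est
                  ```
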
                  Kind regards,
                  Carlo
                  (Stata 19.0)



                  • #10
                    Originally posted by Jeff Wooldridge
                    There's no difference between including industry dummy variables and using industry fixed effects. They produce numerically identical results. Your final command does include industry fixed effects and clusters at the firm level (because, I trust, it is firm-level panel data).
                    Dear Jeff, would you still make this statement nowadays? I am asking because on a similar issue (https://www.statalist.org/forums/for...an-issue/page2 #20) I asked whether pooled OLS with dummies for industry and fiscal year would lead to the same results as FE, and you replied as follows:
                    1. Your ability to keep a time-invariant variable while adding fixed effects "by hand" is an illusion. You should not be doing this. There is only one true fixed effects estimator. xtreg, fe does it properly, as Carlo emphasized. By putting in the dummies "by hand" you are deluding yourself. Stata is simply dropping variables until there is no collinearity left. From an identification perspective, you cannot estimate coefficients on the time-constant variables.
                    Am I overlooking something (most definitely yes, but I would very much like to know what it is), or are the two statements at odds? Thank you in advance for your clarification.
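
                    The behaviour described in the quoted reply can be reproduced on an example dataset. This is only a sketch: race is assumed not to vary within idcode, idcode stands in for the unit identifier, and the sample is cut down so that the unit dummies stay manageable. Which terms Stata omits in the second regression depends on its collinearity handling, so the output should be inspected rather than taken at face value.

                    ```stata
                    * Illustrative only: race is time-invariant within idcode.
                    webuse nlswork, clear
                    keep if idcode <= 50
                    xtset idcode year

                    * Within estimator: the time-invariant regressor is omitted.
                    xtreg ln_wage tenure i.race, fe

                    * Unit dummies "by hand": Stata omits collinear terms and may
                    * still display race coefficients that are not identified.
                    regress ln_wage tenure i.race i.idcode
                    ```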



                    • #11
                      I have several questions. I have searched everywhere but am still confused. Thank you in advance.
                      1. I have firm-level panel data with some time-invariant variables. The Hausman test suggests RE. Because of autocorrelation and heteroskedasticity, I use vce(cluster). But my prof said that the latter defeats the former. In many posts, I see comments suggesting vce(cluster, robust) to account for this problem, so I don't know how to deal with this issue. Could anyone clarify the contradiction between the two?
                      2. Moreover, I was recommended to add industry fixed effects. As far as I understand from this thread, there is no need because my regression has firm-level variables.
                      3. What would be the proper model in this case?
                      My previous version is: xtreg roa firmfundamentals L.CSR y20 L.CSR_y20 (interaction term) ,re vce(cluster sector)



                      • #12
                        Minh:
                        1) as expected, due to demeaning, the -fe- estimator wipes out time-invariant variables. Not sure I got your prof.'s comment right here. If you detect heteroskedasticity and/or autocorrelation, you should invoke non-default standard errors (the -robust- or -vce(cluster)- options will do the very same job). The issue then is that -hausman- does not support non-default standard errors (and you cannot go default for -hausman- and then impose non-default standard errors after the -hausman- verdict); hence, you should rely on the community-contributed module -xtoverid-, which, being glorious but a bit old-fashioned, does not support -fvvarlist- notation for categorical variables and interactions (you can try to prefix your code with -xi:- and see what happens).
                        Hence, I do not understand where the contradiction lies here: if you do have heteroskedasticity and/or autocorrelation and you go with default standard errors, the standard errors and related statistics will be unreliable.
                        2) Industry fixed effects will be wiped out by the -fe- estimator if firms remain in the same industry over the whole T dimension of your panel dataset; conversely, if you go -re-, a coefficient for this predictor, even if time-invariant, will be returned;
                        3) the idea of the right model is, unfortunately, only an idea. The best approach is to give a fair and true view of the data generating process under investigation; the literature in your research field can be a great support in this respect.
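
                        Points 1) and 2) can be sketched on an example dataset. The stand-ins are assumptions for illustration: idcode is the panel identifier and race is a time-invariant categorical regressor, playing the role of a constant industry code. The first two commands should report the same standard errors, since in -xtreg- vce(robust) is treated as clustering on the panel variable.

                        ```stata
                        * Illustrative only: race ~ a time-invariant category.
                        webuse nlswork, clear
                        xtset idcode year

                        * Point 1: robust and cluster-on-panel standard errors under -re-.
                        xtreg ln_wage tenure age i.race, re vce(robust)
                        xtreg ln_wage tenure age i.race, re vce(cluster idcode)

                        * Point 2: -re- returns coefficients on the time-invariant
                        * regressor (above), while -fe- omits it (below).
                        xtreg ln_wage tenure age i.race, fe vce(cluster idcode)
                        ```
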
                        Kind regards,
                        Carlo
                        (Stata 19.0)

