Firm fixed effects

Theo Jansen

Join Date: Oct 2019

Posts: 6
#1


30 Dec 2019, 05:38

Hi Statalist,

I have a question about firm-fixed effects.

My regression looks like:

Dependent var = independent var + controls

My dependent var is a continuous variable, and my independent var is a dummy variable. This dummy variable can, of course, be 1 or 0. It can go from 1 to 0 in consecutive years, but NOT from 0 to 1.

I declared the data to be panel data with xtset CIK fyear, where CIK is the company identifier.

My research supervisor said that when I include firm-fixed effects, Stata estimates the B1 coefficient using only those firms that go from 1 to 0 in consecutive years (because x is 'constant' in all other firms).

Is this true, and can anyone elaborate on this so that I will be able to defend this story more strongly?
If you need more information please feel free to ask...

Clyde Schechter

Join Date: Apr 2014
Posts: 30120

30 Dec 2019, 10:18

Well, your supervisor is partially right. The following output creates a toy data set similar to what you describe, and carries out the fixed effects regression twice, the second time excluding all CIKs where x never changes:

Code:

. //  CREATE TOY DATA SET TO ILLUSTRATE THE PROBLEM
. clear*

.
. set obs 10
number of observations (_N) was 0, now 10

. gen CIK = _n

. expand 10
(90 observations created)

. by CIK, sort: gen fyear = 2000 + _n

. sort CIK fyear

.
. set seed 1234

. gen dv = rnormal()

.
. gen x = runiformint(0, 1)

. by CIK (fyear): replace x = sum(x)
(74 real changes made)

. replace x = !x
(100 real changes made)

.
. xtset CIK fyear
       panel variable:  CIK (strongly balanced)
        time variable:  fyear, 2001 to 2010
                delta:  1 unit

.
. //  DO THE FIXED EFFECTS REGRESSION
. xtreg dv i.x, fe

Fixed-effects (within) regression               Number of obs     =        100
Group variable: CIK                             Number of groups  =         10

R-sq:                                           Obs per group:
     within  = 0.0004                                         min =         10
     between = 0.0133                                         avg =       10.0
     overall = 0.0011                                         max =         10

                                                F(1,89)           =       0.04
corr(u_i, Xb)  = 0.0364                         Prob > F          =     0.8468

------------------------------------------------------------------------------
          dv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         1.x |  -.0571561   .2949952    -0.19   0.847    -.6433052    .5289931
       _cons |   .0169482   .1086287     0.16   0.876    -.1988948    .2327912
-------------+----------------------------------------------------------------
     sigma_u |   .3114797
     sigma_e |  .97838833
         rho |  .09202597   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(9, 89) = 1.01                       Prob > F = 0.4365

.
. //  REPEAT IT ACTUALLY EXCLUDING THOSE CIK'S
. //  WHERE X NEVER CHANGES
. by CIK (x), sort: gen never_changes = x[1] == x[_N]

. xtreg dv i.x if !never_changes, fe

Fixed-effects (within) regression               Number of obs     =         60
Group variable: CIK                             Number of groups  =          6

R-sq:                                           Obs per group:
     within  = 0.0007                                         min =         10
     between = 0.3635                                         avg =       10.0
     overall = 0.0001                                         max =         10

                                                F(1,53)           =       0.04
corr(u_i, Xb)  = -0.1552                        Prob > F          =     0.8432

------------------------------------------------------------------------------
          dv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         1.x |  -.0571561   .2875714    -0.20   0.843    -.6339513    .5196391
       _cons |  -.0644498   .1450582    -0.44   0.659    -.3553996       .2265
-------------+----------------------------------------------------------------
     sigma_u |  .24157018
     sigma_e |  .95376641
         rho |  .06028363   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(5, 53) = 0.63                       Prob > F = 0.6806

You can see that the coefficient of x is the same either way, but the constant term, and the standard errors (and the CIs, t statistics, and p-values that derive from them) are different. The estimates of sigma_u and sigma_e are also different. Note also that Stata tells you it is using all 10 groups in the first regression and only 6 in the second one.

The groups with non-changing x are not excluded from the original analysis, but they make no contribution to the estimated coefficients (other than the constant term). They do, however, affect the standard error calculations.
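To make the "no contribution" point concrete, here is the within estimator for a single regressor. The notation is an assumption of this sketch and does not appear in the Stata output: y stands for dv, i indexes CIK, t indexes fyear, and T_i is the number of years observed for CIK i.

```latex
\hat{\beta}_{FE}
  = \frac{\sum_{i}\sum_{t} (x_{it} - \bar{x}_{i})(y_{it} - \bar{y}_{i})}
         {\sum_{i}\sum_{t} (x_{it} - \bar{x}_{i})^{2}},
\qquad
\bar{x}_{i} = \frac{1}{T_i}\sum_{t} x_{it},
\quad
\bar{y}_{i} = \frac{1}{T_i}\sum_{t} y_{it}
```

If x never changes within CIK i, then x_it equals its CIK mean in every year, so every term for that CIK is zero in both the numerator and the denominator.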

This makes sense: the fixed-effects analysis estimates the within-CIK effect of x on dv. But if x never changes in a CIK, then there is no information in the data about how x affects dv in that CIK: it could be anything! Nevertheless, these observations still provide information about average levels of dv (hence their contribution to the constant term) and variation of dv within and between CIKs (hence their contribution to the standard errors and the sigma_u and sigma_e estimates).
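If you want to verify this by hand, a minimal sketch is to rebuild the toy data, subtract the CIK means yourself, and run plain OLS on the demeaned variables. The variable names dv_bar, x_bar, dv_w, x_w and x_sd and the scalar b_fe are introduced here for illustration only.

```stata
//  REBUILD THE TOY DATA SET
clear
set obs 10
gen CIK = _n
expand 10
by CIK, sort: gen fyear = 2000 + _n
sort CIK fyear
set seed 1234
gen dv = rnormal()
gen x = runiformint(0, 1)
by CIK (fyear): replace x = sum(x)
replace x = !x

//  WITHIN TRANSFORMATION: SUBTRACT THE CIK MEAN FROM EACH VARIABLE
by CIK: egen double dv_bar = mean(dv)
by CIK: egen double x_bar  = mean(x)
gen double dv_w = dv - dv_bar
gen double x_w  = x - x_bar

//  IN A CIK WHERE x NEVER CHANGES, THE DEMEANED x IS ZERO IN EVERY YEAR
by CIK: egen double x_sd = sd(x)
assert x_w == 0 if x_sd == 0

//  FIXED-EFFECTS COEFFICIENT FROM -xtreg-
xtset CIK fyear
xtreg dv x, fe
scalar b_fe = _b[x]

//  OLS ON THE DEMEANED DATA SHOULD GIVE THE SAME COEFFICIENT
regress dv_w x_w, noconstant
assert reldif(_b[x_w], b_fe) < 1e-6
```

The coefficients should agree, but the standard errors from -regress- will not match those from -xtreg, fe-, because -regress- does not know that 10 CIK means were estimated and so uses too many residual degrees of freedom.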

Last edited by Clyde Schechter; 30 Dec 2019, 10:24.


Theo Jansen

Join Date: Oct 2019

Posts: 6
#3

20 Jan 2020, 11:16

Hi Clyde,

This is very, very helpful! It perfectly answers the question I raised above.

In my regression model I also have several (approximately 10) control variables. Can you tell me the difference with regard to those control variables in (1) running a regression without firm-fixed effects, and (2) running a regression with firm-fixed effects?

The reason I ask is the following: If the firm-fixed effects equation only looks at the changes in the x with regard to estimating the B1 coefficient, what is true about the coefficients on the control variables (and their significance)? In my analysis, the coefficients are different between test (1) and test (2). Can you tell me how I can explain why the coefficients on control variables are changing? In both tests, the overall sample is exactly the same. Thus, the only difference is the inclusion of firm-fixed effects.
Clyde Schechter

Join Date: Apr 2014

Posts: 30120
#4

20 Jan 2020, 11:39

The reason I ask is the following: If the firm-fixed effects equation only looks at the changes in the x with regard to estimating the B1 coefficient, what is true about the coefficients on the control variables (and their significance)?

Mathematically there is no difference between a "control" variable and x. They are just variables on the right-hand side of the regression equation and they are all handled in exactly the same way. We tend to think of them as being different things, but that is just about the language we will use in explaining our results, and what we will emphasize and what we will de-emphasize. So the coefficients of the "control" variables (I prefer to call them covariates, because in observational data you are not controlling anything, you are simply making an adjustment for their effects) also represent the within-panel effects, just like B1 does.
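In symbols, as a sketch in notation that is assumed here and not taken from your model (z_1, ..., z_K for the covariates, alpha_i for the firm effect), the fixed-effects model and its within-transformed version are:

```latex
y_{it} = \alpha_i + \beta_1 x_{it} + \sum_{k=1}^{K} \gamma_k z_{k,it} + \varepsilon_{it}

y_{it} - \bar{y}_{i}
  = \beta_1 (x_{it} - \bar{x}_{i})
  + \sum_{k=1}^{K} \gamma_k (z_{k,it} - \bar{z}_{k,i})
  + (\varepsilon_{it} - \bar{\varepsilon}_{i})
```

Every right-hand-side variable has its firm mean subtracted in the same way, so each gamma_k is identified only from within-firm variation in its covariate, exactly as beta_1 is.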

In my analysis, the coefficients are different between test (1) and test (2). Can you tell me how I can explain why the coefficients on control variables are changing? In both tests, the overall sample is exactly the same. Thus, the only difference is the inclusion of firm-fixed effects.

Well, the only thing that really needs explaining is why you think they shouldn't change. There is no reason that the within-firm effects of a variable (estimated in a fixed-effects model) should be the same as the between-firm effects (which, averaged in with the within-firm effects, are what you see in the pooled OLS regression). In fact, they don't have to be close, or even have the same sign. It's easy to see with the following demonstration:

Code:

clear
set obs 5
gen panel_id = _n
expand 2

set seed 1234

by panel_id, sort: gen y = 4*panel_id - _n + 3 + rnormal(0, 0.5)
by panel_id: gen x = panel_id + _n

xtset panel_id

xtreg y x, fe

regress y x

//  GRAPH THE DATA TO SHOW WHAT'S HAPPENING
separate y, by(panel_id)

graph twoway connect y? x || lfit y x
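To put the within/between contrast side by side, a small extension of the demonstration above adds the between estimator and tabulates all three fits. This is only a sketch; the stored-estimate names fe_est, be_est and ols_est are mine.

```stata
//  SAME DEMONSTRATION DATA AS ABOVE
clear
set obs 5
gen panel_id = _n
expand 2
set seed 1234
by panel_id, sort: gen y = 4*panel_id - _n + 3 + rnormal(0, 0.5)
by panel_id: gen x = panel_id + _n
xtset panel_id

//  WITHIN (FIXED-EFFECTS) ESTIMATOR
xtreg y x, fe
estimates store fe_est

//  BETWEEN ESTIMATOR: REGRESSION ON THE PANEL MEANS
xtreg y x, be
estimates store be_est

//  POOLED OLS: MIXES WITHIN AND BETWEEN VARIATION
regress y x
estimates store ols_est

//  COEFFICIENTS AND STANDARD ERRORS SIDE BY SIDE
estimates table fe_est be_est ols_est, b(%9.3f) se(%9.3f)
```

By construction, y falls by about 1 when x rises by 1 within a panel, while across panels y rises by about 4 per unit of x. So you should expect a negative slope in the fe column and positive slopes in the be and pooled columns; the exact values depend on the random draws.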

More generally, whenever you add variables to a model, the coefficients of the variables that were previously there can change, and can change by any amount and in any way. Indeed, if this weren't the case, there would be no point in including covariates at all: the whole purpose of doing that is to attempt to separate the "pure" effect of a variable from effects that result from its correlation with something else.
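One way to see firm fixed effects as "added variables" is the dummy-variable form of the model: including an indicator for every panel in an ordinary regression gives the same slope as -xtreg, fe-. The sketch below checks this on the demonstration data; the scalar names b_fe and b_lsdv are introduced here for illustration.

```stata
//  SAME DEMONSTRATION DATA AS ABOVE
clear
set obs 5
gen panel_id = _n
expand 2
set seed 1234
by panel_id, sort: gen y = 4*panel_id - _n + 3 + rnormal(0, 0.5)
by panel_id: gen x = panel_id + _n
xtset panel_id

//  FIXED-EFFECTS ESTIMATE
xtreg y x, fe
scalar b_fe = _b[x]

//  SAME MODEL WITH EXPLICIT PANEL INDICATORS ADDED AS REGRESSORS
regress y x i.panel_id
scalar b_lsdv = _b[x]

//  THE TWO SLOPES SHOULD AGREE UP TO NUMERICAL PRECISION
display "fe: " b_fe "   dummy-variable regression: " b_lsdv
assert reldif(b_fe, b_lsdv) < 1e-6
```

Seen this way, moving from pooled OLS to fixed effects is just a case of adding a set of variables (the panel indicators) to the model, so the coefficients of everything already in the model are free to change.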
Theo Jansen

Join Date: Oct 2019

Posts: 6
#5

20 Jan 2020, 13:14

Hi Clyde,

Thanks again for your quick and extensive response to my question! This really helps.

I thought that in the firm-fixed effects equation, the only thing that changed was that, for the B1 coefficient, the model only looks at observations where something changes in the x variable, and that for all control variables/covariates the model just did the same as in the equation without firm-fixed effects. But that is not true, then!

Thanks again!
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17714
#6

21 Jan 2020, 02:58

Theo:
as an aside to Clyde's helpful advice, you may want to take a look at the Methods and formulas section for -xtreg, fe- in the -xtreg- entry of the Stata .pdf manual.

Kind regards,
Carlo
(Stata 19.0)
