Panel data with fixed effects

Jan Geurst

Join Date: May 2018
Posts: 18

26 May 2018, 02:58

Hi,

I'm running a regression on panel data and want to include several fixed effects: firm-specific, country-specific, and industry-specific. When I include firm-specific fixed effects, the R-squared increases sharply but my independent variables become insignificant. When I include only country- and industry-specific effects, the independent variables are significant, but the R-squared is lower.

Can someone explain why the R-squared is so high when I use firm-specific effects? And would it be reasonable to rely only on the model with country and industry fixed effects?

Code:

reghdfe bda announcement_eligible l_lnassets l_roa, absorb(indus incorp) vce(cluster c)
(converged in 9 iterations)

HDFE Linear regression                            Number of obs   =      3,944
Absorbing 2 HDFE groups                           F(   3,    492) =       5.04
Statistics robust to heteroskedasticity           Prob > F        =     0.0019
                                                  R-squared       =     0.4422
                                                  Adj R-squared   =     0.4317
                                                  Within R-sq.    =     0.2642
Number of clusters (c)       =        493         Root MSE        =     0.3304

                                             (Std. Err. adjusted for 493 clusters in c)
---------------------------------------------------------------------------------------
                      |               Robust
                  bda |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------------+----------------------------------------------------------------
announcement_eligible |   .0797131   .0234105     3.41   0.001     .0337163      .12571
           l_lnassets |  -.0348046   .0159292    -2.18   0.029    -.0661023   -.0035069
                l_roa |  -3.438988   1.468924    -2.34   0.020    -6.325126   -.5528497
---------------------------------------------------------------------------------------

Absorbed degrees of freedom:
------------------------------------------------------------------------+
          Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     |
----------------------+-------------------------------------------------|
                indus |           56              56              0     |
               incorp |           15              16              1     |
------------------------------------------------------------------------+

Code:

reghdfe bda announcement_eligible l_lnassets l_roa, absorb(c indus incorp) vce(cluster c)
(converged in 3 iterations)

HDFE Linear regression                            Number of obs   =      3,944
Absorbing 3 HDFE groups                           F(   3,    492) =       0.93
Statistics robust to heteroskedasticity           Prob > F        =     0.4281
                                                  R-squared       =     0.8968
                                                  Adj R-squared   =     0.8796
                                                  Within R-sq.    =     0.0389
Number of clusters (c)       =        493         Root MSE        =     0.1521

                                             (Std. Err. adjusted for 493 clusters in c)
---------------------------------------------------------------------------------------
                      |               Robust
                  bda |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------------+----------------------------------------------------------------
announcement_eligible |    .004275   .0071644     0.60   0.551    -.0098017    .0183516
           l_lnassets |  -.0143254   .0292794    -0.49   0.625    -.0718535    .0432027
                l_roa |  -.8736434   .5363002    -1.63   0.104    -1.927365    .1800779
---------------------------------------------------------------------------------------

Absorbed degrees of freedom:
------------------------------------------------------------------------+
          Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     |
----------------------+-------------------------------------------------|
                    c |            0             493            493 *   |
                indus |           55              56              1     |
               incorp |           15              16              1     |
------------------------------------------------------------------------+
* = fixed effect nested within cluster; treated as redundant for DoF computation


Amin Sofla

Join Date: May 2018
Posts: 67

26 May 2018, 21:32

The discussion is being continued. See the link for the original question: https://www.statalist.org/forums/for...-effects-model

Originally posted by Jan Geurst:

Can someone explain to me why the r-squared is high when I use firm-specific effect.

In your case, I speculate that the problem arises because your fixed effects are most likely strongly correlated with each other (near multicollinearity). To see how near multicollinearity can affect the adjusted R-squared, consider the simulation below (see here for the source):

Code:

* Clear the old data from memory
clear
* Set the number of observations to generate to 1000
set obs 1000
set seed 1234
* Generate a positive explanatory variable.
gen x = abs(rnormal())
* Imagine we are interested in the coefficient on x.
* Next, we create correlated control variables
gen z1 = x^2 + rnormal()*10
gen z2 = x^1.75 + rnormal()*10
gen y  = 0.001*x + 0.03*z1 + 0.02*z2 + 0.01*rnormal()
// Now, we run a regression without controls
reg y x
* adjusted R-squared often is quite low.
* Next, we add controls
reg y x z1 z2
* Please see the sharp increase in the adjusted R-squared.

In addition, in your second output the adjusted R-squared is high while the ‘within’ R-squared is low. It is likely that Stata drops the correlated fixed effect. To check this, try

Code:

xtreg y x i.ind_id i.cnt_id, fe vce(robust)

where ind_id is the industry id and cnt_id is the country id.

In addition, you should examine the ‘Absorbed degrees of freedom’ table in your results. You did not share a sample of your data, so I have created the following demonstration dataset. (Note: mimicking real panel data is often quite challenging, so I stick to a very naive example.)
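The gap between the overall and the ‘within’ R-squared can be illustrated with a small simulation. This is only a sketch under stated assumptions: the variable names (firm, alpha, x, y) and the parameter values are made up for illustration and are not taken from your data. The firm-specific intercept alpha is deliberately given a much larger variance than the noise term.

Code:

```stata
* Hypothetical simulated panel: 100 firms observed for 10 years each
clear
set seed 2018
set obs 100
gen firm = _n
* Firm-specific intercept with a large variance (assumption)
gen alpha = 5*rnormal()
expand 10
bysort firm: gen year = _n
gen x = rnormal()
* x has only a small effect; most of the variation in y comes from alpha
gen y = 0.1*x + alpha + rnormal()
xtset firm year
* Reports the within R-squared (firm means removed)
xtreg y x, fe
* Same slope estimate, but the R-squared also counts the firm dummies
areg y x, absorb(firm)
```

Both commands should give the same coefficient on x. The within R-squared from xtreg should be small, while the R-squared from areg should be large, because the firm dummies alone explain most of the variation in y.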

Code:

clear all
set obs 500
set seed 1234
gen firm_id = _n
gen ind_id2 = runiformint(1,10)+ (firm_id/100)
gen ind_id3 =round(ind_id2,1)
egen ind_id = group(ind_id3)
drop ind_id2 ind_id3
gen cnt_id2 = runiformint(1,10) + (firm_id/100)
gen cnt_id3 =round(cnt_id2,1)
egen cnt_id = group(cnt_id3)
drop cnt_id2 cnt_id3
gen y= (firm_id^3+ rnormal())/1000
gen x =  y*0.05 + rnormal()*1000
gen z1= firm_id^0.2+ rnormal()
expand 20
bysort firm_id: gen year=_n+1998
sort firm_id year
by firm_id year: replace x = x + runiformint(-x+0.02,x)
by firm_id year: replace y = y + runiformint(-y+0.02,y)
by firm_id year: replace z1 = z1 + runiformint(-z1+0.05,z1)
drop if y==.
drop if x==.
xtset firm_id year
order year firm_id y x ind_id cnt_id
label variable firm_id "Firm id"
label variable ind_id "Industry id"
label variable cnt_id "Country id"
label variable y "Dependent Variable"
label variable x "Independent Variable"
label variable z1 "A Control Variable"
label variable year "year"
label data "Demonstration Datasets for the Panel Data"
ssc install univar
save "demodata.dta", replace

We check the descriptive statistics:

Code:

use "demodata.dta", clear
tab year
univar year y x firm_id ind_id cnt_id
pwcorr y x z1 firm_id ind_id cnt_id

Next, we run the regressions:

Code:

reghdfe y x z1 , absorb(ind_id cnt_id) vce(cluster firm_id)
reghdfe y x z1, absorb(firm_id ind_id cnt_id)
xtreg y x z1 i.ind_id i.cnt_id, fe vce(robust)

Compare across the models (1) the coefficient on x, (2) the R-squared and the within R-squared, and (3) the absorbed degrees of freedom.
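One possible way to put the two reghdfe models side by side is sketched below. It assumes that the demonstration data were saved as demodata.dta and that reghdfe is installed. The stored names m_ind_cnt and m_firm are arbitrary, and I cluster both models on firm_id so that the standard errors are comparable.

Code:

```stata
* Assumes demodata.dta exists in the working directory
use "demodata.dta", clear
* Model with industry and country fixed effects only
quietly reghdfe y x z1, absorb(ind_id cnt_id) vce(cluster firm_id)
estimates store m_ind_cnt
* Model that also absorbs the firm fixed effect
quietly reghdfe y x z1, absorb(firm_id ind_id cnt_id) vce(cluster firm_id)
estimates store m_firm
* Coefficients, standard errors, and fit statistics in one table
estimates table m_ind_cnt m_firm, b(%9.4f) se(%9.4f) stats(N r2 r2_a r2_within)
```

The absorbed degrees of freedom are not part of this table, so they still have to be read from the individual reghdfe outputs (drop quietly to see them).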

Originally posted by Jan Geurst:

And would it be logical to only look at my model including country and industry fixed effects?

Theoretically speaking, I do not think so. The reason is that the firm-specific fixed effect might be one of the main drivers of your dependent variable. Instead, you might want to drop the industry-specific fixed effect. Alternatively, you could review the prior research in your field and follow its design.
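Before deciding which fixed effects to keep, it may help to check whether industry and country ever change within a firm. If they do not, the firm fixed effect already absorbs them, and adding them separately brings no new information. A minimal sketch on the demonstration data follows; the flag names ind_varies and cnt_varies are made up for illustration, and with your own data you would substitute your firm, industry, and country identifiers.

Code:

```stata
* Assumes demodata.dta exists in the working directory
use "demodata.dta", clear
* Flag firms whose industry id is not constant over time
bysort firm_id (ind_id): gen byte ind_varies = ind_id[1] != ind_id[_N]
* Flag firms whose country id is not constant over time
bysort firm_id (cnt_id): gen byte cnt_varies = cnt_id[1] != cnt_id[_N]
* A column of zeros means the id is nested within firm
tabulate ind_varies
tabulate cnt_varies
```

In the demonstration data both ids are generated before the expand step, so they are constant within each firm by construction.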

Last edited by Amin Sofla; 26 May 2018, 21:59.
