  • Panel data with fixed effects

    Hi,

    I'm running a regression on panel data and I want to use several fixed effects: firm-specific, country-specific, and industry-specific. When I include firm-specific fixed effects, the R-squared increases sharply but my independent variables are insignificant. When I include only country- and industry-specific effects, the independent variables become significant but the R-squared is lower.

    Can someone explain to me why the R-squared is high when I use the firm-specific effect? And would it be logical to look only at my model with country and industry fixed effects?


    Code:
    reghdfe bda announcement_eligible l_lnassets l_roa, absorb(indus incorp) vce(cluster c)
    (converged in 9 iterations)
    
    HDFE Linear regression                            Number of obs   =      3,944
    Absorbing 2 HDFE groups                           F(   3,    492) =       5.04
    Statistics robust to heteroskedasticity           Prob > F        =     0.0019
                                                      R-squared       =     0.4422
                                                      Adj R-squared   =     0.4317
                                                      Within R-sq.    =     0.2642
    Number of clusters (c)       =        493         Root MSE        =     0.3304
    
                                                 (Std. Err. adjusted for 493 clusters in c)
    ---------------------------------------------------------------------------------------
                          |               Robust
                      bda |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ----------------------+----------------------------------------------------------------
    announcement_eligible |   .0797131   .0234105     3.41   0.001     .0337163      .12571
               l_lnassets |  -.0348046   .0159292    -2.18   0.029    -.0661023   -.0035069
                    l_roa |  -3.438988   1.468924    -2.34   0.020    -6.325126   -.5528497
    ---------------------------------------------------------------------------------------
    
    Absorbed degrees of freedom:
    ------------------------------------------------------------------------+
              Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     |
    ----------------------+-------------------------------------------------|
                    indus |           56              56              0     |
                   incorp |           15              16              1     |
    ------------------------------------------------------------------------+
    Code:
    reghdfe bda announcement_eligible l_lnassets l_roa, absorb(c indus incorp) vce(cluster c)
    (converged in 3 iterations)
    
    HDFE Linear regression                            Number of obs   =      3,944
    Absorbing 3 HDFE groups                           F(   3,    492) =       0.93
    Statistics robust to heteroskedasticity           Prob > F        =     0.4281
                                                      R-squared       =     0.8968
                                                      Adj R-squared   =     0.8796
                                                      Within R-sq.    =     0.0389
    Number of clusters (c)       =        493         Root MSE        =     0.1521
    
                                                 (Std. Err. adjusted for 493 clusters in c)
    ---------------------------------------------------------------------------------------
                          |               Robust
                      bda |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ----------------------+----------------------------------------------------------------
    announcement_eligible |    .004275   .0071644     0.60   0.551    -.0098017    .0183516
               l_lnassets |  -.0143254   .0292794    -0.49   0.625    -.0718535    .0432027
                    l_roa |  -.8736434   .5363002    -1.63   0.104    -1.927365    .1800779
    ---------------------------------------------------------------------------------------
    
    Absorbed degrees of freedom:
    ------------------------------------------------------------------------+
              Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     |
    ----------------------+-------------------------------------------------|
                        c |            0             493            493 *   |
                    indus |           55              56              1     |
                   incorp |           15              16              1     |
    ------------------------------------------------------------------------+
    * = fixed effect nested within cluster; treated as redundant for DoF computation
    
    .

  • #2
    The discussion is being continued. See the link for the original question: https://www.statalist.org/forums/for...-effects-model



    Originally posted by Jan Geurst
    Can someone explain to me why the r-squared is high when I use firm-specific effect.
    In your case, I suspect the problem arises because your fixed effects are strongly correlated with each other (near multicollinearity). To see the effect of near multicollinearity on the adjusted R-squared, consider the simulation below (see here for the source):
    Code:
    * Clear the old data from memory
    clear
    * Set the number of observations to generate to 1000
    set obs 1000
    set seed 1234
    * Generate a positive explanatory variable.
    gen x = abs(rnormal())
    * Imagine we are interested in the coefficient on x.
    * Next, we create correlated control variables
    gen z1 = x^2 + rnormal()*10
    gen z2 = x^1.75 + rnormal()*10
    gen y  = 0.001*x + 0.03*z1 + 0.02*z2 + 0.01*rnormal()
    // Now, we run a regression without controls
    reg y x
    * adjusted R-squared often is quite low.
    * Next, we add controls
    reg y x z1 z2
    * Please see the sharp increase in the adjusted R-squared.
    In addition, in your second output the adjusted R-squared is high but the within R-squared is low. It is likely that Stata drops the correlated fixed effects. To check this, try:
    Code:
    xtreg y x i.ind_id i.cnt_id, fe vce(robust)
    where ind_id is the industry id and cnt_id is the country id.
    In addition, you should investigate the 'Absorbed degrees of freedom' table in your results. You did not share a sample of your data, so I have created the following demonstration dataset. (Note: it is often quite challenging to mimic real panel data, so I stick to a very naive example.)
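    To see how the overall and the within R-squared can diverge, here is a minimal sketch on simulated data. The variable names (firm, alpha, x, y) and all parameter values are made up for illustration and are not taken from your data; it only uses built-in commands (regress, areg, xtreg).

    ```stata
    * Simulated panel: 100 hypothetical firms observed for 10 years each
    clear
    set seed 2018
    set obs 100
    gen firm = _n
    gen alpha = rnormal(0, 5)          // large firm-specific level (assumed)
    expand 10
    bysort firm: gen year = _n
    gen x = rnormal()
    gen y = alpha + 0.1*x + rnormal()  // x has only a small effect (assumed)
    xtset firm year

    * Pooled OLS: the R-squared reflects x alone
    regress y x
    scalar r2_pooled = e(r2)

    * Firm dummies absorbed: the R-squared also counts variance explained by the dummies
    areg y x, absorb(firm)
    scalar r2_lsdv = e(r2)

    * Within estimator: the within R-squared is computed after demeaning by firm
    xtreg y x, fe
    scalar r2_within = e(r2_w)

    display "pooled = " r2_pooled "   firm dummies = " r2_lsdv "   within = " r2_within
    ```

    In this sketch the R-squared with firm dummies should be far above both the pooled and the within R-squared, because the firm-specific levels account for most of the variation in y while x explains very little of what is left.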

    Code:
    clear all
    set obs 500
    set seed 1234
    gen firm_id = _n
    gen ind_id2 = runiformint(1,10)+ (firm_id/100)
    gen ind_id3 =round(ind_id2,1)
    egen ind_id = group(ind_id3)
    drop ind_id2 ind_id3
    gen cnt_id2 = runiformint(1,10) + (firm_id/100)
    gen cnt_id3 =round(cnt_id2,1)
    egen cnt_id = group(cnt_id3)
    drop cnt_id2 cnt_id3
    gen y= (firm_id^3+ rnormal())/1000
    gen x =  y*0.05 + rnormal()*1000
    gen z1= firm_id^0.2+ rnormal()
    expand 20
    bysort firm_id: gen year=_n+1998
    sort firm_id year
    by firm_id year: replace x = x + runiformint(-x+0.02,x)
    by firm_id year: replace y = y + runiformint(-y+0.02,y)
    by firm_id year: replace z1 = z1 + runiformint(-z1+0.05,z1)
    drop if y==.
    drop if x==.
    xtset firm_id year
    order year firm_id y x ind_id cnt_id
    label variable firm_id "Firm id"
    label variable ind_id "Industry id"
    label variable cnt_id "Country id"
    label variable y "Dependent Variable"
    label variable x "Independent Variable"
    label variable z1 "A Control Variable"
    label variable year "year"
    label data "Demonstration Datasets for the Panel Data"
    ssc install univar
    save "demodata.dta", replace
    We check the descriptive statistics:
    Code:
    use "demodata.dta", clear
    tab year
    univar year y x firm_id ind_id cnt_id
    pwcorr y x z1 firm_id ind_id cnt_id
    Next, we run the regressions:
    Code:
    reghdfe y x z1 , absorb(ind_id cnt_id) vce(cluster firm_id)
    reghdfe y x z1, absorb(firm_id ind_id cnt_id)
    xtreg y x z1 i.ind_id i.cnt_id, fe vce(robust)
    Check (1) the change in the coefficient on x, (2) the change in the R-squared and within R-squared, and (3) the absorbed degrees of freedom.
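    One more check that may help when reading the absorbed degrees of freedom is whether the industry and country codes ever change within a firm. The sketch below builds a small hypothetical panel (the sizes and the flag names ind_varies and cnt_varies are my own choices for illustration) in which both codes are assigned once per firm, and then counts the firms whose codes vary over time.

    ```stata
    * Small hypothetical panel: 50 firms, 5 years each
    clear
    set seed 1234
    set obs 50
    gen firm_id = _n
    gen ind_id = runiformint(1, 5)   // assigned once per firm, so time-invariant
    gen cnt_id = runiformint(1, 3)   // assigned once per firm, so time-invariant
    expand 5
    bysort firm_id: gen year = _n + 1998

    * Flag firms whose industry or country code changes over time
    bysort firm_id (ind_id): gen byte ind_varies = ind_id[1] != ind_id[_N]
    bysort firm_id (cnt_id): gen byte cnt_varies = cnt_id[1] != cnt_id[_N]

    * A count of 0 means both codes are constant within every firm
    count if ind_varies | cnt_varies
    ```

    If the count is zero, the industry and country dummies are linear combinations of the firm dummies, so they cannot be identified separately once firm_id is absorbed. Running the same two bysort lines on your own data would show whether that is the case there.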


    Originally posted by Jan Geurst
    And would it be logical to only look at my model including country and industry fixed effects?
    Theoretically speaking, I do not think so. The firm-specific fixed effect might be one of the main drivers of your dependent variable. Instead, you might want to drop the industry-specific fixed effect. Alternatively, you might want to review the prior research in your field and follow its design.
    Last edited by Amin Sofla; 26 May 2018, 21:59.
