  • Fixed effect Regression

    Hello,

    I am new to Stata and Econometrics.
    Which assumptions should be tested before using FE?
    I already checked the normality of the residuals and, since they are not normally distributed, I took logs.
    This changes the results of my analysis. Is it necessary to take logs?
    Also, I read something about stationarity (xtserial), and my results are significant. How can I resolve this?


    Could someone give me some recommendations?

    Thanks in advance!
    Last edited by Larissa Staal; 25 Jul 2022, 09:43.

  • #2
    Larissa:
    welcome to this forum.
    1) and 2) There is no need to log if the only reason for doing so is non-normality of the residual distribution. Conversely, if you detect heteroskedasticity and/or autocorrelation, you can safely invoke -robust- or -vce(cluster panelid)- (unlike under -regress-, the two options do the very same job under -xtreg-) to deal with these nuisances. That said, with such a limited number of panels the non-default standard errors can be inaccurate. Hence, you should compare the standard errors under the default and non-default assumptions.
    3) With a short panel dataset (N>T) you should not worry about stationarity.
    4) In your second code you logged both sides of your regression equation. Beware that this way you have a log-linear model for some variables and a log-log one for others: their interpretations are quite different.
    5) I would test whether the functional form of your regressand is correctly specified, by mimicking the following toy example:
    Code:
    . use "https://www.stata-press.com/data/r17/nlswork.dta"
    (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
    
    . xtreg ln_wage c.age##c.age, fe
    
    Fixed-effects (within) regression               Number of obs     =     28,510
    Group variable: idcode                          Number of groups  =      4,710
    
    R-squared:                                      Obs per group:
         Within  = 0.1087                                         min =          1
         Between = 0.1006                                         avg =        6.1
         Overall = 0.0865                                         max =         15
    
                                                    F(2,23798)        =    1451.88
    corr(u_i, Xb) = 0.0440                          Prob > F          =     0.0000
    
    ------------------------------------------------------------------------------
         ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             age |   .0539076   .0028078    19.20   0.000     .0484041    .0594112
                 |
     c.age#c.age |  -.0005973   .0000465   -12.84   0.000    -.0006885   -.0005061
                 |
           _cons |    .639913   .0408906    15.65   0.000     .5597649    .7200611
    -------------+----------------------------------------------------------------
         sigma_u |   .4039153
         sigma_e |  .30245467
             rho |  .64073314   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0: F(4709, 23798) = 8.74                 Prob > F = 0.0000
    
    . predict fitted, xb
    (24 missing values generated)
    
    . g sq_fitted=fitted^2
    (24 missing values generated)
    
    . xtreg ln_wage fitted sq_fitted, fe
    
    Fixed-effects (within) regression               Number of obs     =     28,510
    Group variable: idcode                          Number of groups  =      4,710
    
    R-squared:                                      Obs per group:
         Within  = 0.1092                                         min =          1
         Between = 0.1033                                         avg =        6.1
         Overall = 0.0881                                         max =         15
    
                                                    F(2,23798)        =    1457.96
    corr(u_i, Xb) = 0.0467                          Prob > F          =     0.0000
    
    ------------------------------------------------------------------------------
         ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
          fitted |   2.569185    .476861     5.39   0.000     1.634507    3.503863
       sq_fitted |    -.47432   .1440324    -3.29   0.001    -.7566326   -.1920074
           _cons |  -1.290258   .3930351    -3.28   0.001    -2.060631   -.5198837
    -------------+----------------------------------------------------------------
         sigma_u |    .403403
         sigma_e |  .30238578
             rho |  .64025357   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0: F(4709, 23798) = 8.72                 Prob > F = 0.0000
    
    . test sq_fitted
    
     ( 1)  sq_fitted = 0
    
           F(  1, 23798) =   10.84
                Prob > F =    0.0010
    
    .
    Since the outcome of -test- on -sq_fitted- reaches statistical significance, the model is misspecified (that is, it needs more predictors and/or interactions to give a fair and true view of the data generating process).
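
The same check can be written as a short do-file, which makes it easier to re-run after changing the specification. What follows is only a sketch: adding -ttl_exp- and -tenure- (both variables in the nlswork data) is an assumed next step chosen for illustration, not a specification recommended in this thread, and -fitted2- and -sq_fitted2- are hypothetical variable names.

```stata
* Sketch: repeat the functional-form check on a richer specification.
use "https://www.stata-press.com/data/r17/nlswork.dta", clear

* Assumed extension of the toy model (illustration only)
xtreg ln_wage c.age##c.age c.ttl_exp##c.ttl_exp c.tenure##c.tenure, fe

* Linear prediction and its square (hypothetical variable names)
predict double fitted2, xb
generate double sq_fitted2 = fitted2^2

* Auxiliary regression: a significant squared term hints at misspecification
xtreg ln_wage fitted2 sq_fitted2, fe
test sq_fitted2
```

If -test- still rejects, the specification probably needs further work; if it no longer rejects, the check gives no evidence against the functional form, which is not the same as proving it correct.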
    Kind regards,
    Carlo
    (Stata 19.0)
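
The advice in 1) and 2) to compare default and non-default standard errors could be carried out along these lines. This is a minimal sketch on the same nlswork toy data, not on the original poster's dataset; the stored-estimate names -fe_default- and -fe_cluster- are hypothetical.

```stata
* Sketch: compare default and cluster-robust standard errors under -xtreg, fe-.
use "https://www.stata-press.com/data/r17/nlswork.dta", clear

* Default (conventional) standard errors
xtreg ln_wage c.age##c.age, fe
estimates store fe_default

* Standard errors clustered on the panel identifier
xtreg ln_wage c.age##c.age, fe vce(cluster idcode)
estimates store fe_cluster

* Coefficients are identical across the two models; only the standard errors differ
estimates table fe_default fe_cluster, b(%9.4f) se(%9.4f)
```

With the poster's own data, -idcode- would be replaced by the panel identifier declared in -xtset-.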
