Panel Data Analysis Fixed Effects

Iris Landi

Join Date: Jul 2024
Posts: 3

11 Jul 2024, 09:17

Good morning,

I'm writing my master's thesis and have run into an issue with the inclusion of Year fixed effects in my regression. I'm studying the effect of the Gender Development Index (GDI) on GDP per capita in a panel of 27 countries from 1990 to 2020. However, the GDI coefficient flips sign when I include i.Year, going from a positive to a negative value (and completely changing the policy implications of my analysis).

My "expectations" for the analysis relied on a positive correlation. Why do you think the inclusion of time fixed effects might reverse the sign of the coefficient? Would it make sense, in this specific case, not to include them?

Without Year fixed effects (country fixed effects only):

Code:

xtreg ln_GDPpc GDI GINI INV_r UN_r POP_g HCPI GOV_exp, fe

Fixed-effects (within) regression               Number of obs     =        614
Group variable: Country_ID                      Number of groups  =         27

R-squared:                                      Obs per group:
     Within  = 0.6758                                         min =          6
     Between = 0.2723                                         avg =       22.7
     Overall = 0.3684                                         max =         31

                                                F(7, 580)         =     172.74
corr(u_i, Xb) = -0.4103                         Prob > F          =     0.0000

------------------------------------------------------------------------------
    ln_GDPpc | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         GDI |   6.744834   .5237326    12.88   0.000     5.716191    7.773478
        GINI |  -.0214288   .0043979    -4.87   0.000    -.0300666   -.0127911
       INV_r |   .0253672   .0034532     7.35   0.000     .0185849    .0321494
        UN_r |  -.0620038   .0085478    -7.25   0.000    -.0787922   -.0452153
       POP_g |  -.3070663   .0485502    -6.32   0.000    -.4024218   -.2117107
        HCPI |  -.0123887   .0022354    -5.54   0.000    -.0167792   -.0079981
     GOV_exp |   .0245106   .0043933     5.58   0.000     .0158818    .0331393
       _cons |   1.922476   .5395802     3.56   0.000     .8627065    2.982245
-------------+----------------------------------------------------------------
     sigma_u |  .68725516
     sigma_e |  .30698036
         rho |  .83366721   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(26, 580) = 62.12                    Prob > F = 0.0000

With Year fixed effects:

Code:

xtreg ln_GDPpc GDI GINI INV_r UN_r POP_g HCPI GOV_exp i.Year, fe

Fixed-effects (within) regression               Number of obs     =        614
Group variable: Country_ID                      Number of groups  =         27

R-squared:                                      Obs per group:
     Within  = 0.9097                                         min =          6
     Between = 0.0484                                         avg =       22.7
     Overall = 0.3801                                         max =         31

                                                F(37, 550)        =     149.67
corr(u_i, Xb) = -0.0162                         Prob > F          =     0.0000

------------------------------------------------------------------------------
    ln_GDPpc | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         GDI |  -.1145937   .3549888    -0.32   0.747    -.8118934     .582706
        GINI |    .016365   .0028453     5.75   0.000      .010776    .0219539
       INV_r |   .0049661   .0020244     2.45   0.014     .0009896    .0089425
        UN_r |  -.0106453   .0049719    -2.14   0.033    -.0204114   -.0008792
       POP_g |  -.0181923   .0300618    -0.61   0.545    -.0772423    .0408578
        HCPI |  -.0071328   .0013566    -5.26   0.000    -.0097976    -.004468
     GOV_exp |  -.0067474   .0026602    -2.54   0.011    -.0119728    -.001522
             |
        Year |
       1991  |   .0552928   .0703768     0.79   0.432    -.0829475     .193533
       1992  |   .0339597   .0723705     0.47   0.639    -.1081967     .176116
       1993  |   .0529695   .0722955     0.73   0.464    -.0890396    .1949787
       1994  |   .1105175    .072307     1.53   0.127    -.0315141    .2525491
       
(...)

Thank you in advance for your support!


Clyde Schechter

Join Date: Apr 2014

Posts: 30063
#2

11 Jul 2024, 09:37

Time effects are included when the outcome variable is subject to them, and GDPpc, or its log transform, is such a variable. Not only are there long-term trends in this variable, but there are often yearly shocks of appreciable magnitude. At a minimum, including these in the model reduces the residual variance, thereby improving the efficiency of the model.

But there is also the possibility that the focal independent variable, GDI in your case, might itself be subject to time effects. If so, the GDI variable may serve as a "proxy" for the time effects. The fact that your GDI coefficient changes drastically depending on whether you include the time effects tells us that GDI is, indeed, itself time sensitive, and that the effects of time on GDI are (using language loosely here) parallel to the time effects on GDPpc. I think that the best interpretation of your results is that the apparent effect of GDI in your "timeless" model is the result of this confounding (aka omitted variable bias). The model with time effects demonstrates that the apparent GDI effect is just the consequence of GDI serving as a proxy for time when time is omitted.

If you are looking to establish something that you can plausibly argue is a causal relationship, you have to go with the model that includes time effects here. The "timeless" model is showing you an association that is demonstrably non-causal. Such associations are useful for some purposes, but not others. So ultimately it depends on your research goal. But typically studies like this hope to demonstrate a causal relationship, and the timeless model will not do that here.
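
One way to examine the two points above in Stata is sketched below. It is only an illustration, assuming the data are already declared as a panel with xtset Country_ID Year (that command does not appear in the thread) and reusing the variable names from the original post. The first test asks whether the Year indicators are jointly different from zero in the outcome model; the second regression asks whether GDI itself moves with calendar time within countries.

```stata
* Assumption: the panel has already been declared, e.g. with
*   xtset Country_ID Year

* Outcome model with country and Year fixed effects (as in post #1)
xtreg ln_GDPpc GDI GINI INV_r UN_r POP_g HCPI GOV_exp i.Year, fe

* Joint test that all Year indicators are zero
testparm i.Year

* Does GDI itself vary systematically by year within countries?
xtreg GDI i.Year, fe
testparm i.Year
```

If both joint tests reject, that is consistent with GDI picking up common time effects when i.Year is left out, although the tests alone do not establish it.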

Last edited by Clyde Schechter; 11 Jul 2024, 09:43.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17700
#3

11 Jul 2024, 12:02

Iris:
Clyde gave excellent advice.
On a different note, while your within R-squared is sky-rocketing (0.9097), most of your coefficients do not reach statistical significance.
I would take another look at your regression specification, as it probably suffers from overfitting (see the Cross Validated thread "High R-squared although many insignificant coefficients" on stackexchange.com).
In addition, with 27 panels you should check whether or not cluster-robust standard errors are the way to go.
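
For reference, a minimal sketch of the cluster-robust version of the second model, using the variable names from the original post. Whether 27 clusters are enough for this estimator to behave well is precisely the open question raised above, so this shows only the syntax, not a recommendation.

```stata
* Same specification as in post #1, with standard errors clustered by country
xtreg ln_GDPpc GDI GINI INV_r UN_r POP_g HCPI GOV_exp i.Year, fe vce(cluster Country_ID)
```

Comparing these standard errors with the default ones from the original output gives a rough sense of how much within-country correlation matters here. With more regressors than clusters, Stata may leave the overall F statistic unreported.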

Kind regards,
Carlo
(Stata 19.0)
Iris Landi

Join Date: Jul 2024

Posts: 3
#4

12 Jul 2024, 03:03

Dear @Carlo and @Clyde, I would like to thank you both for your valuable and meaningful feedback.