constant term in xtregar, fe

Giuseppe De Luca

Join Date: Nov 2023

Posts: 3
#1

24 Nov 2023, 07:56

Dear all, I have a question about the treatment of the constant in "xtregar, fe".

This command does not transform the constant term; its point estimate is simply adjusted ex post by dividing by (1-\rho) (see line 474 of xtreg.ado). In balanced and equally spaced panels, I see that this is consistent with the constant term in equation (16b) of Bhargava et al. (1982), but why not apply the same adjustment to the underlying standard error? Another problem is that this approach also ignores the variability of the transformed constant when the panel is unbalanced and unequally spaced. My intuition is that these issues can be solved by applying the C_i(\rho) and within transformations to all variables of the model (including the constant term). What do you think?
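
To make the standard-error point concrete, here is a minimal sketch that treats \rho as known (an assumption: it ignores the sampling variability of the estimated \rho). The symbols \hat{c} (intercept of the regression on the transformed data) and \hat{\alpha} (reported constant) are labels introduced only for this illustration:

```latex
\hat{\alpha} = \frac{\hat{c}}{1-\rho}
\quad\Longrightarrow\quad
\operatorname{Var}(\hat{\alpha}) = \frac{\operatorname{Var}(\hat{c})}{(1-\rho)^{2}},
\qquad
\operatorname{se}(\hat{\alpha}) = \frac{\operatorname{se}(\hat{c})}{1-\rho}.
```

Under that assumption, rescaling the point estimate without rescaling its standard error changes the t statistic by a factor of 1/(1-\rho).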

Thank you in advance for your feedback,
Giuseppe

Andrew Musau

Join Date: Oct 2014
Posts: 10260

24 Nov 2023, 09:09

What do you need the constant term for? In fixed effects models, the estimate of the constant term is meaningless. It is impossible to isolate the constant from the fixed effects, rendering any reported constant an artifact. Refer to https://www.stata.com/support/faqs/s...effects-model/ for further details on this. A simple way to see this is to note that the coefficient on the constant term changes if I change the base level for the time dummies.

Code:

webuse grunfeld, clear
xtregar invest mvalue kstock ib1.time, fe
xtregar invest mvalue kstock ib12.time, fe

Res.:

Code:

. xtregar invest mvalue kstock ib1.time, fe

FE (within) regression with AR(1) disturbances  Number of obs     =        190
Group variable: company                         Number of groups  =         10

R-squared:                                      Obs per group:
     Within  = 0.6513                                         min =         19
     Between = 0.7838                                         avg =       19.0
     Overall = 0.7798                                         max =         19

                                                F(20,160)         =      14.94
corr(u_i, Xb) = -0.1416                         Prob > F          =     0.0000

------------------------------------------------------------------------------
      invest | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      mvalue |   .0938865   .0104378     8.99   0.000     .0732729    .1145001
      kstock |   .4076907   .0373936    10.90   0.000      .333842    .4815393
             |
        time |
          2  |   22.60707   15.23516     1.48   0.140    -7.480873    52.69502
          3  |   29.02835   20.63253     1.41   0.161    -11.71886    69.77557
          4  |   32.19837   23.69301     1.36   0.176    -14.59298    78.98973
          5  |   17.30833   25.51947     0.68   0.499    -33.09011    67.70676
          6  |   50.88403   27.01934     1.88   0.061    -2.476513    104.2446
          7  |   79.24101   27.76837     2.85   0.005     24.40121    134.0808
          8  |   73.97942   28.25514     2.62   0.010      18.1783    129.7805
          9  |   56.66199   28.31439     2.00   0.047     .7438482    112.5801
         10  |   58.95475   28.56968     2.06   0.041     2.532449     115.377
         11  |   49.74642   28.40551     1.75   0.082    -6.351667    105.8445
         12  |   75.78518   28.05301     2.70   0.008     20.38324    131.1871
         13  |   57.76681   26.94116     2.14   0.034     4.560676    110.9729
         14  |   51.01013   26.04522     1.96   0.052    -.4266119    102.4469
         15  |   20.19058   24.97134     0.81   0.420    -29.12536    69.50651
         16  |   18.35979   23.75434     0.77   0.441     -28.5527    65.27228
         17  |   36.28607   21.77871     1.67   0.098    -6.724731    79.29688
         18  |   32.36863   18.82811     1.72   0.088    -4.815034    69.55229
         19  |   30.71474   14.03026     2.19   0.030     3.006356    58.42313
         20  |          0  (omitted)
             |
       _cons |  -124.2761   10.59512   -11.73   0.000    -145.2004   -103.3518
-------------+----------------------------------------------------------------
      rho_ar |  .68127018
     sigma_u |  96.029876
     sigma_e |  39.732352
     rho_fov |  .85383317   (fraction of variance because of u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(9,160) = 12.27                      Prob > F = 0.0000

. 
. xtregar invest mvalue kstock ib12.time, fe

FE (within) regression with AR(1) disturbances  Number of obs     =        190
Group variable: company                         Number of groups  =         10

R-squared:                                      Obs per group:
     Within  = 0.6513                                         min =         19
     Between = 0.7838                                         avg =       19.0
     Overall = 0.0867                                         max =         19

                                                F(20,160)         =      14.94
corr(u_i, Xb) = -0.0287                         Prob > F          =     0.0000

------------------------------------------------------------------------------
      invest | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      mvalue |   .0938865   .0104378     8.99   0.000     .0732729    .1145001
      kstock |   .4076907   .0373936    10.90   0.000      .333842    .4815393
             |
        time |
          1  |  -5412.661   2003.577    -2.70   0.008    -9369.529   -1455.794
          2  |  -3663.702   1357.974    -2.70   0.008    -6345.567   -981.8375
          3  |  -2481.169   918.5722    -2.70   0.008    -4295.259   -667.0792
          4  |  -1676.749    618.932    -2.71   0.007    -2899.079    -454.419
          5  |  -1145.771   415.1112    -2.76   0.006    -1965.575   -325.9673
          6  |  -740.3118   275.9892    -2.68   0.008    -1285.363   -195.2602
          7  |  -458.6017   181.4032    -2.53   0.012     -816.855   -100.3483
          8  |  -291.2613   117.0538    -2.49   0.014     -522.431   -60.09164
          9  |  -190.9902   73.08146    -2.61   0.010    -335.3189   -46.66152
         10  |  -108.5879   42.69178    -2.54   0.012    -192.8999   -24.27581
         11  |  -63.21994   21.45439    -2.95   0.004    -105.5902   -20.84964
         13  |   7.312056   14.24379     0.51   0.608    -20.81802    35.44214
         14  |   17.81225   18.37984     0.97   0.334    -18.48613    54.11063
         15  |  -1.250721   20.30361    -0.06   0.951    -41.34836    38.84691
         16  |   4.927904   20.95131     0.24   0.814    -36.44887    46.30468
         17  |   28.31076   20.20051     1.40   0.163    -11.58326    68.20479
         18  |   28.11072    18.0876     1.55   0.122    -7.610512    63.83195
         19  |   28.98939   13.81103     2.10   0.037     1.713972    56.26481
         20  |          0  (omitted)
             |
       _cons |   -127.964   10.91499   -11.72   0.000      -149.52   -106.4079
-------------+----------------------------------------------------------------
      rho_ar |  .68127018
     sigma_u |  96.029876
     sigma_e |  39.732352
     rho_fov |  .85383317   (fraction of variance because of u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(9,160) = 12.27                      Prob > F = 0.0000


Giuseppe De Luca

Join Date: Nov 2023
Posts: 3

24 Nov 2023, 11:17

Sorry, but this does not answer my question, and your example only shows that changing the covariates may change (of course) the interpretation of the constant. Obviously, the identification of the constant term in these models relies on the assumption that the fixed effects sum to zero (or some other identifiability restriction). However, once you make this assumption, the constant is identified/estimable and you want the right point estimate and the right SE. I know that this is meaningless anyway (at least when the panel is equally spaced), but this is not so meaningless in the case of unbalanced and unequally spaced panels. Here, the transformed constant is not constant anymore, and so the ad hoc approach used by xtregar, fe may also bias the other coefficients in the conditional mean function. Let me illustrate these two points with some examples on the same dataset.

First example, on a balanced and equally spaced panel, which leads to a wrong SE on the constant term:

Code:

. use "https://www.stata-press.com/data/r18/grunfeld", clear
. xtset
Panel variable: company (strongly balanced)
 Time variable: year, 1935 to 1954
         Delta: 1 year

. local ivar "company"
. local tvar "year"
. scalar delta=r(tdelta)
. local y         "invest"
. local X         "mvalue kstock"
. gen double cons=1

. * This is the official command
. xtregar `y' `X', fe

FE (within) regression with AR(1) disturbances  Number of obs     =        190
Group variable: company                         Number of groups  =         10

R-squared:                                      Obs per group:
     Within  = 0.5927                                         min =         19
     Between = 0.7989                                         avg =       19.0
     Overall = 0.7904                                         max =         19

                                                F(2,178)          =     129.49
corr(u_i, Xb) = -0.0454                         Prob > F          =     0.0000

------------------------------------------------------------------------------
      invest | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      mvalue |   .0949999   .0091377    10.40   0.000     .0769677     .113032
      kstock |    .350161   .0293747    11.92   0.000     .2921935    .4081286
       _cons |  -63.22022   5.648271   -11.19   0.000    -74.36641   -52.07402
-------------+----------------------------------------------------------------
      rho_ar |  .67210608
     sigma_u |  91.507609
     sigma_e |  40.992469
     rho_fov |   .8328647   (fraction of variance because of u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(9,178) = 11.53                      Prob > F = 0.0000

. scalar rho=e(rho_ar)
. local sigma=e(sigma_e)

. * Transformations used by xtregar, fe
. * AR(1) transform
. sort `ivar' `tvar'                                              
. qui by `ivar': gen double dif_t=(`tvar'[_n]-`tvar'[_n-1])/delta         if _n>1
. local vlist "`y' cons `X'"              
. foreach v of local vlist {                              
  2.         qui by `ivar': gen double AR1_`v'=(sqrt(1-rho^2))*`v'           if _n==1
  3.         qui by `ivar': replace AR1_`v'=(sqrt(1-rho^2))*                         ///
>                 (`v'[_n]*(1/sqrt((1-rho^(2*dif_t)))) -                                  ///
>                 `v'[_n-1]*(rho^(dif_t)/sqrt(1-rho^(2*dif_t))))                  if _n>1
  4. }

. * Drop first obs by panel
. sort `ivar' `tvar'                                              
. qui by `ivar': drop if _n==1

. * FE transform
. local DMAR1_y                                                   
. local DMAR1_X
. local DMAR1_X_c
. foreach v of local vlist {                              
  2.         qui bys `ivar': egen double MAR1_`v'=mean(AR1_`v')
  3.         sum AR1_`v', meanonly
  4.         qui gen double DMAR1_`v'=AR1_`v'-MAR1_`v' +r(mean)
  5.         drop AR1_`v' MAR1_`v'
  6.         if `:list v in y'       local DMAR1_y   "DMAR1_`v'"             
  7.         if `:list v in X'       local DMAR1_X   "`DMAR1_X' DMAR1_`v'"           
  8.         if `:list v in X'|"`v'"=="cons" local DMAR1_X_c "`DMAR1_X_c' DMAR1_`v'"         
  9. }

.
. * Reproducing xtregar, fe: OLS with "non-transformed" constant and
. *       ex-post adjustments for both (1-rho) and dof
. noi regress `DMAR1_y' `DMAR1_X'

      Source |       SS           df       MS      Number of obs   =       190
-------------+----------------------------------   F(2, 187)       =    136.04
       Model |  435202.142         2  217601.071   Prob > F        =    0.0000
    Residual |  299108.082       187  1599.50846   R-squared       =    0.5927
-------------+----------------------------------   Adj R-squared   =    0.5883
       Total |  734310.224       189  3885.23928   Root MSE        =    39.994

------------------------------------------------------------------------------
DMAR1_invest | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
DMAR1_mvalue |   .0949999   .0089151    10.66   0.000     .0774128     .112587
DMAR1_kstock |   .3501611   .0286591    12.22   0.000     .2936243    .4066978
       _cons |  -20.72953   5.510674    -3.76   0.000     -31.6006   -9.858447
------------------------------------------------------------------------------

. noi di _n "Adj. mvalue:    b[mvalue]=" %9.8g _b[DMAR1_mvalue]                   /*
>         */    "       SE[mvalue]="  %9.7g   _se[DMAR1_mvalue]*`sigma'/e(rmse)

Adj. mvalue:    b[mvalue]= .0949999       SE[mvalue]= .0091377

. noi di _n " Adj. const:    b[_cons]=" %9.8g _b[_cons]/(1-rho)                   /*
>         */    "       wrong_SE[cons]="  %9.7g   _se[_cons]*`sigma'/e(rmse)      /*
>         */    "        correct_SE[cons]="  %9.7g   _se[_cons]*`sigma'/(e(rmse)*(1-rho))

 Adj. const:    b[_cons]=-63.22022       wrong_SE[cons]= 5.648271        correct_SE[cons]= 17.22591

. * My proposal: OLS with "transformed constant" and ex-post adjustment for dof only
. noi regress `DMAR1_y' `DMAR1_X_c', nocons

      Source |       SS           df       MS      Number of obs   =       190
-------------+----------------------------------   F(3, 187)       =    215.91
       Model |  1036025.68         3  345341.894   Prob > F        =    0.0000
    Residual |  299108.082       187  1599.50846   R-squared       =    0.7760
-------------+----------------------------------   Adj R-squared   =    0.7724
       Total |  1335133.76       190  7027.01981   Root MSE        =    39.994

------------------------------------------------------------------------------
DMAR1_invest | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
  DMAR1_cons |  -63.22022   16.80627    -3.76   0.000    -96.37447   -30.06596
DMAR1_mvalue |   .0949999   .0089151    10.66   0.000     .0774128     .112587
DMAR1_kstock |   .3501611   .0286591    12.22   0.000     .2936243    .4066978
------------------------------------------------------------------------------

. noi di _n "Adj. mvalue:    b[mvalue]=" %9.8g _b[DMAR1_mvalue]                                           /*
>         */    "       SE[mvalue]="  %9.7g   _se[DMAR1_mvalue]*`sigma'/e(rmse)

Adj. mvalue:    b[mvalue]= .0949999       SE[mvalue]= .0091377

. noi di _n " Adj. const:    b[_cons]=" %9.8g _b[DMAR1_cons]                                              /*
>         */    "       wrong_SE[cons]="  %9.7g   _se[DMAR1_cons]*(1-rho)*`sigma'/e(rmse) /*
>         */    "        correct_SE[cons]="  %9.7g   _se[DMAR1_cons]*`sigma'/e(rmse)

 Adj. const:    b[_cons]=-63.22022       wrong_SE[cons]= 5.648271        correct_SE[cons]= 17.22591

Now the second example, on an unbalanced and unequally spaced panel, which apparently leads to wrong estimates of all regression coefficients:

Code:

. use "https://www.stata-press.com/data/r18/grunfeld", clear

. * Make the panel unbalanced and unequally spaced
. * the rest of the code is identical to the first example
. drop if `tvar'==1939|`ivar'>7 & (`tvar'==1944|`tvar'==1948)     
(16 observations deleted)

. xtset, clear
. xtset `ivar' `tvar'

Panel variable: company (unbalanced)
 Time variable: year, 1935 to 1954, but with gaps
         Delta: 1 year

. scalar delta=r(tdelta)
. local ivar "company"
. local tvar "year"

. scalar delta=r(tdelta)

. local y         "invest"
. local X         "mvalue kstock"
. gen double cons=1
.
. * This is the official command
. xtregar `y' `X', fe

FE (within) regression with AR(1) disturbances  Number of obs     =        174
Group variable: company                         Number of groups  =         10

R-squared:                                      Obs per group:
     Within  = 0.6224                                         min =         16
     Between = 0.7971                                         avg =       17.4
     Overall = 0.7922                                         max =         18

                                                F(2,162)          =     133.52
corr(u_i, Xb) = -0.0564                         Prob > F          =     0.0000

------------------------------------------------------------------------------
      invest | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      mvalue |   .0979694   .0088802    11.03   0.000     .0804337    .1155052
      kstock |   .3516487   .0300888    11.69   0.000     .2922318    .4110657
       _cons |  -68.71156   5.902539   -11.64   0.000     -80.3674   -57.05572
-------------+----------------------------------------------------------------
      rho_ar |  .67335537
     sigma_u |  94.027441
     sigma_e |   40.78577
     rho_fov |  .84164339   (fraction of variance because of u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(9,162) = 11.68                      Prob > F = 0.0000

. scalar rho=e(rho_ar)
. local sigma=e(sigma_e)

. * Transformations used by xtregar, fe
. * AR(1) transform
. sort `ivar' `tvar'                                              
. qui by `ivar': gen double dif_t=(`tvar'[_n]-`tvar'[_n-1])/delta         if _n>1
. local vlist "`y' cons `X'"              
. foreach v of local vlist {                              
  2.         qui by `ivar': gen double AR1_`v'=(sqrt(1-rho^2))*`v'           if _n==1
  3.         qui by `ivar': replace AR1_`v'=(sqrt(1-rho^2))*                         ///
>                 (`v'[_n]*(1/sqrt((1-rho^(2*dif_t)))) -                                  ///
>                 `v'[_n-1]*(rho^(dif_t)/sqrt(1-rho^(2*dif_t))))                  if _n>1
  4. }

. * Drop first obs by panel
. sort `ivar' `tvar'                                              
. qui by `ivar': drop if _n==1

. * FE transform
. local DMAR1_y                                                   
. local DMAR1_X
. local DMAR1_X_c
. foreach v of local vlist {                              
  2.         qui bys `ivar': egen double MAR1_`v'=mean(AR1_`v')
  3.         sum AR1_`v', meanonly
  4.         qui gen double DMAR1_`v'=AR1_`v'-MAR1_`v' +r(mean)
  5.         drop AR1_`v' MAR1_`v'
  6.         if `:list v in y' local DMAR1_y "DMAR1_`v'"             
  7.         if `:list v in X' local DMAR1_X "`DMAR1_X' DMAR1_`v'"           
  8.         if `:list v in X'|"`v'"=="cons" local DMAR1_X_c "`DMAR1_X_c' DMAR1_`v'"         
  9. }

. * Reproducing xtregar, fe: OLS with "non-transformed" constant and
. *       ex-post adjustments for both (1-rho) and dof
. noi regress `DMAR1_y' `DMAR1_X'

      Source |       SS           df       MS      Number of obs   =       174
-------------+----------------------------------   F(2, 171)       =    140.93
       Model |  444202.784         2  222101.392   Prob > F        =    0.0000
    Residual |  269483.601       171  1575.92749   R-squared       =    0.6224
-------------+----------------------------------   Adj R-squared   =    0.6180
       Total |  713686.385       173  4125.35483   Root MSE        =    39.698

------------------------------------------------------------------------------
DMAR1_invest | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
DMAR1_mvalue |   .0979694   .0086433    11.33   0.000     .0809081    .1150307
DMAR1_kstock |   .3516487   .0292863    12.01   0.000     .2938394     .409458
       _cons |  -22.44426    5.74511    -3.91   0.000    -33.78473   -11.10379
------------------------------------------------------------------------------

. noi di _n "Adj. mvalue:    b[mvalue]=" %9.8g _b[DMAR1_mvalue]                   /*
>         */    "       SE[mvalue]="  %9.7g   _se[DMAR1_mvalue]*`sigma'/e(rmse)

Adj. mvalue:    b[mvalue]= .0979694       SE[mvalue]= .0088802

. noi di _n " Adj. const:    b[_cons]=" %9.8g _b[_cons]/(1-rho)                   /*
>         */    "       wrong_SE[cons]="  %9.7g   _se[_cons]*`sigma'/e(rmse)      /*
>         */    "        correct_SE[cons]="  %9.7g   _se[_cons]*`sigma'/(e(rmse)*(1-rho))

 Adj. const:    b[_cons]=-68.71156       wrong_SE[cons]= 5.902539        correct_SE[cons]= 18.07022

. * My proposal: OLS with "transformed constant" and ex-post adjustment for dof only
. noi regress `DMAR1_y' `DMAR1_X_c', nocons

      Source |       SS           df       MS      Number of obs   =       174
-------------+----------------------------------   F(3, 171)       =    225.73
       Model |  1065081.33         3   355027.11   Prob > F        =    0.0000
    Residual |  268949.328       171  1572.80309   R-squared       =    0.7984
-------------+----------------------------------   Adj R-squared   =    0.7949
       Total |  1334030.66       174  7666.84286   Root MSE        =    39.659

------------------------------------------------------------------------------
DMAR1_invest | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
  DMAR1_cons |  -66.45554   16.80825    -3.95   0.000    -99.63391   -33.27716
DMAR1_mvalue |   .0987793    .008723    11.32   0.000     .0815606    .1159979
DMAR1_kstock |   .3497412    .028857    12.12   0.000     .2927794    .4067031
------------------------------------------------------------------------------

. noi di _n "Adj. mvalue:    b[mvalue]=" %9.8g _b[DMAR1_mvalue]                                           /*
>         */    "       SE[mvalue]="  %9.7g   _se[DMAR1_mvalue]*`sigma'/e(rmse)

Adj. mvalue:    b[mvalue]= .0987793       SE[mvalue]= .0089709

. noi di _n " Adj. const:    b[_cons]=" %9.8g _b[DMAR1_cons]                                              /*
>         */    "       wrong_SE[cons]="  %9.7g   _se[DMAR1_cons]*(1-rho)*`sigma'/e(rmse) /*
>         */    "        correct_SE[cons]="  %9.7g   _se[DMAR1_cons]*`sigma'/e(rmse)

 Adj. const:    b[_cons]=-66.45554       wrong_SE[cons]= 5.646372        correct_SE[cons]= 17.28598

You see that all estimated coefficients are now different in the two approaches (i.e. "non-transformed constant" and "transformed constant"). Which approach is correct?
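
As a sketch of why the two approaches diverge, here is what the AR(1) transform coded above does to a column of ones for every observation after the first in each panel. This is read off my own code, not off the ado-file, and d_{it} (the gap in periods between consecutive observations, dif_t in the code) and c_{it} (the transformed constant before demeaning) are labels introduced here:

```latex
c_{it} = \sqrt{1-\rho^{2}}\;\frac{1-\rho^{d_{it}}}{\sqrt{1-\rho^{2 d_{it}}}},
\qquad
d_{it}=1 \;\Rightarrow\; c_{it} = 1-\rho,
\qquad
d_{it}=2 \;\Rightarrow\; c_{it} = \frac{1-\rho^{2}}{\sqrt{1+\rho^{2}}}.
```

With no gaps, c_{it} equals 1-\rho everywhere, so after the within transform (with the grand mean added back) it is still the constant 1-\rho, and dividing the intercept by 1-\rho is equivalent to regressing on it. With gaps, c_{it} varies within panels, so the transformed constant is no longer collinear with a column of ones and the slope estimates can differ between the two regressions.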

Andrew Musau

Join Date: Oct 2014

Posts: 10260
#4

24 Nov 2023, 16:25

I do not agree that in the case of an unbalanced panel, the estimate of the constant is meaningful. Second, if you want to do the transformation and estimate using regress, then you must take into account that your variables after transformation are generated regressors, and there needs to be a degrees of freedom adjustment. I do not see that you apply such an adjustment in your illustration, but I may just be missing it. As I personally have no use for the estimate of the constant in FE models, if you believe that you have a case, compile a reproducible example and send it to Stata Technical Services (see https://www.stata.com/support/tech-support/contact/). If your illustrations have merit, they will let you know and make the appropriate adjustments to the command. You should update this thread for the benefit of users of the command once you hear back from them.

Last edited by Andrew Musau; 24 Nov 2023, 16:43.
Giuseppe De Luca

Join Date: Nov 2023

Posts: 3
#5

25 Nov 2023, 02:10

Thank you Andrew, writing to Stata Technical Services is a very good idea, also because this is a technical issue. Anyway, my code always adjusts the SE for the degrees of freedom in the last display command:

Code:

`sigma'/e(rmse)

In fact, I showed in both examples how to reproduce exactly the point estimates and SE of the constant and the coefficient of "mvalue" in the output of xtregar, fe. The question is whether this command is correct or not, because different approaches for the constant term may lead to different estimates of the other regression coefficients (irrespective of whether you think that the constant is meaningful or not). Thank you again for your advice.
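
As a rough check that this ratio is only a degrees-of-freedom correction, the sketch below recomputes both scale estimates from the residual sum of squares in the first example. The numbers are copied by hand from the output above, and the scalar names are purely illustrative:

```stata
* Residual SS copied from the first example's regress output
scalar rss = 299108.082
* regress divides by N - 3 (two slopes plus the constant)
scalar df_ols = 190 - 3
* xtregar, fe reports F(2,178), i.e. N - 10 groups - 2 slopes
scalar df_fe = 190 - 10 - 2
* Compare with Root MSE = 39.994 from regress
display %9.7g sqrt(rss/df_ols)
* Compare with sigma_e = 40.992469 from xtregar
display %9.7g sqrt(rss/df_fe)
```

If the two displayed values agree with Root MSE and sigma_e (they should, up to rounding of the copied inputs), then `sigma'/e(rmse) is just sqrt(187/178), i.e. the dof adjustment.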
Andrew Musau

Join Date: Oct 2014

Posts: 10260
#6

25 Nov 2023, 05:03

As I said, send the query to Tech Support. The whole thing looks backwards to me. You first run xtregar to obtain estimates of rho and sigma_e, do the transformation, run regress and then adjust the regress estimates using the estimates you got from the same command that you claim gets the calculations wrong. To make a compelling case, do everything outside xtregar and then show your estimates are different from those of xtregar.
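
A possible starting point, as a sketch only: estimate rho from the within residuals without calling xtregar, and then compare. I am assuming here that regressing the within residual on its own lag is comparable to what the rhotype(regress) option does; the manual's Methods and formulas should be checked before relying on that, and the variable and scalar names below are made up for illustration.

```stata
* Sketch: estimate rho outside xtregar from within (FE) residuals
webuse grunfeld, clear
xtset company year
xtreg invest mvalue kstock, fe
predict double e_within, e
* Slope of the residual on its own lag (no constant) is one estimate of rho
regress e_within L.e_within, noconstant
scalar rho_out = _b[L.e_within]
display "rho estimated outside xtregar = " %9.8g rho_out
* For comparison only: the built-in estimate under the regress method
xtregar invest mvalue kstock, fe rhotype(regress)
display "rho_ar reported by xtregar    = " %9.8g e(rho_ar)
```

The same idea extends to sigma_e: compute it from the residuals of your own transformed regression rather than taking e(sigma_e) from xtregar.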