
  • constant term in xtregar, fe

    Dear all, I have a question about the treatment of the constant in "xtregar,fe"

    This command does not transform the constant term; its point estimate is simply adjusted ex post by dividing by (1-\rho) (see line 474 of xtreg.ado). In balanced and equally spaced panels, I see that this is consistent with the constant term in equation (16b) of Bhargava et al. (1982), but why not apply the same adjustment to the underlying standard error? Another problem is that this approach also ignores the variability of the transformed constant when the panel is unbalanced and unequally spaced. My intuition is that these issues can be solved by applying the C_i(\rho) and within transformations to all variables of the model (including the constant term). What do you think?

    Thank you in advance for your feedback,
    Giuseppe
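
    The standard-error point raised in this post can be sketched with a simple scaling argument. This is only an illustration, under the assumption that \rho is treated as a known constant (its sampling variability is ignored). The symbols \tilde{\alpha} and \hat{\alpha} are introduced here for exposition and do not appear in the post. If \tilde{\alpha} is the intercept estimated on the transformed data of a balanced, equally spaced panel and the reported constant is \hat{\alpha} = \tilde{\alpha}/(1-\rho), then

    \[
    \operatorname{Var}(\hat{\alpha}) = \frac{\operatorname{Var}(\tilde{\alpha})}{(1-\rho)^2}
    \quad\Longrightarrow\quad
    \operatorname{se}(\hat{\alpha}) = \frac{\operatorname{se}(\tilde{\alpha})}{1-\rho}.
    \]

    Under that assumption, rescaling the point estimate while leaving the standard error untouched yields a standard error that is too small by the factor (1-\rho) whenever 0 < \rho < 1.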

  • #2
    What do you need the constant term for? In fixed effects models, the estimate of the constant term is meaningless. It is impossible to isolate the constant from the fixed effects, rendering any reported constant an artifact. Refer to https://www.stata.com/support/faqs/s...effects-model/ for further details. A simple way to see this is to note that the estimate of the constant term changes if I change the base level for the time dummies.

    Code:
    webuse grunfeld, clear
    xtregar invest mvalue kstock ib1.time, fe
    xtregar invest mvalue kstock ib12.time, fe
    Results:

    Code:
    . xtregar invest mvalue kstock ib1.time, fe
    
    FE (within) regression with AR(1) disturbances  Number of obs     =        190
    Group variable: company                         Number of groups  =         10
    
    R-squared:                                      Obs per group:
         Within  = 0.6513                                         min =         19
         Between = 0.7838                                         avg =       19.0
         Overall = 0.7798                                         max =         19
    
                                                    F(20,160)         =      14.94
    corr(u_i, Xb) = -0.1416                         Prob > F          =     0.0000
    
    ------------------------------------------------------------------------------
          invest | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
          mvalue |   .0938865   .0104378     8.99   0.000     .0732729    .1145001
          kstock |   .4076907   .0373936    10.90   0.000      .333842    .4815393
                 |
            time |
              2  |   22.60707   15.23516     1.48   0.140    -7.480873    52.69502
              3  |   29.02835   20.63253     1.41   0.161    -11.71886    69.77557
              4  |   32.19837   23.69301     1.36   0.176    -14.59298    78.98973
              5  |   17.30833   25.51947     0.68   0.499    -33.09011    67.70676
              6  |   50.88403   27.01934     1.88   0.061    -2.476513    104.2446
              7  |   79.24101   27.76837     2.85   0.005     24.40121    134.0808
              8  |   73.97942   28.25514     2.62   0.010      18.1783    129.7805
              9  |   56.66199   28.31439     2.00   0.047     .7438482    112.5801
             10  |   58.95475   28.56968     2.06   0.041     2.532449     115.377
             11  |   49.74642   28.40551     1.75   0.082    -6.351667    105.8445
             12  |   75.78518   28.05301     2.70   0.008     20.38324    131.1871
             13  |   57.76681   26.94116     2.14   0.034     4.560676    110.9729
             14  |   51.01013   26.04522     1.96   0.052    -.4266119    102.4469
             15  |   20.19058   24.97134     0.81   0.420    -29.12536    69.50651
             16  |   18.35979   23.75434     0.77   0.441     -28.5527    65.27228
             17  |   36.28607   21.77871     1.67   0.098    -6.724731    79.29688
             18  |   32.36863   18.82811     1.72   0.088    -4.815034    69.55229
             19  |   30.71474   14.03026     2.19   0.030     3.006356    58.42313
             20  |          0  (omitted)
                 |
           _cons |  -124.2761   10.59512   -11.73   0.000    -145.2004   -103.3518
    -------------+----------------------------------------------------------------
          rho_ar |  .68127018
         sigma_u |  96.029876
         sigma_e |  39.732352
         rho_fov |  .85383317   (fraction of variance because of u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0: F(9,160) = 12.27                      Prob > F = 0.0000
    
    . 
    . xtregar invest mvalue kstock ib12.time, fe
    
    FE (within) regression with AR(1) disturbances  Number of obs     =        190
    Group variable: company                         Number of groups  =         10
    
    R-squared:                                      Obs per group:
         Within  = 0.6513                                         min =         19
         Between = 0.7838                                         avg =       19.0
         Overall = 0.0867                                         max =         19
    
                                                    F(20,160)         =      14.94
    corr(u_i, Xb) = -0.0287                         Prob > F          =     0.0000
    
    ------------------------------------------------------------------------------
          invest | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
          mvalue |   .0938865   .0104378     8.99   0.000     .0732729    .1145001
          kstock |   .4076907   .0373936    10.90   0.000      .333842    .4815393
                 |
            time |
              1  |  -5412.661   2003.577    -2.70   0.008    -9369.529   -1455.794
              2  |  -3663.702   1357.974    -2.70   0.008    -6345.567   -981.8375
              3  |  -2481.169   918.5722    -2.70   0.008    -4295.259   -667.0792
              4  |  -1676.749    618.932    -2.71   0.007    -2899.079    -454.419
              5  |  -1145.771   415.1112    -2.76   0.006    -1965.575   -325.9673
              6  |  -740.3118   275.9892    -2.68   0.008    -1285.363   -195.2602
              7  |  -458.6017   181.4032    -2.53   0.012     -816.855   -100.3483
              8  |  -291.2613   117.0538    -2.49   0.014     -522.431   -60.09164
              9  |  -190.9902   73.08146    -2.61   0.010    -335.3189   -46.66152
             10  |  -108.5879   42.69178    -2.54   0.012    -192.8999   -24.27581
             11  |  -63.21994   21.45439    -2.95   0.004    -105.5902   -20.84964
             13  |   7.312056   14.24379     0.51   0.608    -20.81802    35.44214
             14  |   17.81225   18.37984     0.97   0.334    -18.48613    54.11063
             15  |  -1.250721   20.30361    -0.06   0.951    -41.34836    38.84691
             16  |   4.927904   20.95131     0.24   0.814    -36.44887    46.30468
             17  |   28.31076   20.20051     1.40   0.163    -11.58326    68.20479
             18  |   28.11072    18.0876     1.55   0.122    -7.610512    63.83195
             19  |   28.98939   13.81103     2.10   0.037     1.713972    56.26481
             20  |          0  (omitted)
                 |
           _cons |   -127.964   10.91499   -11.72   0.000      -149.52   -106.4079
    -------------+----------------------------------------------------------------
          rho_ar |  .68127018
         sigma_u |  96.029876
         sigma_e |  39.732352
         rho_fov |  .85383317   (fraction of variance because of u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0: F(9,160) = 12.27                      Prob > F = 0.0000



    • #3
      Sorry, but this does not answer my question, and your example only shows that changing the covariates may (of course) change the interpretation of the constant. Obviously, the identification of the constant term in these models relies on the assumption that the fixed effects sum to zero (or some other identifiability restriction). However, once you make this assumption, the constant is identified/estimable, and you want the right point estimate and the right SE. I know that this is meaningless anyway (at least when the panel is equally spaced), but it is not so meaningless in the case of unbalanced and unequally spaced panels. Here, the transformed constant is no longer constant, so the ad hoc approach used by xtregar, fe may also bias the other coefficients in the conditional mean function. Let me illustrate these two points with some examples on the same dataset.

      First example, on a balanced and equally spaced panel, which leads to a wrong SE for the constant term:

      Code:
      . use "https://www.stata-press.com/data/r18/grunfeld", clear
      . xtset
      Panel variable: company (strongly balanced)
       Time variable: year, 1935 to 1954
               Delta: 1 year
      
      . local ivar "company"
      . local tvar "year"
      . scalar delta=r(tdelta)
      . local y         "invest"
      . local X         "mvalue kstock"
      . gen double cons=1
      
      . * This is the official command
      . xtregar `y' `X', fe
      
      FE (within) regression with AR(1) disturbances  Number of obs     =        190
      Group variable: company                         Number of groups  =         10
      
      R-squared:                                      Obs per group:
           Within  = 0.5927                                         min =         19
           Between = 0.7989                                         avg =       19.0
           Overall = 0.7904                                         max =         19
      
                                                      F(2,178)          =     129.49
      corr(u_i, Xb) = -0.0454                         Prob > F          =     0.0000
      
      ------------------------------------------------------------------------------
            invest | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
            mvalue |   .0949999   .0091377    10.40   0.000     .0769677     .113032
            kstock |    .350161   .0293747    11.92   0.000     .2921935    .4081286
             _cons |  -63.22022   5.648271   -11.19   0.000    -74.36641   -52.07402
      -------------+----------------------------------------------------------------
            rho_ar |  .67210608
           sigma_u |  91.507609
           sigma_e |  40.992469
           rho_fov |   .8328647   (fraction of variance because of u_i)
      ------------------------------------------------------------------------------
      F test that all u_i=0: F(9,178) = 11.53                      Prob > F = 0.0000
      
      . scalar rho=e(rho_ar)
      . local sigma=e(sigma_e)
      
      . * Transformations used by xtregar, fe
      . * AR(1) transform
      . sort `ivar' `tvar'                                              
      . qui by `ivar': gen double dif_t=(`tvar'[_n]-`tvar'[_n-1])/delta         if _n>1
      . local vlist "`y' cons `X'"              
      . foreach v of local vlist {                              
        2.         qui by `ivar': gen double AR1_`v'=(sqrt(1-rho^2))*`v'           if _n==1
        3.         qui by `ivar': replace AR1_`v'=(sqrt(1-rho^2))*                         ///
      >                 (`v'[_n]*(1/sqrt((1-rho^(2*dif_t)))) -                                  ///
      >                 `v'[_n-1]*(rho^(dif_t)/sqrt(1-rho^(2*dif_t))))                  if _n>1
        4. }
      
      . * Drop first obs by panel
      . sort `ivar' `tvar'                                              
      . qui by `ivar': drop if _n==1
      
      . * FE transform
      . local DMAR1_y                                                   
      . local DMAR1_X
      . local DMAR1_X_c
      . foreach v of local vlist {                              
        2.         qui bys `ivar': egen double MAR1_`v'=mean(AR1_`v')
        3.         sum AR1_`v', meanonly
        4.         qui gen double DMAR1_`v'=AR1_`v'-MAR1_`v' +r(mean)
        5.         drop AR1_`v' MAR1_`v'
        6.         if `:list v in y'       local DMAR1_y   "DMAR1_`v'"             
        7.         if `:list v in X'       local DMAR1_X   "`DMAR1_X' DMAR1_`v'"           
        8.         if `:list v in X'|"`v'"=="cons" local DMAR1_X_c "`DMAR1_X_c' DMAR1_`v'"         
        9. }
      
      .
      . * Reproducing xtregar, fe: OLS with "non-transformed" constant and
      . *       ex-post adjustments for both (1-rho) and dof
      . noi regress `DMAR1_y' `DMAR1_X'
      
            Source |       SS           df       MS      Number of obs   =       190
      -------------+----------------------------------   F(2, 187)       =    136.04
             Model |  435202.142         2  217601.071   Prob > F        =    0.0000
          Residual |  299108.082       187  1599.50846   R-squared       =    0.5927
      -------------+----------------------------------   Adj R-squared   =    0.5883
             Total |  734310.224       189  3885.23928   Root MSE        =    39.994
      
      ------------------------------------------------------------------------------
      DMAR1_invest | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
      DMAR1_mvalue |   .0949999   .0089151    10.66   0.000     .0774128     .112587
      DMAR1_kstock |   .3501611   .0286591    12.22   0.000     .2936243    .4066978
             _cons |  -20.72953   5.510674    -3.76   0.000     -31.6006   -9.858447
      ------------------------------------------------------------------------------
      
      . noi di _n "Adj. mvalue:    b[mvalue]=" %9.8g _b[DMAR1_mvalue]                   /*
      >         */    "       SE[mvalue]="  %9.7g   _se[DMAR1_mvalue]*`sigma'/e(rmse)
      
      Adj. mvalue:    b[mvalue]= .0949999       SE[mvalue]= .0091377
      
      . noi di _n " Adj. const:    b[_cons]=" %9.8g _b[_cons]/(1-rho)                   /*
      >         */    "       wrong_SE[cons]="  %9.7g   _se[_cons]*`sigma'/e(rmse)      /*
      >         */    "        correct_SE[cons]="  %9.7g   _se[_cons]*`sigma'/(e(rmse)*(1-rho))
      
       Adj. const:    b[_cons]=-63.22022       wrong_SE[cons]= 5.648271        correct_SE[cons]= 17.22591
      
      . * My proposal: OLS with "transformed constant" and ex-post adjustment for dof only
      . noi regress `DMAR1_y' `DMAR1_X_c', nocons
      
            Source |       SS           df       MS      Number of obs   =       190
      -------------+----------------------------------   F(3, 187)       =    215.91
             Model |  1036025.68         3  345341.894   Prob > F        =    0.0000
          Residual |  299108.082       187  1599.50846   R-squared       =    0.7760
      -------------+----------------------------------   Adj R-squared   =    0.7724
             Total |  1335133.76       190  7027.01981   Root MSE        =    39.994
      
      ------------------------------------------------------------------------------
      DMAR1_invest | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
        DMAR1_cons |  -63.22022   16.80627    -3.76   0.000    -96.37447   -30.06596
      DMAR1_mvalue |   .0949999   .0089151    10.66   0.000     .0774128     .112587
      DMAR1_kstock |   .3501611   .0286591    12.22   0.000     .2936243    .4066978
      ------------------------------------------------------------------------------
      
      . noi di _n "Adj. mvalue:    b[mvalue]=" %9.8g _b[DMAR1_mvalue]                                           /*
      >         */    "       SE[mvalue]="  %9.7g   _se[DMAR1_mvalue]*`sigma'/e(rmse)
      
      Adj. mvalue:    b[mvalue]= .0949999       SE[mvalue]= .0091377
      
      . noi di _n " Adj. const:    b[_cons]=" %9.8g _b[DMAR1_cons]                                              /*
      >         */    "       wrong_SE[cons]="  %9.7g   _se[DMAR1_cons]*(1-rho)*`sigma'/e(rmse) /*
      >         */    "        correct_SE[cons]="  %9.7g   _se[DMAR1_cons]*`sigma'/e(rmse)
      
       Adj. const:    b[_cons]=-63.22022       wrong_SE[cons]= 5.648271        correct_SE[cons]= 17.22591
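
      As a quick arithmetic check of the two standard errors in the log above, the snippet below divides the reported standard error by (1-rho). The scalar names se_reported and rho_hat are hypothetical and chosen only for this illustration; the numbers are copied from the xtregar, fe output. Up to rounding, the result should agree with the value labelled correct_SE[cons].

```stata
* Values copied from the xtregar, fe output above (balanced panel)
* se_reported: Std. err. of _cons as reported; rho_hat: rho_ar as reported
scalar se_reported = 5.648271
scalar rho_hat     = .67210608
* Rescale the standard error by the same factor applied to the point estimate
display %9.7g se_reported/(1 - rho_hat)
```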

      Now the second example, on an unbalanced and unequally spaced panel, which apparently leads to wrong estimates of all regression coefficients:


      Code:
      . use "https://www.stata-press.com/data/r18/grunfeld", clear
      
      . * Define the panel and time identifiers first, so that this example runs on its own
      . local ivar "company"
      . local tvar "year"
      
      . * Make the panel unbalanced and unequally spaced
      . * the rest of the code is identical to the first example
      . drop if `tvar'==1939|`ivar'>7 & (`tvar'==1944|`tvar'==1948)     
      (16 observations deleted)
      
      . xtset, clear
      . xtset `ivar' `tvar'
      
      Panel variable: company (unbalanced)
       Time variable: year, 1935 to 1954, but with gaps
               Delta: 1 year
      
      . scalar delta=r(tdelta)
      
      . local y         "invest"
      . local X         "mvalue kstock"
      . gen double cons=1
      .
      . * This is the official command
      . xtregar `y' `X', fe
      
      FE (within) regression with AR(1) disturbances  Number of obs     =        174
      Group variable: company                         Number of groups  =         10
      
      R-squared:                                      Obs per group:
           Within  = 0.6224                                         min =         16
           Between = 0.7971                                         avg =       17.4
           Overall = 0.7922                                         max =         18
      
                                                      F(2,162)          =     133.52
      corr(u_i, Xb) = -0.0564                         Prob > F          =     0.0000
      
      ------------------------------------------------------------------------------
            invest | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
            mvalue |   .0979694   .0088802    11.03   0.000     .0804337    .1155052
            kstock |   .3516487   .0300888    11.69   0.000     .2922318    .4110657
             _cons |  -68.71156   5.902539   -11.64   0.000     -80.3674   -57.05572
      -------------+----------------------------------------------------------------
            rho_ar |  .67335537
           sigma_u |  94.027441
           sigma_e |   40.78577
           rho_fov |  .84164339   (fraction of variance because of u_i)
      ------------------------------------------------------------------------------
      F test that all u_i=0: F(9,162) = 11.68                      Prob > F = 0.0000
      
      . scalar rho=e(rho_ar)
      . local sigma=e(sigma_e)
      
      . * Transformations used by xtregar, fe
      . * AR(1) transform
      . sort `ivar' `tvar'                                              
      . qui by `ivar': gen double dif_t=(`tvar'[_n]-`tvar'[_n-1])/delta         if _n>1
      . local vlist "`y' cons `X'"              
      . foreach v of local vlist {                              
        2.         qui by `ivar': gen double AR1_`v'=(sqrt(1-rho^2))*`v'           if _n==1
        3.         qui by `ivar': replace AR1_`v'=(sqrt(1-rho^2))*                         ///
      >                 (`v'[_n]*(1/sqrt((1-rho^(2*dif_t)))) -                                  ///
      >                 `v'[_n-1]*(rho^(dif_t)/sqrt(1-rho^(2*dif_t))))                  if _n>1
        4. }
      
      . * Drop first obs by panel
      . sort `ivar' `tvar'                                              
      . qui by `ivar': drop if _n==1
      
      . * FE transform
      . local DMAR1_y                                                   
      . local DMAR1_X
      . local DMAR1_X_c
      . foreach v of local vlist {                              
        2.         qui bys `ivar': egen double MAR1_`v'=mean(AR1_`v')
        3.         sum AR1_`v', meanonly
        4.         qui gen double DMAR1_`v'=AR1_`v'-MAR1_`v' +r(mean)
        5.         drop AR1_`v' MAR1_`v'
        6.         if `:list v in y' local DMAR1_y "DMAR1_`v'"             
        7.         if `:list v in X' local DMAR1_X "`DMAR1_X' DMAR1_`v'"           
        8.         if `:list v in X'|"`v'"=="cons" local DMAR1_X_c "`DMAR1_X_c' DMAR1_`v'"         
        9. }
      
      . * Reproducing xtregar, fe: OLS with "non-transformed" constant and
      . *       ex-post adjustments for both (1-rho) and dof
      . noi regress `DMAR1_y' `DMAR1_X'
      
            Source |       SS           df       MS      Number of obs   =       174
      -------------+----------------------------------   F(2, 171)       =    140.93
             Model |  444202.784         2  222101.392   Prob > F        =    0.0000
          Residual |  269483.601       171  1575.92749   R-squared       =    0.6224
      -------------+----------------------------------   Adj R-squared   =    0.6180
             Total |  713686.385       173  4125.35483   Root MSE        =    39.698
      
      ------------------------------------------------------------------------------
      DMAR1_invest | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
      DMAR1_mvalue |   .0979694   .0086433    11.33   0.000     .0809081    .1150307
      DMAR1_kstock |   .3516487   .0292863    12.01   0.000     .2938394     .409458
             _cons |  -22.44426    5.74511    -3.91   0.000    -33.78473   -11.10379
      ------------------------------------------------------------------------------
      
      . noi di _n "Adj. mvalue:    b[mvalue]=" %9.8g _b[DMAR1_mvalue]                   /*
      >         */    "       SE[mvalue]="  %9.7g   _se[DMAR1_mvalue]*`sigma'/e(rmse)
      
      Adj. mvalue:    b[mvalue]= .0979694       SE[mvalue]= .0088802
      
      . noi di _n " Adj. const:    b[_cons]=" %9.8g _b[_cons]/(1-rho)                   /*
      >         */    "       wrong_SE[cons]="  %9.7g   _se[_cons]*`sigma'/e(rmse)      /*
      >         */    "        correct_SE[cons]="  %9.7g   _se[_cons]*`sigma'/(e(rmse)*(1-rho))
      
       Adj. const:    b[_cons]=-68.71156       wrong_SE[cons]= 5.902539        correct_SE[cons]= 18.07022
      
      . * My proposal: OLS with "transformed constant" and ex-post adjustment for dof only
      . noi regress `DMAR1_y' `DMAR1_X_c', nocons
      
            Source |       SS           df       MS      Number of obs   =       174
      -------------+----------------------------------   F(3, 171)       =    225.73
             Model |  1065081.33         3   355027.11   Prob > F        =    0.0000
          Residual |  268949.328       171  1572.80309   R-squared       =    0.7984
      -------------+----------------------------------   Adj R-squared   =    0.7949
             Total |  1334030.66       174  7666.84286   Root MSE        =    39.659
      
      ------------------------------------------------------------------------------
      DMAR1_invest | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
        DMAR1_cons |  -66.45554   16.80825    -3.95   0.000    -99.63391   -33.27716
      DMAR1_mvalue |   .0987793    .008723    11.32   0.000     .0815606    .1159979
      DMAR1_kstock |   .3497412    .028857    12.12   0.000     .2927794    .4067031
      ------------------------------------------------------------------------------
      
      . noi di _n "Adj. mvalue:    b[mvalue]=" %9.8g _b[DMAR1_mvalue]                                           /*
      >         */    "       SE[mvalue]="  %9.7g   _se[DMAR1_mvalue]*`sigma'/e(rmse)
      
      Adj. mvalue:    b[mvalue]= .0987793       SE[mvalue]= .0089709
      
      . noi di _n " Adj. const:    b[_cons]=" %9.8g _b[DMAR1_cons]                                              /*
      >         */    "       wrong_SE[cons]="  %9.7g   _se[DMAR1_cons]*(1-rho)*`sigma'/e(rmse) /*
      >         */    "        correct_SE[cons]="  %9.7g   _se[DMAR1_cons]*`sigma'/e(rmse)
      
       Adj. const:    b[_cons]=-66.45554       wrong_SE[cons]= 5.646372        correct_SE[cons]= 17.28598

      You can see that all estimated coefficients now differ between the two approaches (i.e., "non-transformed constant" and "transformed constant"). Which approach is correct?
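
      Why the transformed constant stops being constant can be sketched from the AR(1) transform coded in the logs above. The notation c(d) is introduced here only for illustration, with d the gap (in units of delta) between an observation and the preceding one in the same panel. Applying the transform to a variable that equals 1 everywhere gives, for every observation after the first,

      \[
      c(d) = \sqrt{1-\rho^2}\,\frac{1-\rho^{d}}{\sqrt{1-\rho^{2d}}}
           = \sqrt{1-\rho^2}\,\sqrt{\frac{1-\rho^{d}}{1+\rho^{d}}}.
      \]

      With d = 1 for all observations this reduces to c(1) = 1-\rho, a constant that survives the within transformation only through the added grand mean, which is why dividing the intercept by (1-\rho) reproduces the transformed-constant estimate in the first example. With gaps, c(d) takes different values within a panel, so it behaves like an additional regressor rather than a rescaled intercept.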



      • #4
        First, I do not agree that the estimate of the constant is meaningful in the case of an unbalanced panel. Second, if you want to do the transformation and estimate using regress, then you must take into account that your variables after transformation are generated regressors, and there needs to be a degrees-of-freedom adjustment. I do not see that you apply such an adjustment in your illustration, but I may just be missing it. As I personally have no use for the estimate of the constant in FE models, if you believe that you have a case, compile a reproducible example and send it to Stata Technical Services (see https://www.stata.com/support/tech-support/contact/). If your illustrations have merit, they will let you know and make the appropriate adjustments to the command. You should update this thread for the benefit of users of the command once you hear back from them.
        Last edited by Andrew Musau; 24 Nov 2023, 16:43.



        • #5
          Thank you, Andrew. Writing to Stata Technical Services is a very good idea, also because this is a technical issue. Anyway, my code always adjusts the SE for the degrees of freedom in the last display command:
          Code:
           `sigma'/e(rmse)
          In fact, I showed in both examples how to reproduce exactly the point estimates and SEs of the constant and of the coefficient on "mvalue" in the output of xtregar, fe. The question is whether this command is correct or not, because different approaches for the constant term may lead to different estimates of the other regression coefficients (irrespective of whether you think that the constant is meaningful or not). Thank you again for your advice.



          • #6
            As I said, send the query to Tech Support. The whole thing looks backwards to me. You first run xtregar to obtain estimates of rho and sigma_e, do the transformation, run regress and then adjust the regress estimates using the estimates you got from the same command that you claim gets the calculations wrong. To make a compelling case, do everything outside xtregar and then show your estimates are different from those of xtregar.
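
            One way to start doing "everything outside xtregar" is to estimate rho from ordinary fixed-effects residuals. The lines below are a minimal sketch under stated assumptions, not a reproduction of what xtregar does internally: the variable name ehat and the scalar name rho_out are hypothetical, and the estimator shown (a regression of the within residual on its own lag) need not coincide with the Durbin-Watson-based estimate that xtregar uses by default.

```stata
* Sketch only: estimate rho from within residuals, without calling xtregar
webuse grunfeld, clear
xtset company year
* Ordinary fixed-effects fit, ignoring serial correlation
quietly xtreg invest mvalue kstock, fe
* Idiosyncratic residuals e_it from the within regression
predict double ehat, e
* Regress the residual on its own lag, without a constant
quietly regress ehat L.ehat, noconstant
scalar rho_out = _b[L.ehat]
display "rho estimated outside xtregar: " %9.7g rho_out
```

            The resulting rho_out could then replace e(rho_ar) in the transformation code earlier in the thread, so that the comparison with xtregar no longer borrows any quantity from the command being checked.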
