  • To "xi" or not to "xi"?

    Dear All, I know that, in general, whether or not to add a prefix such as "xi" does not matter for estimation. However, I was asked about a case in which adding "xi" does alter the results. The data are here:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long id double year long industry double(y x)
     2 2004 63  .756728  .8355138047833333
     2 2005 63  .751015  .6496188354333334
     2 2006 63 1.385951  .7327375125666667
     2 2007 63 1.898914        .8224521987
     2 2008 63  .581806      1.79040523525
     2 2009 63  .842814 1.2762391918083331
     2 2010 63  .419548  .9888817151916668
     2 2011 63  .272164 1.7063635349166666
     2 2012 63  .294346 2.4439829506666664
     2 2013 63  .228051 1.1389735984916667
     2 2014 63   .30241     1.236348807075
     2 2015 63  .442573  1.812866477783333
     2 2016 63  .273578  3.648327967416666
     2 2017 63  .294642  3.638739865583334
     4 2004  .        .                  .
     4 2005  .        .                  .
     4 2006  .        .                  .
     4 2007  .        .                  .
     4 2008  .        .                  .
     4 2009  .        .                  .
     4 2010  .        .                  .
     4 2011  .        .                  .
     4 2012  .        .                  .
     4 2013  .        .                  .
     4 2014  .        .                  .
     4 2015  .        .                  .
     4 2016  .        .                  .
     4 2017  .        .                  .
     5 2004  .        .                  .
     5 2005  .        .                  .
     5 2006  .        .                  .
     5 2007  .        .                  .
     5 2008  .        .                  .
     5 2009  .        .                  .
     5 2010  .        .                  .
     5 2011  .        .                  .
     5 2012  .        .                  .
     5 2013  .        .                  .
     5 2014  .        .                  .
     5 2015  .        .                  .
     5 2016  .        .                  .
     5 2017  .        .                  .
     6 2004 63   .29287  .8355138047833333
     6 2005 63  .505632  .6496188354333334
     6 2006 63 1.798777  .7327375125666667
     6 2007 63 1.198202        .8224521987
     6 2008 63  .422264      1.79040523525
     6 2009 63  .790204 1.2762391918083331
     6 2010 63  .618863  .9888817151916668
     6 2011 63  .488526 1.7063635349166666
     6 2012 63  .682689 2.4439829506666664
     6 2013 63  .669052 1.1389735984916667
     6 2014 63  .809775     1.236348807075
     6 2015 63 1.223162  1.812866477783333
     6 2016 63  .968647  3.648327967416666
     6 2017  .        .  3.638739865583334
     7 2004  .        .                  .
     7 2005  .        .                  .
     7 2006  .        .                  .
     7 2007  .        .                  .
     7 2008  .        .                  .
     7 2009  .        .                  .
     7 2010  .        .                  .
     7 2011  .        .                  .
     7 2012  .        .                  .
     7 2013  .        .                  .
     7 2014  .        .                  .
     7 2015  .        .                  .
     7 2016  .        .                  .
     7 2017  .        .                  .
     8 2004  .        .                  .
     8 2005  .        .                  .
     8 2006  .        .                  .
     8 2007  .        .                  .
     8 2008  .        .                  .
     8 2009  .        .                  .
     8 2010  .        .                  .
     8 2011  .        .                  .
     8 2012  .        .                  .
     8 2013  .        .                  .
     8 2014  .        .                  .
     8 2015  .        .                  .
     8 2016  .        .                  .
     8 2017  .        .                  .
     9 2004 79  .676721  .8355138047833333
     9 2005 79  .467796  .6496188354333334
     9 2006 79   .75409  .7327375125666667
     9 2007 79 2.684242        .8224521987
     9 2008 79  .701192      1.79040523525
     9 2009 79 1.664904 1.2762391918083331
     9 2010 79 1.864633  .9888817151916668
     9 2011 79 1.034766 1.7063635349166666
     9 2012 79   .75898 2.4439829506666664
     9 2013 79  .911308 1.1389735984916667
     9 2014 79 1.398672     1.236348807075
     9 2015 79  1.65417  1.812866477783333
     9 2016 79 1.099003  3.648327967416666
     9 2017 79  .610722  3.638739865583334
    10 2004  .        .                  .
    10 2005  .        .                  .
    10 2006  .        .                  .
    10 2007  .        .                  .
    10 2008  .        .                  .
    10 2009  .        .                  .
    10 2010  .        .                  .
    10 2011  .        .                  .
    10 2012  .        .                  .
    10 2013  .        .                  .
    10 2014  .        .                  .
    10 2015  .        .                  .
    10 2016  .        .                  .
    10 2017  .        .                  .
    11 2004  .        .                  .
    11 2005  .        .                  .
    11 2006  .        .                  .
    11 2007  .        .                  .
    11 2008  .        .                  .
    11 2009  .        .                  .
    11 2010  .        .                  .
    11 2011  .        .                  .
    11 2012  .        .                  .
    11 2013  .        .                  .
    11 2014  .        .                  .
    11 2015  .        .                  .
    11 2016  .        .                  .
    11 2017  .        .                  .
    12 2004 28 1.032094  .8355138047833333
    12 2005 28  .702338  .6496188354333334
    12 2006 28 1.422744  .7327375125666667
    12 2007 28 2.440828        .8224521987
    12 2008 28  .811734      1.79040523525
    12 2009 28 1.779759 1.2762391918083331
    12 2010 28 2.745457  .9888817151916668
    12 2011 28 1.053197 1.7063635349166666
    12 2012 28  1.05299 2.4439829506666664
    12 2013 28 1.019698 1.1389735984916667
    12 2014 28 1.054209     1.236348807075
    12 2015 28 1.492859  1.812866477783333
    12 2016 28 1.211967  3.648327967416666
    12 2017 28  .946029  3.638739865583334
    14 2004 63  .430889  .8355138047833333
    14 2005 63  .594332  .6496188354333334
    14 2006 63  .945943  .7327375125666667
    14 2007  .        .        .8224521987
    14 2008 63  .890433      1.79040523525
    14 2009 63 1.794807 1.2762391918083331
    14 2010 63  1.16601  .9888817151916668
    14 2011 63  .549908 1.7063635349166666
    14 2012 63  .846552 2.4439829506666664
    14 2013 63 1.093508 1.1389735984916667
    14 2014 63 1.244544     1.236348807075
    14 2015 63 2.120411  1.812866477783333
    14 2016 63  2.23855  3.648327967416666
    14 2017 63 1.423093  3.638739865583334
    16 2004 37  .306983  .8355138047833333
    16 2005 37  .228051  .6496188354333334
    16 2006 37  .232354  .7327375125666667
    16 2007 37  .494927        .8224521987
    16 2008 37   .29909      1.79040523525
    16 2009 37  .559564 1.2762391918083331
    16 2010 37  .298449  .9888817151916668
    16 2011 37  .228051 1.7063635349166666
    16 2012 37  .228051 2.4439829506666664
    16 2013 37  .257594 1.1389735984916667
    16 2014 37   .35575     1.236348807075
    16 2015 37  .983892  1.812866477783333
    16 2016 37  .546826  3.648327967416666
    16 2017 37  .504983  3.638739865583334
    17 2004  .        .                  .
    17 2005  .        .                  .
    17 2006  .        .                  .
    17 2007  .        .                  .
    17 2008  .        .                  .
    17 2009  .        .                  .
    17 2010  .        .                  .
    17 2011  .        .                  .
    17 2012  .        .                  .
    17 2013  .        .                  .
    17 2014  .        .                  .
    17 2015  .        .                  .
    17 2016  .        .                  .
    17 2017  .        .                  .
    18 2004  .        .                  .
    18 2005  .        .                  .
    18 2006  .        .                  .
    18 2007  .        .                  .
    18 2008  .        .                  .
    18 2009  .        .                  .
    18 2010  .        .                  .
    18 2011  .        .                  .
    18 2012  .        .                  .
    18 2013  .        .                  .
    18 2014  .        .                  .
    18 2015  .        .                  .
    18 2016  .        .                  .
    18 2017  .        .                  .
    19 2004 14 2.209158  .8355138047833333
    19 2005 14 1.927555  .6496188354333334
    19 2006 14 2.130961  .7327375125666667
    19 2007 14 8.873074        .8224521987
    19 2008 14 2.068874      1.79040523525
    19 2009 14 3.652264 1.2762391918083331
    19 2010 14 3.845202  .9888817151916668
    19 2011 14 2.133376 1.7063635349166666
    19 2012 14 1.965606 2.4439829506666664
    19 2013 14 1.616796 1.1389735984916667
    19 2014 14 2.345149     1.236348807075
    19 2015 14 5.605507  1.812866477783333
    19 2016 14 6.972619  3.648327967416666
    19 2017  .        .  3.638739865583334
    20 2004  .        .                  .
    20 2005  .        .                  .
    20 2006  .        .                  .
    20 2007  .        .                  .
    20 2008  .        .                  .
    20 2009  .        .                  .
    20 2010  .        .                  .
    20 2011  .        .                  .
    20 2012  .        .                  .
    20 2013  .        .                  .
    20 2014  .        .                  .
    20 2015  .        .                  .
    20 2016  .        .                  .
    20 2017  .        .                  .
    21 2004 37 1.709002  .8355138047833333
    21 2005 37 1.373025  .6496188354333334
    21 2006 37 1.841413  .7327375125666667
    21 2007 37 2.871234        .8224521987
    21 2008 37  .843067      1.79040523525
    21 2009 37 2.093308 1.2762391918083331
    21 2010 37 2.327557  .9888817151916668
    21 2011 37  .658607 1.7063635349166666
    21 2012 37  .613905 2.4439829506666664
    21 2013 37  .569941 1.1389735984916667
    21 2014 37  .738986     1.236348807075
    21 2015 37 1.251923  1.812866477783333
    21 2016 37 1.200638  3.648327967416666
    21 2017 37  .875048  3.638739865583334
    22 2004 55 2.963965  .8355138047833333
    22 2005 55  2.06641  .6496188354333334
    22 2006 55 2.817894  .7327375125666667
    22 2007 55 3.534271        .8224521987
    22 2008 55 1.347176      1.79040523525
    22 2009 55  2.04253 1.2762391918083331
    22 2010 55 1.684386  .9888817151916668
    22 2011 55 1.012511 1.7063635349166666
    22 2012 53 1.085851 2.4439829506666664
    22 2013 53 1.478983 1.1389735984916667
    22 2014 53 2.123118     1.236348807075
    22 2015 53 1.933791  1.812866477783333
    end
    label values industry industry
    label def industry 14 "C15", modify
    label def industry 28 "C30", modify
    label def industry 37 "C39", modify
    label def industry 53 "G55", modify
    label def industry 55 "G58", modify
    label def industry 63 "K70", modify
    label def industry 79 "S90", modify
    I estimate the following pairs of regressions:
    Code:
    xtset id year
    tab year, gen(dyear)
    
    // L.x (not OK)
    xtreg y L.x i.year i.industry, fe cluster(id)
    xi: xtreg y L.x i.year i.industry, fe cluster(id)
    You will find that their outcomes differ. I believe this is caused by the inclusion of different year dummies. But why does this happen? The following pair is OK, though.
    Code:
    // L.x (OK)
    xtreg y L.x dyear* i.industry, fe cluster(id)
    xi: xtreg y L.x dyear* i.industry, fe cluster(id)
    Ho-Chuan (River) Huang
    Stata 17.0, MP(4)

  • #2
    Also note that it is OK in the following case.
    Code:
    webuse grunfeld, clear
    xtset company year
    xtreg invest L.mvalue i.year, fe cluster(company)
    xi: xtreg invest L.mvalue i.year, fe cluster(company)
    Since the "grunfeld" data set is balanced, I suspect that my question above is related to missing values as well.
    Ho-Chuan (River) Huang
    Stata 17.0, MP(4)


    • #3
      You answered your own question: the results differ because of differing patterns of which variables get omitted due to collinearity.

      The important thing to understand, however, is that the model as a whole is unidentified. The omission of some of the variables is how the model becomes identifiable and estimates can be made. Regardless of which variables are omitted for this purpose, the model itself is unchanged, and, importantly, the model's predictions are unchanged. Notice, for example, that all three of the R2 statistics are the same in the two models. So are the estimates of sigma_u, sigma_e, and rho. And, if you calculate the predicted outcomes for the observations, they, too, are identical:

      Code:
      xtreg y L.x i.year i.industry, fe cluster(id)
      predict yhat1, xbu
      xi: xtreg y L.x i.year i.industry, fe cluster(id)
      predict yhat2, xbu
      
      assert yhat1 == yhat2
      The point is that things like the coefficients are not identifiable. There is no way to say that one of these sets of results is right and the other is wrong. They are both "wrong" in the sense that they are showing you numbers that are artifacts of how the collinearities were resolved, and the parameters you would like to think they represent are unidentifiable. So none of those are meaningful. But aggregate statistics for the model as a whole are identifiable, and come out the same either way.
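
      One possible source of the non-identifiability can be checked directly. In the data listing in #1, x appears to take a single value per year across all ids (for example, .8355138047833333 in 2004), in which case L.x would be an exact linear combination of the year dummies. A minimal sketch of that check, assuming the -dataex- example from #1 is in memory:

      ```stata
      * Assumption: the -dataex- example from #1 has been loaded.
      * For each year, show the mean, standard deviation, and count of x.
      tabulate year, summarize(x)
      ```

      If the standard deviation is zero (or numerically negligible) in every year, x has no within-year variation, and its coefficient cannot be separated from a full set of year effects.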

      So it really has nothing to do with -xi- per se. It's just that when you use -xi-, Stata selects different variables to omit than when you use factor variable notation. But if you modified your factor variable notation by specifying different base values for the variables, you would see similar changes. It all boils down to the unidentifiability of the model.
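
      As a rough illustration of the base-value point, the sketch below refits the model with two different base years using the ib#. operator and compares the predictions. It assumes the -dataex- example from #1 is in memory and -xtset id year- has been run; the choice of 2005 and 2016 as base years and the names yhat_b2005, yhat_b2016, and yhat_diff are purely illustrative.

      ```stata
      * Assumption: data from #1 in memory and -xtset id year- already run.
      * Fit with 2005 as the base year and store the predictions.
      xtreg y L.x ib2005.year i.industry, fe cluster(id)
      predict double yhat_b2005, xbu

      * Refit with 2016 as the base year and store the predictions.
      xtreg y L.x ib2016.year i.industry, fe cluster(id)
      predict double yhat_b2016, xbu

      * Inspect the difference between the two sets of predictions.
      generate double yhat_diff = yhat_b2005 - yhat_b2016
      summarize yhat_diff
      ```

      The reported coefficients may change between the two fits, while the differences in predictions should be zero up to rounding error if the argument above holds.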


      • #4
        Dear Clyde, Thanks, and I got your point.
        Ho-Chuan (River) Huang
        Stata 17.0, MP(4)


        • #5
          Dear Clyde, On second thought, it seems to me that the omission of some of the variables (dummies) can alter the intercept but should not influence the slope coefficient. Any comments?
          Ho-Chuan (River) Huang
          Stata 17.0, MP(4)


          • #6
            The results are:
            Code:
            . // L.x (not OK)
            . xtreg y L.x i.year, fe cluster(id)
            note: 2017.year omitted because of collinearity
            
            Fixed-effects (within) regression               Number of obs     =        112
            Group variable: id                              Number of groups  =          9
            
            R-sq:                                           Obs per group:
                 within  = 0.4256                                         min =         11
                 between = 0.0178                                         avg =       12.4
                 overall = 0.2237                                         max =         13
            
                                                            F(8,8)            =          .
            corr(u_i, Xb)  = 0.0058                         Prob > F          =          .
            
                                                 (Std. Err. adjusted for 9 clusters in id)
            ------------------------------------------------------------------------------
                         |               Robust
                       y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                       x |
                     L1. |   .0616659   .0852046     0.72   0.490    -.1348163     .258148
                         |
                    year |
                   2006  |   .5352382   .1280754     4.18   0.003     .2398957    .8305806
                   2007  |   2.045066   .7665683     2.67   0.028     .2773564    3.812776
                   2008  |  -.0714743   .1234297    -0.58   0.578    -.3561038    .2131551
                   2009  |   .6748936   .1653097     4.08   0.004     .2936888    1.056098
                   2010  |   .6788168    .311582     2.18   0.061    -.0396925    1.397326
                   2011  |  -.1411296   .1775224    -0.79   0.450    -.5504969    .2682378
                   2012  |  -.1744999   .1405368    -1.24   0.250    -.4985783    .1495785
                   2013  |  -.1848791   .1607079    -1.15   0.283    -.5554721    .1857139
                   2014  |    .176449   .1605802     1.10   0.304    -.1938495    .5467475
                   2015  |   .8744082     .39842     2.19   0.059      -.04435    1.793166
                   2016  |   .8635172   .5680503     1.52   0.167    -.4464092    2.173444
                   2017  |          0  (omitted)
                         |
                   _cons |   .8822517   .2683177     3.29   0.011       .26351    1.500993
            -------------+----------------------------------------------------------------
                 sigma_u |  .93546528
                 sigma_e |    .758637
                     rho |  .60325381   (fraction of variance due to u_i)
            ------------------------------------------------------------------------------
            
            . xi: xtreg y L.x i.year, fe cluster(id)
            i.year            _Iyear_2004-2017    (naturally coded; _Iyear_2004 omitted)
            note: _Iyear_2016 omitted because of collinearity
            note: _Iyear_2017 omitted because of collinearity
            
            Fixed-effects (within) regression               Number of obs     =        112
            Group variable: id                              Number of groups  =          9
            
            R-sq:                                           Obs per group:
                 within  = 0.4256                                         min =         11
                 between = 0.0178                                         avg =       12.4
                 overall = 0.2237                                         max =         13
            
                                                            F(8,8)            =          .
            corr(u_i, Xb)  = 0.0058                         Prob > F          =          .
            
                                                 (Std. Err. adjusted for 9 clusters in id)
            ------------------------------------------------------------------------------
                         |               Robust
                       y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                       x |
                     L1. |  -.4087974   .2462526    -1.66   0.135     -.976657    .1590622
                         |
             _Iyear_2005 |  -1.323326   .8705276    -1.52   0.167    -3.330766    .6841146
             _Iyear_2006 |  -.8755444   .9662218    -0.91   0.391    -3.103656    1.352567
             _Iyear_2007 |   .6733878   .2789429     2.41   0.042     .0301443    1.316631
             _Iyear_2008 |  -1.400945   .8391488    -1.67   0.134    -3.336026    .5341356
             _Iyear_2009 |  -.1991908   .5065571    -0.39   0.704    -1.367314    .9689319
             _Iyear_2010 |  -.4371638   .6537928    -0.67   0.523    -1.944813    1.070485
             _Iyear_2011 |  -1.392301   .7934816    -1.75   0.117    -3.222073    .4374706
             _Iyear_2012 |  -1.088123   .6329958    -1.72   0.124    -2.547814    .3715681
             _Iyear_2013 |  -.7514792   .4926841    -1.53   0.166    -1.887611    .3846524
             _Iyear_2014 |   -1.00411   .7458202    -1.35   0.215    -2.723975    .7157544
             _Iyear_2015 |  -.2603394   .3940619    -0.66   0.527    -1.169048    .6483689
             _Iyear_2016 |          0  (omitted)
             _Iyear_2017 |          0  (omitted)
                   _cons |   2.598656   .9004006     2.89   0.020     .5223285    4.674984
            -------------+----------------------------------------------------------------
                 sigma_u |  .93546528
                 sigma_e |    .758637
                     rho |  .60325381   (fraction of variance due to u_i)
            ------------------------------------------------------------------------------
            Ho-Chuan (River) Huang
            Stata 17.0, MP(4)


            • #7
              The difference in results is coming from your data. Because of missing values, you are not estimating with the same data set in the two cases. I simplified the estimation commands to remove inessentials and obtained the same results as you did.
              Code:
              . tab year, gen(yr)
              
                     year |      Freq.     Percent        Cum.
              ------------+-----------------------------------
                     2004 |         18        7.20        7.20
                     2005 |         18        7.20       14.40
                     2006 |         18        7.20       21.60
                     2007 |         18        7.20       28.80
                     2008 |         18        7.20       36.00
                     2009 |         18        7.20       43.20
                     2010 |         18        7.20       50.40
                     2011 |         18        7.20       57.60
                     2012 |         18        7.20       64.80
                     2013 |         18        7.20       72.00
                     2014 |         18        7.20       79.20
                     2015 |         18        7.20       86.40
                     2016 |         17        6.80       93.20
                     2017 |         17        6.80      100.00
              ------------+-----------------------------------
                    Total |        250      100.00
              
              . xtreg y L.x yr2-yr12, fe
              
              Fixed-effects (within) regression               Number of obs     =        112
              Group variable: id                              Number of groups  =          9
              
              R-sq:                                           Obs per group:
                   within  = 0.4256                                         min =         11
                   between = 0.0178                                         avg =       12.4
                   overall = 0.2237                                         max =         13
              
                                                              F(12,91)          =       5.62
              corr(u_i, Xb)  = 0.0058                         Prob > F          =     0.0000
              
              ------------------------------------------------------------------------------
                         y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                         x |
                       L1. |  -.4087974   .2245499    -1.82   0.072    -.8548381    .0372433
                           |
                       yr2 |  -1.323326   .5112663    -2.59   0.011    -2.338893    -.307758
                       yr3 |  -.8755444   .5440859    -1.61   0.111    -1.956304    .2052155
                       yr4 |   .6733878   .5372659     1.25   0.213    -.3938249    1.740601
                       yr5 |  -1.400945     .51353    -2.73   0.008    -2.421009   -.3808808
                       yr6 |  -.1991908   .3720255    -0.54   0.594    -.9381737    .5397921
                       yr7 |  -.4371638   .4395471    -0.99   0.323     -1.31027    .4359424
                       yr8 |  -1.392301   .4852227    -2.87   0.005    -2.356137   -.4284661
                       yr9 |  -1.088123   .3814934    -2.85   0.005    -1.845913    -.330333
                      yr10 |  -.7514792   .3274588    -2.29   0.024    -1.401936   -.1010226
                      yr11 |   -1.00411   .4608035    -2.18   0.032     -1.91944   -.0887806
                      yr12 |  -.2603394   .4456091    -0.58   0.561    -1.145487    .6248083
                     _cons |   2.598656   .6168557     4.21   0.000     1.373348    3.823964
              -------------+----------------------------------------------------------------
                   sigma_u |  .93546528
                   sigma_e |    .758637
                       rho |  .60325381   (fraction of variance due to u_i)
              ------------------------------------------------------------------------------
              F test that all u_i=0: F(8, 91) = 18.37                      Prob > F = 0.0000
              
              . xtreg y L.x yr3-yr13, fe
              
              Fixed-effects (within) regression               Number of obs     =        112
              Group variable: id                              Number of groups  =          9
              
              R-sq:                                           Obs per group:
                   within  = 0.4256                                         min =         11
                   between = 0.0178                                         avg =       12.4
                   overall = 0.2237                                         max =         13
              
                                                              F(12,91)          =       5.62
              corr(u_i, Xb)  = 0.0058                         Prob > F          =     0.0000
              
              ------------------------------------------------------------------------------
                         y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                         x |
                       L1. |   .0616659   .1433616     0.43   0.668    -.2231044    .3464361
                           |
                       yr3 |   .5352382   .3702138     1.45   0.152    -.2001461    1.270622
                       yr4 |   2.045066   .3760629     5.44   0.000     1.298063    2.792069
                       yr5 |  -.0714743   .3584592    -0.20   0.842    -.7835094    .6405608
                       yr6 |   .6748936   .3212754     2.10   0.038     .0367194    1.313068
                       yr7 |   .6788168   .3344374     2.03   0.045      .014498    1.343136
                       yr8 |  -.1411296    .348433    -0.41   0.686    -.8332489    .5509897
                       yr9 |  -.1744999   .3223129    -0.54   0.590     -.814735    .4657351
                      yr10 |  -.1848791    .328532    -0.56   0.575    -.8374675    .4677094
                      yr11 |    .176449   .3405734     0.52   0.606    -.5000582    .8529562
                      yr12 |   .8744082   .3361134     2.60   0.011     .2067602    1.542056
                      yr13 |   .8635172   .3336195     2.59   0.011     .2008231    1.526211
                     _cons |   .8822517   .3411809     2.59   0.011     .2045377    1.559966
              -------------+----------------------------------------------------------------
                   sigma_u |  .93546528
                   sigma_e |    .758637
                       rho |  .60325381   (fraction of variance due to u_i)
              ------------------------------------------------------------------------------
              F test that all u_i=0: F(8, 91) = 18.37                      Prob > F = 0.0000
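
              To see which observations each of the two fits actually uses, one option is to flag the estimation sample after each command and cross-tabulate the flags. This is only a sketch: it assumes the -dataex- example from #1 is in memory, -xtset id year- has been run, and yr1-yr14 exist from -tab year, gen(yr)-. The names sample_a and sample_b are hypothetical.

              ```stata
              * Assumption: data from #1 in memory, -xtset id year- run, yr1-yr14 created.
              * Flag the observations used by the first specification.
              quietly xtreg y L.x yr2-yr12, fe
              generate byte sample_a = e(sample)

              * Flag the observations used by the second specification.
              quietly xtreg y L.x yr3-yr13, fe
              generate byte sample_b = e(sample)

              * Cross-tabulate the two flags to compare the estimation samples.
              tabulate sample_a sample_b
              ```

              Any off-diagonal counts in the table would indicate observations used in one fit but not the other.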


              • #8
                On edit: If I do the same thing with the grunfeld data set, I get identical results for the coefficients on mvalue and kstock with invest as dependent variable.


                • #9
                  Maybe the following example helps:

                  Code:
                  clear
                  input float(y x dummy)
                  1 1 0
                  2 1 0
                  3 . 1
                  4 . 1
                  end
                  
                  regress y x dummy

                  Here, the values of x are missing whenever the dummy equals 1. Because of listwise deletion of missing values, the dummy is effectively zero for every observation in the estimation sample. With a constant in the model, the x variable will additionally be omitted because of collinearity, since x is constant in the remaining observations.

                  Code:
                  . regress y x dummy
                  note: x omitted because of collinearity
                  note: dummy omitted because of collinearity
                  
                        Source |       SS           df       MS      Number of obs   =         2
                  -------------+----------------------------------   F(0, 1)         =      0.00
                         Model |           0         0           .   Prob > F        =         .
                      Residual |          .5         1          .5   R-squared       =    0.0000
                  -------------+----------------------------------   Adj R-squared   =    0.0000
                         Total |          .5         1          .5   Root MSE        =    .70711
                  
                  ------------------------------------------------------------------------------
                             y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                  -------------+----------------------------------------------------------------
                             x |          0  (omitted)
                         dummy |          0  (omitted)
                         _cons |        1.5         .5     3.00   0.205    -4.853102    7.853102
                  ------------------------------------------------------------------------------
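
                  The estimation sample of the toy regression can be inspected with e(sample). A small sketch, assuming it is run immediately after the -regress- command on the four-observation data:

                  ```stata
                  * Assumption: run right after -regress y x dummy- on the toy data.
                  * Show which values of dummy survive listwise deletion.
                  tabulate dummy if e(sample)

                  * Show that x does not vary among the observations actually used.
                  summarize x if e(sample)
                  ```

                  Both variables being constant within the estimation sample is what makes them collinear with the constant term.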
                  You can verify that you have something similar to the above:

                  Code:
                  . tab year, gen(yr)
                  
                         year |      Freq.     Percent        Cum.
                  ------------+-----------------------------------
                         2004 |         18        7.20        7.20
                         2005 |         18        7.20       14.40
                         2006 |         18        7.20       21.60
                         2007 |         18        7.20       28.80
                         2008 |         18        7.20       36.00
                         2009 |         18        7.20       43.20
                         2010 |         18        7.20       50.40
                         2011 |         18        7.20       57.60
                         2012 |         18        7.20       64.80
                         2013 |         18        7.20       72.00
                         2014 |         18        7.20       79.20
                         2015 |         18        7.20       86.40
                         2016 |         17        6.80       93.20
                         2017 |         17        6.80      100.00
                  ------------+-----------------------------------
                        Total |        250      100.00
                  
                  . gen lx=L.x
                  (135 missing values generated)
                  
                  . list yr1 lx if yr1==1
                  
                       +----------+
                       | yr1   lx |
                       |----------|
                    1. |   1    . |
                   15. |   1    . |
                   29. |   1    . |
                   43. |   1    . |
                   57. |   1    . |
                       |----------|
                   71. |   1    . |
                   85. |   1    . |
                   99. |   1    . |
                  113. |   1    . |
                  127. |   1    . |
                       |----------|
                  141. |   1    . |
                  155. |   1    . |
                  169. |   1    . |
                  183. |   1    . |
                  197. |   1    . |
                       |----------|
                  211. |   1    . |
                  225. |   1    . |
                  239. |   1    . |
                       +----------+
                  The default for factor variables in Stata is to omit the minimum category. Stata thus recognizes that x1 in your data is 0 and resorts to designating 2005 (the second year) as the base, but still runs into collinearity issues in estimating the other dummies.

                  Code:
                  qui xtreg y L.x i.year, fe cluster(id)
                  mat list e(b)
                  
                  . mat list e(b)
                  
                  e(b)[1,15]
                               L.      2005b.       2006.       2007.       2008.       2009.       2010.       2011.       2012.       2013.
                               x        year        year        year        year        year        year        year        year        year
                  y1   .06166586           0   .53523815    2.045066  -.07147432   .67489357   .67881684  -.14112956  -.17449992  -.18487908
                  
                            2014.       2015.       2016.      2017o.            
                            year        year        year        year       _cons
                  y1     .176449   .87440816   .86351722           0   .88225171
                  On the other hand, by choosing a different base in your data and omitting x1, it is possible not to run into collinearity issues, and that is what happens when using the xi prefix or generating dummies by hand. The bottom line is that you need \(T-1\) dummies for years, and with one dummy effectively equal to 0, there is a problem with your data (or defined sample). reghdfe (SSC, by Sergio Correia), which does within estimation (demeaning), would have immediately signaled this to you.

                  Code:
                  . reghdfe y L.x, absorb(id year) cluster(id)
                  (converged in 6 iterations)
                  note: L.x omitted because of collinearity
                  
                  HDFE Linear regression                            Number of obs   =        112
                  Absorbing 2 HDFE groups                           F(   0,      8) =       0.00
                  Statistics robust to heteroskedasticity           Prob > F        =          .
                                                                    R-squared       =     0.7048
                                                                    Adj R-squared   =     0.6399
                                                                    Within R-sq.    =     0.0000
                  Number of clusters (id)      =          9         Root MSE        =     0.7586
                  
                                                       (Std. Err. adjusted for 9 clusters in id)
                  ------------------------------------------------------------------------------
                               |               Robust
                             y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                  -------------+----------------------------------------------------------------
                             x |
                           L1. |          0  (omitted)
                  ------------------------------------------------------------------------------
                  
                  Absorbed degrees of freedom:
                  ---------------------------------------------------------------+
                   Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     | 
                  -------------+-------------------------------------------------|
                            id |            0               9              9 *   | 
                          year |           12              13              1     | 
                  ---------------------------------------------------------------+
                  * = fixed effect nested within cluster; treated as redundant for DoF computation
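
                  One quick way to see which years actually enter the estimation is to tabulate year over the estimation sample. This is only a sketch, assuming the data are already xtset by id and year as in the regressions above. Because L.x is missing in the first year of every panel, 2004 should not show up in the table.

                  ```stata
                  * Sketch: refit the model quietly, then tabulate year
                  * over the observations that were actually used
                  quietly xtreg y L.x i.year, fe cluster(id)
                  tabulate year if e(sample)
                  ```

                  The number of distinct years in that table is the \(T\) that matters for the \(T-1\) rule, not the number of years in the full dataset.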

                  Comment


                  • #10
                    Dear Eric, Thanks for the examples. I think they are quite similar to those I mentioned above. Say,
                    Code:
                    // L.x (OK)
                    xtreg y L.x dyear* i.industry, fe cluster(id)
                    xi: xtreg y L.x dyear* i.industry, fe cluster(id)
                    Ho-Chuan (River) Huang
                    Stata 17.0, MP(4)

                    Comment


                    • #11
                      Dear Andrew, Many thanks for this interesting example.
                      Ho-Chuan (River) Huang
                      Stata 17.0, MP(4)

                      Comment


                      • #12
                        Dear River Huang, in reply to your post #10, my point was that it has nothing to do with the use of -xi- or -i.-. Your use of -xi- in one case and of -i.- in the other gave that impression.

                        Comment


                        • #13
                          Dear Eric, Thanks. I think that I agree with your explanation that, due to different samples used in the estimation, the results with "xi" are not identical to those without "xi". However, I did not see how to delete "something" to make them identical. Any suggestions?
                          Ho-Chuan (River) Huang
                          Stata 17.0, MP(4)

                          Comment


                          • #14
                            River Huang - I do not see what the point is, because the coefficient on lagged x is not identified: it is collinear with the time dummies. As I show in #9, the use of year dummies masks this collinearity, but with reghdfe you immediately see that you cannot obtain an estimate. To replicate the regressions using year dummies, just specify the base level in the factor variable regression.

                            Code:
                            xtreg y L.x ib2016.year, fe cluster(id)
                            xi: xtreg y L.x i.year, fe cluster(id)
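
                            To check the two specifications side by side, one could print the coefficient on L.x and the sample size after each fit. This is a minimal sketch under the same assumptions as the commands above (data xtset by id and year); whether the numbers match depends on which year dummy ends up omitted in each run.

                            ```stata
                            * Sketch: fit each version quietly and display the L.x coefficient and N
                            quietly xtreg y L.x ib2016.year, fe cluster(id)
                            display "factor variables: b[L.x] = " _b[L.x] "   N = " e(N)

                            quietly xi: xtreg y L.x i.year, fe cluster(id)
                            display "xi prefix:        b[L.x] = " _b[L.x] "   N = " e(N)
                            ```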

                            Comment


                            • #15
                              Dear Andrew, Thank you so much for these explanations. It really helps.
                              Ho-Chuan (River) Huang
                              Stata 17.0, MP(4)

                              Comment
