System GMM - Time dummies

Jonathan Eberle

Join Date: Jan 2016

Posts: 8
#1

20 Sep 2016, 03:41

Dear Stata users,

I am using the System GMM approach for estimation and I want to introduce time dummies as well. For example as:

(1) xtabond2 y l.y yr*, robust small gmmstyle(y) ivstyle(yr*)

In the seminal papers, the time dummies are always included as:

(2) xtabond2 y l.y yr*, robust small gmmstyle(y) ivstyle(yr*, eq(level))

I do not understand the reason for that, because the time dummies do not disappear in the First-Difference equation (they range from -1 to 1 now, which is no problem as the relative range is the same as before). Therefore, in the second approach (as far as I do understand) the time dummies do not belong to the instrument matrix in the FD-equation and they are treated as endogenous. Is this interpretation right? And why isn't it better to use the first approach?

Thank you!
Best regards

Last edited by Jonathan Eberle; 20 Sep 2016, 04:04.
Tags: gmm, panel, time dummies
Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#2

20 Sep 2016, 06:14

Which "seminal papers" are you referring to?

The second approach is indeed the right way to go. Once you have controlled for the time dummies in the level equation, they become redundant in the first-differenced equation. In more technical words, you can write the moment conditions for the time dummies in the first-differenced equation as linear combinations of the moment conditions for the time dummies (and the intercept) in the level equation.

In finite samples, this equivalence does not hold exactly, in particular if your sample is unbalanced or if the weighting matrix is non-optimal or estimated. Even so, adding the time dummies as instruments for the first-differenced equation does not improve the finite-sample performance.

More severely, the standard GMM estimation commands in Stata (xtabond, xtdpdsys, xtdpd) as well as the user-written command xtabond2 compute the wrong number of instruments if both sets of dummies are included. Because of the way the estimators are implemented in Stata, the linear dependence of these instruments between the first-differenced and the level equation is not detected. (In fact, it is misleading to think about it as two equations because all instruments for the first-differenced equation can be transformed into instruments for the level equation. Despite this algebraic equivalence, this is usually not done as it is less intuitive to the reader.) This is a severe issue because the postestimation tests for the validity of the overidentifying restrictions (Sargan/Hansen test) will then be based on the wrong number of degrees of freedom (which depends on the number of overidentifying restrictions and hence on the number of instruments). What makes it really worrying is that the test becomes less conservative if the degrees of freedom are overreported.
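
As a small numerical illustration of the last point, take a made-up test statistic of 31 and compare 19 with 24 degrees of freedom (illustrative numbers, not taken from an actual estimation). The same statistic produces a larger p-value when more degrees of freedom are reported, so a violation of the overidentifying restrictions is less likely to be flagged. chi2tail() is Stata's built-in upper-tail chi-squared probability function.

```stata
* Illustrative values only: 31 is a made-up Sargan/Hansen statistic.
display "p-value with 19 df: " chi2tail(19, 31)
display "p-value with 24 df: " chi2tail(24, 31)

* The 5% critical values are roughly 30.14 (19 df) and 36.42 (24 df).
assert chi2tail(19, 31) < 0.05   // rejected at the 5% level
assert chi2tail(24, 31) > 0.05   // same statistic, no longer rejected
```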

https://www.kripfganz.de/stata/
Jonathan Eberle
#3

20 Sep 2016, 06:24

Dear Mr. Kripfganz,

thank you so much for your quick response! Now I do understand the difference between commands (1) and (2) and why I should use the latter. When I wrote "seminal papers", I was mainly referring to Mr. Roodman's paper introducing the xtabond2 command.

Again, thanks for your informative answer!

Best regards
Jonathan Eberle
#4

26 Oct 2016, 02:08

Dear all,

sorry for reviving this issue... I do understand why I have to choose the Stata command (2). But just to be sure, one more question about the technical implementation: are the first-differenced time dummies exogenous in the FD equation, despite the iv(yr*, eq(level)) specification? As far as I understand, one can write "the moment conditions for the time dummies in the first-differenced equation as linear combinations of the moment conditions for the time dummies (and the intercept) in the level equation" (see above). But does the xtabond2 command use this fact for the estimation of the FD equation, although the FD time dummies are not defined as instruments in command (2)?

Again, thank you for your comments!
Sebastian Kripfganz
#5

26 Oct 2016, 03:26

The confusion probably stems from the notion that there are two separate equations (first differences and levels) put together in a system. In fact, it is all just one model, as already mentioned in my previous post. Technically, all the instruments for the first-differenced equation can be rewritten as instruments for the level equation only, because the first-differenced equation is just a transformation of the level equation. That is, instead of transforming the equation, you can use an appropriate transformation of the instruments to obtain the same result. This is less intuitive, which is why the system-GMM estimator is usually presented as if there were two separate equations stacked above each other.
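
To sketch this in matrix notation (the symbols \(\mathbf{D}\) and \(\mathbf{Z}_{Di}\) are introduced here only for illustration): let \(\epsilon_i\) be the stacked vector of level errors for unit \(i\), let \(\mathbf{D}\) be the first-difference transformation matrix, so that \(\Delta \epsilon_i = \mathbf{D} \epsilon_i\), and let \(\mathbf{Z}_{Di}\) be a set of instruments for the first-differenced equation. Then
\[E [\mathbf{Z}_{Di}' \Delta \epsilon_i] = E [\mathbf{Z}_{Di}' \mathbf{D} \epsilon_i] = E [(\mathbf{D}' \mathbf{Z}_{Di})' \epsilon_i] = 0\]
so the transformed instruments \(\mathbf{D}' \mathbf{Z}_{Di}\), combined with the untransformed level errors, deliver exactly the same moment conditions.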

Now, if there is in fact just one equation (the model in levels), then it is clear once again why we only need the time dummies as instruments for this level equation. The first-differenced equation is not estimated separately, so there is no need to worry about your concerns.

Jonathan Eberle
#6

26 Oct 2016, 04:08

Dear Mister Kripfganz,

thanks again for your quick response. Considering the "system" as one equation makes it much easier for me to understand this issue!

Consequently, using System GMM and the ivstyle option for my exogenous regressors, I generally have to add the suboption eq(level).

Best regards
Sebastian Kripfganz

#7

26 Oct 2016, 07:39

Originally posted by Jonathan Eberle:

Consequently, using System GMM and the ivstyle option for my exogenous regressors, I generally have to add the suboption eq(level).

If your exogenous regressors are all uncorrelated with the unobserved effects (much as in a random-effects model), then you only need to specify them for the level equation to identify their corresponding coefficients. However, using their first-differences as instruments for the first-differenced equation (as if it was a separate equation) can help to obtain more efficient estimates for the other coefficients of the endogenous variables. This works because unlike the time dummies (which are identical for all the individuals) the moment conditions for the exogenous regressors from the first-differenced equation are usually not linear functions of the corresponding moment conditions for the level equation.

That said, the ivstyle() option of xtabond2 (and similarly the iv() option of the official command xtdpd) is a bit dangerous when used without the equation() suboption. It does not do what you might expect it to do (if you did not carefully read all the documentation), and it certainly does not do what the system-GMM literature, e.g. the seminal paper by Blundell and Bond (1998), suggests! Intuitively, you would expect to obtain the same result when you use the option ivstyle() without the equation() suboption as when you specify separate ivstyle() options for both the first-differenced and the level equation. The following example illustrates that this is not the case:

Code:

. webuse psidextract

. xtabond2 lwage L.lwage wks, gmmstyle(L.lwage) ivstyle(wks) nodiffsargan
Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.

Dynamic panel-data estimation, one-step system GMM
------------------------------------------------------------------------------
Group variable: id                              Number of obs      =      3570
Time variable : t                               Number of groups   =       595
Number of instruments = 22                      Obs per group: min =         6
Wald chi2(2)  =   4201.81                                      avg =      6.00
Prob > chi2   =     0.000                                      max =         6
------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       lwage |
         L1. |   .8307673   .0128247    64.78   0.000     .8056313    .8559034
             |
         wks |  -.0003379    .000814    -0.42   0.678    -.0019334    .0012576
       _cons |   1.233894   .0917736    13.44   0.000     1.054021    1.413767
------------------------------------------------------------------------------
Instruments for first differences equation
  Standard
    D.wks
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(1/6).L.lwage
Instruments for levels equation
  Standard
    wks
    _cons
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    D.L.lwage
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z = -18.34  Pr > z =  0.000
Arellano-Bond test for AR(2) in first differences: z =   3.63  Pr > z =  0.000
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(19)   = 293.56  Prob > chi2 =  0.000
  (Not robust, but not weakened by many instruments.)

. xtabond2 lwage L.lwage wks, gmmstyle(L.lwage) ivstyle(wks, equation(level)) ivstyle(wks, equation(diff)) nodiffsargan
Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.

Dynamic panel-data estimation, one-step system GMM
------------------------------------------------------------------------------
Group variable: id                              Number of obs      =      3570
Time variable : t                               Number of groups   =       595
Number of instruments = 23                      Obs per group: min =         6
Wald chi2(2)  =   4214.25                                      avg =      6.00
Prob > chi2   =     0.000                                      max =         6
------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       lwage |
         L1. |    .830922   .0128337    64.75   0.000     .8057685    .8560755
             |
         wks |   .0008446    .000685     1.23   0.218     -.000498    .0021872
       _cons |   1.177407   .0893918    13.17   0.000     1.002203    1.352612
------------------------------------------------------------------------------
Instruments for first differences equation
  Standard
    D.wks
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(1/6).L.lwage
Instruments for levels equation
  Standard
    wks
    _cons
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    D.L.lwage
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z = -18.37  Pr > z =  0.000
Arellano-Bond test for AR(2) in first differences: z =   3.64  Pr > z =  0.000
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(20)   = 300.35  Prob > chi2 =  0.000
  (Not robust, but not weakened by many instruments.)

As you can see, the second specification is based on one more instrument (23) than the first (22). At the same time, the list of instruments below the respective regression tables is identical. That is confusing, isn't it? This does not mean that the first specification is incorrect but it is just not doing what you probably think it does. (Without going too much into the details, the first specification is to some extent really treating the two equations as initially separate equations and then pools the first-differenced and the level observations.)
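
One way to picture the difference, based on a reading of how xtabond2 builds the instrument matrix by default (an assumption that is consistent with the instrument counts above, not something the output shows directly): stack the first-differenced observations of unit \(i\) on top of its level observations. Without the equation() suboption, wks would enter as a single instrument column, whereas the two separate ivstyle() options create two columns:
\[\begin{pmatrix}\Delta x_i\\x_i\end{pmatrix} \quad \text{versus} \quad \begin{pmatrix}\Delta x_i\\0\end{pmatrix}, \begin{pmatrix}0\\x_i\end{pmatrix}\]
Here \(x_i\) is shorthand for the vector of wks observations of unit \(i\) and \(\epsilon_i\) for the vector of level errors. The single column corresponds to one pooled moment condition, \(E [\Delta x_i' \Delta \epsilon_i + x_i' \epsilon_i] = 0\), while the two columns correspond to the separate conditions \(E [\Delta x_i' \Delta \epsilon_i] = 0\) and \(E [x_i' \epsilon_i] = 0\). That would account for 22 versus 23 instruments.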

My recommendation: NEVER use the ivstyle() option without the equation() suboption! If you want to have standard instruments for both equations, then simply specify them separately as in the second example above. That way, you can be sure that xtabond2 really does what you usually want it to do.

Jonathan Eberle
#8

26 Oct 2016, 08:12

Dear Mister Kripfganz,

thanks for the explanation and the hint regarding the suboptions of the ivstyle() option!

Unfortunately, I am somewhat confused about the first part of your post:

"However, using their first-differences as instruments for the first-differenced equation (as if it was a separate equation) can help to obtain more efficient estimates for the other coefficients of the endogenous variables. This works because unlike the time dummies (which are identical for all the individuals) the moment conditions for the exogenous regressors from the first-differenced equation are usually not linear functions of the corresponding moment conditions for the level equation."

As I understood it, it is sufficient to use the eq(level) suboption for exogenous regressors, because

"Technically, all the instruments for the first-differenced equation can be rewritten as instruments for the level equation only, because the first-differenced equation is just a transformation of the level equation."

and, for the time dummies,

"In more technical words, you can write the moment conditions for the time dummies in the first-differenced equation as linear combinations of the moment conditions for the time dummies (and the intercept) in the level equation."

So for any exogenous regressor, I would only have to include moment conditions (instruments) from the level equation, because they can be rewritten as instruments for the FD equation. I can't see the difference between them and the time dummies.

Best regards
Sebastian Kripfganz
#9

26 Oct 2016, 11:54

For simplicity, consider the static panel model with \(T = 3\) time periods,
\[y_{it} = x_{it} \beta + \gamma_t + \alpha + \epsilon_{it}\]
with the combined error term
\[\epsilon_{it} = u_i + e_{it}\]

The regressor matrix \(\mathbf{X}_i\) including the intercept and the time dummies (leaving out the first dummy as a reference group) looks as follows:
\[\mathbf{X}_i = \begin{pmatrix}x_{i1} & 0 & 0 & 1\\x_{i2} & 1 & 0 & 1\\x_{i3} & 0 & 1 & 1\end{pmatrix}\]

Say, all regressors are exogenous. The instruments for the level equation are just the variables in matrix \(\mathbf{X}_i\) itself. The corresponding moment conditions, column by column, are \(E [\sum_{t=1}^T x_{it} \epsilon_{it}] = 0\), \(E [\epsilon_{i2}] = 0\), \(E [\epsilon_{i3}] = 0\), and \(E [\sum_{t=1}^T \epsilon_{it}] = 0\).

In the first-differenced equation,
\[\Delta y_{it} = \Delta x_{it} \beta + \Delta \gamma_t + \Delta \epsilon_{it}\]
the corresponding moment conditions would be \(E [\sum_{t=2}^T \Delta x_{it} \Delta \epsilon_{it}] = 0\), \(E [\Delta \epsilon_{i2}] = 0\) and \(E [\Delta \epsilon_{i3}] = 0\).

Now observe for the latter that \(E [\Delta \epsilon_{i3}] = E [\epsilon_{i3}] - E [\epsilon_{i2}] = 0\) is already implied by the second and third moment condition from the model in levels. Similarly, you can write \(E [\Delta \epsilon_{i2}] = E [\epsilon_{i2}] - E [\epsilon_{i1}] = E [\epsilon_{i2}] - (E [\sum_{t=1}^T \epsilon_{it}] - E [\epsilon_{i2}] - E [\epsilon_{i3}]) = 0\) in terms of the moment conditions from the model in levels.

However, in general you cannot write the first moment condition from the first-differenced model in terms of the corresponding moment condition from the model in levels. Hence, it is not redundant as opposed to those implied by the time dummies.

My earlier comment - that technically the instruments for the first-differenced equation can be rewritten as instruments for the level equation only - does not imply that these rewritten instruments are identical to the usual instruments for the level equation. In fact, the first moment condition from the first-differenced equation can be rewritten as
\[E [\sum_{t=2}^T \Delta x_{it} \Delta \epsilon_{it}] = E [-\Delta x_{i2} \epsilon_{i1} + (\Delta x_{i2} - \Delta x_{i3}) \epsilon_{i2} + \Delta x_{i3} \epsilon_{i3}] = 0\]
which would correspond to an instrument for the level equation that looks as follows:
\[\begin{pmatrix}-\Delta x_{i2}\\\Delta x_{i2} - \Delta x_{i3}\\\Delta x_{i3}\end{pmatrix}\]
You cannot write this instrument as a linear combination of the columns of the matrix \(\mathbf{X}_i\).
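
To spell out this last step (a sketch; \(a\), \(b\), \(c\), and \(d\) are hypothetical coefficients on the four columns of \(\mathbf{X}_i\) that would have to be the same for all units \(i\)): a linear combination of the columns equal to the instrument would require
\[a x_{i1} + d = -\Delta x_{i2}, \quad a x_{i2} + b + d = \Delta x_{i2} - \Delta x_{i3}, \quad a x_{i3} + c + d = \Delta x_{i3}\]
for every \(i\). The first equation alone implies \(x_{i2} = (1 - a) x_{i1} - d\), an exact linear relation between \(x_{i2}\) and \(x_{i1}\) that is shared by all units, which does not hold for a regressor that varies freely across units. For the time dummies, by contrast, the rewritten instruments are the same for all \(i\). For example, \(E [\Delta \epsilon_{i3}] = 0\) corresponds to the instrument
\[\begin{pmatrix}0\\-1\\1\end{pmatrix} = \begin{pmatrix}0\\0\\1\end{pmatrix} - \begin{pmatrix}0\\1\\0\end{pmatrix}\]
which is the third column of \(\mathbf{X}_i\) minus the second.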

Last edited by Sebastian Kripfganz; 26 Oct 2016, 11:59.

Jonathan Eberle
#10

27 Oct 2016, 04:01

Dear Mister Kripfganz,

again, thank you so much for taking the time and giving me concrete answers and examples. This really helps me to understand these issues.

To sum it up:

(1) Using the iv(yr*, eq(level)) option doesn't influence the coefficients of the regression (regressions with and without the eq(level) suboption lead to the same coefficients of the time dummies IF your panel is balanced), because one can write the "moment conditions for the time dummies in the first-differenced equation as linear combinations of the moment conditions for the time dummies (and the intercept) in the level equation." Hence, they are redundant. The problem is that omitting the eq(level) suboption will lead to the "wrong" number of instruments (degrees of freedom), so that the postestimation tests are biased. (Actually, when I tried one regression with and one without the eq(level) suboption, the number of instruments as well as the Hansen J-test were the same.)

(2) If one uses a metric exogenous regressor x, using the iv(x, eq(level)) option in a System GMM approach is sufficient as long as the regressor is uncorrelated with the unobservable fixed effect. But the eq(level) suboption is not "necessary", as it is not possible to rewrite the FD instruments as level instruments and vice versa (no redundancy):

"However, in general you cannot write the first moment condition from the first-differenced model in terms of the corresponding moment condition from the model in levels. Hence, it is not redundant as opposed to those implied by the time dummies. [...] You cannot write this instrument as a linear combination of the columns of the matrix \(\mathbf{X}_i\)."

Hence, omitting the eq(level) command may help to get more efficient coefficients of the endogenous variables (and it changes the coefficient of the exogenous variable due to the "new" moment conditions in FD).

Again, thank you for your illuminating answers!
Sebastian Kripfganz

#11

27 Oct 2016, 06:49

1)
Yes and no. Unfortunately, this depends on a further technicality, namely the chosen first-step weighting matrix, option h(#) for xtabond2. The default is h(3) which would be an optimal weighting matrix if there were no unobserved unit-specific effects; see help xtabond2. Unfortunately, this option can have a large effect on the coefficient estimates and this default specification is not the weighting matrix used e.g. by Blundell and Bond (1998), which is obtained with option h(1), or Blundell, Bond, and Windmeijer (2001), which is obtained with option h(2). Also notice that Stata's official command xtdpd does not have an option to choose the weighting matrix but always uses the one that corresponds to h(2) with xtabond2.

With the default, h(3), you are right. The results are identical and the (asymptotically) correct number of instruments is computed, no matter how you specify the time dummies, if the panel is strongly balanced. With the other two weighting matrices, this is no longer the case. The following two specifications yield different results. But remember, the equivalence of the moment conditions outlined earlier did not depend at all on the weighting matrix. Also, in the second specification the number of instruments is (asymptotically) too large. Therefore, the p-values from the postestimation commands are incorrect and, particularly worrisome, they become less conservative.

Code:

. webuse psidextract

. xtset
       panel variable:  id (strongly balanced)
        time variable:  t, 1 to 7
                delta:  1 unit

. xtabond2 L(0/1).lwage wks tdum3-tdum7, gmmstyle(L.lwage) ivstyle(wks, equation(level)) ivstyle(tdum3-tdum7, equation(level)) nodiffsargan h(2)
Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.

Dynamic panel-data estimation, one-step system GMM
------------------------------------------------------------------------------
Group variable: id                              Number of obs      =      3570
Time variable : t                               Number of groups   =       595
Number of instruments = 27                      Obs per group: min =         6
Wald chi2(7)  =   7362.65                                      avg =      6.00
Prob > chi2   =     0.000                                      max =         6
------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       lwage |
         L1. |   .4728387   .0212769    22.22   0.000     .4311368    .5145407
             |
         wks |   .0023417   .0005751     4.07   0.000     .0012145     .003469
       tdum3 |   .0904189   .0070883    12.76   0.000     .0765261    .1043117
       tdum4 |   .1268809   .0082844    15.32   0.000     .1106438    .1431181
       tdum5 |   .1705239    .009655    17.66   0.000     .1516004    .1894473
       tdum6 |   .2056137   .0111112    18.51   0.000     .1838362    .2273912
       tdum7 |   .2567683   .0124694    20.59   0.000     .2323286    .2812079
       _cons |   3.339804   .1340032    24.92   0.000     3.077163    3.602446
------------------------------------------------------------------------------
Instruments for first differences equation
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(1/6).L.lwage
Instruments for levels equation
  Standard
    tdum3 tdum4 tdum5 tdum6 tdum7
    wks
    _cons
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    D.L.lwage
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z = -23.64  Pr > z =  0.000
Arellano-Bond test for AR(2) in first differences: z =   2.76  Pr > z =  0.006
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(19)   = 260.33  Prob > chi2 =  0.000
  (Not robust, but not weakened by many instruments.)

. xtabond2 L(0/1).lwage wks tdum3-tdum7, gmmstyle(L.lwage) ivstyle(wks, equation(level)) ivstyle(tdum3-tdum7, equation(level)) ivstyle(tdum3-tdum7, equation(diff)) nodiffsargan h(2)
Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.

Dynamic panel-data estimation, one-step system GMM
------------------------------------------------------------------------------
Group variable: id                              Number of obs      =      3570
Time variable : t                               Number of groups   =       595
Number of instruments = 32                      Obs per group: min =         6
Wald chi2(7)  =   7430.95                                      avg =      6.00
Prob > chi2   =     0.000                                      max =         6
------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       lwage |
         L1. |   .4634588   .0210482    22.02   0.000     .4222051    .5047125
             |
         wks |   .0023926   .0005722     4.18   0.000      .001271    .0035141
       tdum3 |   .0897153   .0070519    12.72   0.000     .0758939    .1035368
       tdum4 |   .1277802   .0082311    15.52   0.000     .1116474    .1439129
       tdum5 |   .1726556   .0095836    18.02   0.000     .1538722    .1914391
       tdum6 |   .2089168   .0110264    18.95   0.000     .1873054    .2305282
       tdum7 |   .2603197   .0123783    21.03   0.000     .2360586    .2845808
       _cons |   3.398084   .1325899    25.63   0.000     3.138212    3.657955
------------------------------------------------------------------------------
Instruments for first differences equation
  Standard
    D.(tdum3 tdum4 tdum5 tdum6 tdum7)
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(1/6).L.lwage
Instruments for levels equation
  Standard
    tdum3 tdum4 tdum5 tdum6 tdum7
    wks
    _cons
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    D.L.lwage
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z = -23.61  Pr > z =  0.000
Arellano-Bond test for AR(2) in first differences: z =   2.71  Pr > z =  0.007
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(24)   = 298.48  Prob > chi2 =  0.000
  (Not robust, but not weakened by many instruments.)

Which of these weighting matrices should be preferred is a different story. Yet, I am worried again that the default specification may not be what the average user has in mind, and this makes it dangerous.
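
To see how much of this depends on the weighting matrix, the two specifications can be run for each h(#) in turn. The following is only a sketch: it assumes that xtabond2 is installed and that it leaves the number of instruments in e(j) and the Sargan degrees of freedom in e(sar_df), which is worth confirming with ereturn list after estimation.

```stata
* Sketch: compare both time-dummy specifications across first-step weighting matrices.
* Assumptions: xtabond2 is installed (ssc install xtabond2); e(j) and e(sar_df)
* hold the instrument count and the Sargan degrees of freedom.
webuse psidextract, clear

local common "gmmstyle(L.lwage) ivstyle(wks, equation(level)) ivstyle(tdum3-tdum7, equation(level)) nodiffsargan"

forvalues h = 1/3 {
    * time dummies as instruments for the level equation only
    quietly xtabond2 L(0/1).lwage wks tdum3-tdum7, `common' h(`h')
    display "h(`h'), level only:     b[L.lwage] = " %9.7f _b[L.lwage] ///
        "  instruments = " e(j) "  Sargan df = " e(sar_df)

    * time dummies additionally as instruments for the first-differenced equation
    quietly xtabond2 L(0/1).lwage wks tdum3-tdum7, `common' ///
        ivstyle(tdum3-tdum7, equation(diff)) h(`h')
    display "h(`h'), level and diff: b[L.lwage] = " %9.7f _b[L.lwage] ///
        "  instruments = " e(j) "  Sargan df = " e(sar_df)
}
```

Run as a do-file (the /// line continuations are not accepted in the Command window). Any row in which the coefficient or the reported degrees of freedom change between the two lines for the same h(#) is a case where adding the redundant dummies matters.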

2)
Again, I would recommend never using the ivstyle() option without the equation() suboption. If you want to use these instruments both for the first-differenced and for the level equation, specify two sets of instruments, e.g. for the variable wks in the above example: ivstyle(wks, equation(level)) and ivstyle(wks, equation(diff)). This is, in general, not the same as specifying ivstyle(wks) without the suboption.

References:

Blundell, R. and S. R. Bond (1998). Initial conditions and moment restrictions in dynamic panel data models. Journal of Econometrics 87 (1), 115-143.
Blundell, R., S. R. Bond, and F. Windmeijer (2001). Estimation in dynamic panel data models: Improving on the performance of the standard GMM estimator. Advances in Econometrics 15 (1), 53-91.

Last edited by Sebastian Kripfganz; 27 Oct 2016, 06:52.


Hanna Lindstrom

Join Date: Apr 2017

Posts: 25
#12

18 Jun 2017, 02:46

Dear Mr Kripfganz,
I have a question regarding System GMM estimation. I hope it is OK that I post my question in this thread; please let me know if you would prefer that I start a new thread.

I am using a panel data set with T=11 and N=288. Besides regular year dummies, I also include a time dummy, D06, that controls for a shift in policy. That is, D06=1 after the year 2006 and D06=0 up to and including 2006. So this is a strictly exogenous variable that contains zeros for all municipalities in the years 2003-2006 and ones for all municipalities after 2006. However, I am not sure whether it should enter as iv(D06, eq(diff)), as iv(D06, eq(level)), or as both. What would be your take on this?

Best regards,
Hanna Lindström
Sebastian Kripfganz
#13

18 Jun 2017, 06:09

Dear Hanna,
Your question perfectly fits into this topic.

Conceptually, such a shift dummy is treated in the same way as the other time dummies. You would usually include it as iv(D06, eq(level)).

The interpretation becomes a bit tricky with this shift dummy and a set of regular year dummies in your regression. The shift dummy is technically just the sum of all year dummies from 2007 onwards. Similar to the situation of year dummies and the regression constant, where one of the year dummies needs to be dropped and the regression constant is interpreted as the corresponding base effect, the inclusion of the shift dummy forces you to drop another year dummy (or Stata will do it automatically). Suppose you drop the year dummy for 2007. All the subsequent year effects will then be interpreted as effects relative to the base year 2007 AND to the base year that got excluded because of the regression constant (say, 2003). The shift dummy itself will also capture only the relative effect of the period starting 2006 to the base year 2003.
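
The collinearity can be checked on a toy data set with one row per year. This is a sketch with made-up data: the years 2003-2013 are an assumption based on T=11 and a first year of 2003, and the variable names year, yr1-yr11, and post_sum are hypothetical.

```stata
* Sketch with made-up data: one row per year, 2003-2013.
clear
set obs 11
generate int year = 2002 + _n
generate byte D06 = year > 2006          // shift dummy: 1 from 2007 onwards
tabulate year, generate(yr)              // yr1 = 2003, ..., yr5 = 2007, ..., yr11 = 2013
egen byte post_sum = rowtotal(yr5-yr11)  // sum of the year dummies for 2007-2013
assert D06 == post_sum                   // the shift dummy equals that sum exactly
```

Because the assertion holds in every row, a regression that contains the constant, D06, and a full set of year dummies (minus one base year) is still perfectly collinear until one further year dummy is dropped.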

Hanna Lindstrom
#14

19 Jun 2017, 07:06

Dear Mr Kripfganz,
Thank you for your reply. I see your point that the dummy D06 is the sum of all year dummies from 2007 onwards. However, I am not quite sure what you mean by your last remark: "The shift dummy itself will also capture only the relative effect of the period starting 2006 to the base year 2003."

Would it be possible to elaborate on this piece of advice?

Best,
Hanna
Sebastian Kripfganz
#15

20 Jun 2017, 04:03

Suppose you have a constant in your model and T-1 time dummies. The constant measures the base effect of the omitted time category, say 2003. A simple time dummy for the year 2007 would then capture the relative effect of 2007 to the omitted base category, 2003. Similarly for time dummies for the years 2008 and so on. The same is true for the shift dummy. From a different perspective, in the year 2007 both the shift dummy and the constant term take on the value 1. The overall effect for the year 2007 is thus the sum of the shift dummy coefficient and the regression constant (call this sum A). Since the constant itself measures the effect for the base year (call it B), the shift dummy itself captures the difference A-B.
