XTDPDGMM: new Stata command for efficient GMM estimation of linear (dynamic) panel models with nonlinear moment conditions

Sarah Magd replied

05 Mar 2023, 05:38
Dear Prof. @Kripfganz,

I specify my model in a static way (i.e., without including the lagged dependent variable as a regressor).

1. Can we still use the sys-GMM to estimate this static regression?
2. How should I justify the use of the sys-GMM to estimate this static regression? (i.e., is it more efficient or robust than the 2SLS regression?)
3. Do I still need to report the Arellano-Bond statistics?
Zainab Mariam replied

16 Feb 2023, 16:09
Dear Professor Sebastian,

I am so thankful for what you did. You are so helpful. I do appreciate the way you are teaching and supporting me. Your assistance means a lot to me, Professor!

1) Regarding post #514 “Here, it seems that the only instruments specified for the level model are the time dummies. In this case, the difference-in-Hansen test for them is not meaningful. We cannot leave out the instruments for the time dummies. Because the time dummies are the only instruments for the level model, the Blundell-Bond assumption does not apply here.”.

Thus, I have the following questions, please!

1.1) Does that mean the difference-in-Hansen test on pages 96, 109, 113, and 123 is not meaningful? Does that mean the Blundell-Bond assumption does not apply there in those tables on pages 96, 109, 113, and 123?

1.2) At least how many variables do I have to instrument for the level model in order for the difference-in-Hansen test to be meaningful and for the Blundell-Bond assumption to be applied?

2) Regarding the meaning of the following iv() options, given I have not specified ‘model(diff)’ as a separate option in your xtdpdgmm command line, is the following meaning of iv() options correct?

2.1) iv(x, lag( ) diff model(level)): produces differenced instruments for the level model?

2.2) iv(x, lag( ) diff): produces differenced instruments for the level model?

2.3) iv(x, lag( ) model(diff) diff): produces differenced instruments for the differenced model?

2.4) iv(x, lag( ) model(level)): produces level instruments for the level model?

2.5) iv(x, lag( )): produces level instruments for the level model?

2.6) iv(x, lag( ) model(diff)): produces level instruments for the differenced model?

3) Regarding the meaning of the following iv() options, given I have specified ‘model(FOD)’ as a separate option in the xtdpdgmm command line, is the following meaning of iv() options correct?

3.1) iv(x, lag( ) diff model(level)): produces differenced instruments for the level model?

3.2) iv(x, lag( ) diff): produces differenced instruments for the FOD model?

3.3) iv(x, lag( ) model(diff) diff): produces differenced instruments for the differenced model?

3.4) iv(x, lag( ) model(level)): produces level instruments for the level model?

3.5) iv(x, lag( )): produces level instruments for the FOD model?

3.6) iv(x, lag( ) model(diff)): produces level instruments for the differenced model?

4) Regarding post #506 point 6) “With the difference GMM estimator, the difference-in-Hansen test can still be useful to evaluate the validity of specific instrument sets.…”.

Thus, which specific instrument sets can the difference-in-Hansen test evaluate for validity when it is used with the Difference GMM estimator?

5) To check my understanding, please, correct me if I am wrong!

5.1) Does the coefficient of L.y (the lagged dependent variable) obtained from the Difference GMM estimator indicate whether the dependent variable (y) is persistent, and hence whether the Difference GMM estimator behaves poorly? That is, does it show whether y is close to a random walk? If the coefficient of L.y is close to 1, does that indicate that y is persistent and that the Difference GMM estimator performs poorly because of poor instruments?

5.2) Does the difference-in-Hansen test with the Difference GMM check the classification of the variables?

5.3) Difference-in-Hansen test with the Difference GMM cannot check for the additional Blundell-Bond assumption (sufficient: mean stationarity)?

5.4) Is it the case that the difference-in-Hansen test with the System GMM cannot check the classification of the variables?

5.5) Difference-in-Hansen test with the System GMM checks for the additional Blundell-Bond assumption (sufficient: mean stationarity)?

Many thanks for doing what you do! Your patience, help and effort are greatly appreciated, Professor!
Sebastian Kripfganz replied

16 Feb 2023, 12:07
1.1) The approach can also be used with the system GMM estimator if you are confident that the additional assumption for validity of the instruments in the level model is satisfied.

1.2) With the difference GMM estimator alone, you cannot test the additional assumption for the system GMM estimator.

2.1) You do not lose much with the nonlinear estimator compared to the difference GMM estimator. So, yes, it is often preferable to use the nonlinear estimator.

2.2-2.8) All of these specifications are valid. Some of them are unusual/unconventional, e.g. 2.4).

2.9) Here, the option gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(1 3)) would only be valid if all of these variables were strictly exogenous.

2.10) Somewhere earlier in this thread I gave examples for different estimators, including the Ahn-Schmidt estimator.

2.11) model(mdev) should only be specified for strictly exogenous variables. Otherwise, both model(fod) and model(diff) are fine. (Just remember that in general the admissible lags differ for the two models; e.g. for an endogenous variable, the first admissible lag is 1 with model(fod) but 2 with model(diff).)

3) If you do not have multiple instrument sets for model(level), then the difference-in-Hansen test does not perform a separate test for it.

4) This might be because the industry dummies are time-invariant. Such variables can only be specified for model(level).
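
As a rough illustration of the lag remark in 2.11), the sketch below estimates the same model once with model(diff) and once with model(fod). It is only a sketch under assumptions that are not part of this thread: it uses the Arellano-Bond example data (webuse abdata, with variables n, w, k), assumes xtdpdgmm is already installed, and treats w as endogenous and k as predetermined purely for illustration.

```stata
* Illustrative data set (an assumption, not the data discussed in this thread)
webuse abdata, clear

* First-differenced model: the first admissible lag is 2 for the
* endogenous variables (n, w) and 1 for the predetermined variable (k)
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) gmm(w, lag(2 4)) gmm(k, lag(1 3)) two vce(r)

* FOD-transformed model: the first admissible lag shifts down by one,
* i.e. 1 for the endogenous variables and 0 for the predetermined variable
xtdpdgmm L(0/1).n w k, model(fod) collapse gmm(n, lag(1 3)) gmm(w, lag(1 3)) gmm(k, lag(0 2)) two vce(r)
```

The list of instruments printed below each regression output shows which lags were actually used.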
Zainab Mariam replied

15 Feb 2023, 10:47
Dear Professor Sebastian,

Even though I may not say it all the time, I do appreciate all that you do, Professor! I do not know what to say. Much obliged!

1) Regarding post #506 point 6) “With the difference GMM estimator, the difference-in-Hansen test can still be useful to evaluate the validity of specific instrument sets. This could for example help to decide whether variables should be classified as endogenous, predetermined, or exogenous; see the model selection section of my presentation.”.

Thus, I have the following questions, please!

1.1) Does it mean the difference-in-Hansen test cannot help to decide whether variables should be classified as endogenous, predetermined, or exogenous if we apply the difference-in-Hansen test with the System GMM estimator?

1.2) Does it mean the difference-in-Hansen test cannot help to check if the variables satisfy the additional Blundell-Bond assumption (sufficient: mean stationarity) if we apply the difference-in-Hansen test with the Difference GMM estimator? Does it mean the difference-in-Hansen test cannot help to check if I can instrument the variables in the level model when we apply the difference-in-Hansen test with the Difference GMM estimator? Does it mean the difference-in-Hansen test cannot help to check if I can apply the System GMM estimator when we apply the difference-in-Hansen test with the Difference GMM estimator?

2) Regarding post #508 point 1) “In principle, the MMSC can be used for selecting between the difference and system GMM estimator, yes. If different criteria give you different answers, I am afraid then the decision is still up to you. You will then need to weigh the benefits and shortcomings of the two estimators. As mentioned earlier, a good compromise might be the difference GMM estimator plus nonlinear moment conditions (Ahn-Schmidt).”. And regarding post #504 point 3) “… Alternatively, you could use the nonlinear Ahn and Schmidt (1995, Journal of Econometrics) estimator, which also mitigates the weak-instruments problem but does not require the additional system GMM assumptions.”.

Thus, I have the following questions, please!

2.1) Does it mean it is better to apply the nonlinear Ahn and Schmidt estimator? If so, are the following codes correct?

2.2) In this code, I specified ‘model(fod)’ as a separate option in the xtdpdgmm command line, I put ‘model(md)’ in the iv() option for the dummies.

xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind, model(md)) iv(mn, model(md)) iv(cf, model(md)) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

2.3) In this code, I put ‘model(level)’ in the iv() option for the dummies.

xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind, model(level)) iv(mn, model(level)) iv(cf, model(level)) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

2.4) In this code, I put ‘diff’ in the iv() option for the dummies.

xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind, diff) iv(mn, diff) iv(cf, diff) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

2.5) In this code, I put ‘diff model(diff)’ in the iv() option for the dummies.

xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind, diff model(diff)) iv(mn, diff model(diff)) iv(cf, diff model(diff)) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

2.6) In this code, I put ‘model(diff)’ in the iv() option for the dummies.

xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind, model(diff)) iv(mn, model(diff)) iv(cf, model(diff)) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

2.7) In this code, I put ‘diff model(level)’ in the iv() option for the dummies.

xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind, diff model(level)) iv(mn, diff model(level)) iv(cf, diff model(level)) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

2.8) In this code, I did not put anything in the iv() option for the dummies.

xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind) iv(mn) iv(cf) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

2.9) In this code, I specified ‘model(mdev)’ as a separate option in the xtdpdgmm command line, I put ‘model(diff)’ in the gmm() for the endogenous variables (y, L.x1), I put ‘norescale’ in the iv() for the exogenous variable (x10), I put ‘model(md)’ in the iv() option for the dummies, and I did not put any option in gmm() for the predetermined variables (L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9).

xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(mdev) collapse gmm(y, lag(2 4) model(diff)) gmm(L.x1, lag(2 4) model(diff)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(1 3)) iv(x10, norescale) gmm(x10, model(md) lag(0 0)) iv(i.ind, model(md)) iv(mn, model(md)) iv(cf, model(md)) gmm(cf*L.x1, lag(2 4) model(diff)) nl(noserial) teffects two small vce(robust, dc) overid

2.10) If none of the previous codes is correct, what is the correct code I have to use in order to implement the nonlinear Ahn and Schmidt estimator using your xtdpdgmm command? Where: y is the dependent variable; L.y is the lagged dependent variable as a regressor (L.y is predetermined); L.x1 is the independent variable (L.x1 is endogenous); The control variables L.x2, L.x3, L.x4, L.x5, L.x6, L.x7, L.x8, L.x9 are predetermined; The control variable x10 (firm age) is exogenous; ind is industry dummies; mn is country dummies; cf is a dummy variable that takes the value of 1 for the 3 years 2008, 2009, and 2010; cf*L.x1 is an interaction between the dummy variable cf and the independent variable L.x1.

2.11) To apply the nonlinear Ahn and Schmidt estimator, is it better to specify ‘model(fod)’ or ‘model(diff)’ or ‘model(mdev)’ as a separate option in the xtdpdgmm command line?

3) What if the Difference-in-Hansen test’s results do not contain a “model(level)” row as the last line of the Difference-in-Hansen test table? What does that indicate?

4) Is it normal for all the industry dummies to be omitted if I put ‘md’ in the iv() option for the industry dummies along with not typing ‘teffects’ in the xtdpdgmm command line?

Also, is it normal for more than one industry dummy to be omitted if I put ‘md’ in the iv() option for the industry dummies even with typing ‘teffects’ in the xtdpdgmm command line? What are the iv() options that lead the dummies to be omitted?

Sorry to keep asking you my questions, but I would not have understood this without your assistance. Please accept my deepest gratitude. Your patience, help and effort are greatly appreciated, Professor! Thank you very much for all you do.
Sebastian Kripfganz replied

15 Feb 2023, 08:53
1.1) Please see my earlier post #506 for some general information. Here, it seems that the only instruments specified for the level model are the time dummies. In this case, the difference-in-Hansen test for them is not meaningful. We cannot leave out the instruments for the time dummies.

1.2) Because the time dummies are the only instruments for the level model, the Blundell-Bond assumption does not apply here.

1.3) Again, this row in this case has no meaningful interpretation.

Zainab Mariam replied

10 Feb 2023, 09:37

Dear Professor Sebastian,

Many many thanks for your swift and useful reply. I want to express my deep gratitude for the dedicated work you do day after day. Your input is so valuable, Professor!

1) Using your xtdpdgmm command, suppose the following code is typed to apply the Difference GMM estimator:

xtdpdgmm L(0/2).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10, model(fod) collapse gmm(y, lag(1 3)) gmm(L.x1, lag(1 3)) gmm(L.x2, lag(0 2)) gmm(L.x3, lag(0 2)) gmm(L.x4, lag(0 2)) gmm(L.x5, lag(0 2)) gmm(L.x6, lag(0 2)) gmm(L.x7, lag(0 2)) gmm(L.x8, lag(0 2)) gmm(L.x9, lag(0 2)) gmm(x10, lag(0 2)) gmm(x10, lag(0 0) model(md)) teffects two small vce(r) nocons overid

Then, the serial correlation and overidentification tests are applied. After that, the difference-in-Hansen test is performed. Thus, I have the following questions, please!

1.1) What is the meaning of the following findings of the row/line labelled “model(level)” in the difference-in-Hansen test?

Table 1

                  |        Excluding        |       Difference
Moment conditions |    chi2    df       p   |    chi2    df       p
------------------+-------------------------+-----------------------
12, model(level)  | 12.0861     7  0.0978   |  4.1102     2  0.0881
model(diff)       |  0.0000     0       .   | 16.1962     9  0.0629
model(level)      |  8.0920     6  0.0428   |  8.1042     3  0.0315

Table 2

                  |        Excluding        |       Difference
Moment conditions |    chi2    df       p   |    chi2    df       p
------------------+-------------------------+-----------------------
12, model(level)  | 12.0861     7  0.0397   |  4.1102     2  0.1364
model(diff)       |  0.0000     0       .   | 16.1962     9  0.0629
model(level)      |  8.0920     6  0.0327   |  8.1042     3  0.1011

Table 3

                  |        Excluding        |       Difference
Moment conditions |    chi2    df       p   |    chi2    df       p
------------------+-------------------------+-----------------------
12, model(level)  | 12.0861     7  0.1231   |  4.1102     2  0.0272
model(diff)       |  0.0000     0       .   | 16.1962     9  0.0629
model(level)      |  8.0920     6  0.1114   |  8.1042     3  0.0439

Table 4

                  |        Excluding        |       Difference
Moment conditions |    chi2    df       p   |    chi2    df       p
------------------+-------------------------+-----------------------
12, model(level)  | 12.0861     7  0.0779   |  4.1102     2  0.1281
model(diff)       |  0.0000     0       .   | 16.1962     9  0.0629
model(level)      |  8.0920     6  0.1015   |  8.1042     3  0.1026

1.2) Do the above findings of the row/line labelled “model(level)” in the difference-in-Hansen test indicate that the variables satisfy or violate the additional Blundell-Bond assumption (sufficient: mean stationarity)?

1.3) Do the above findings of the row/line labelled “model(level)” in the difference-in-Hansen test indicate that I can instrument the variables in the level model? Do the above findings of the row/line labelled “model(level)” in the difference-in-Hansen test indicate that I can apply the System GMM estimator?

1.4) What do the above findings of the difference-in-Hansen test indicate regarding the variables classification whether exogenous, predetermined, or endogenous?

Thank you in advance for your time and hard work. I am so thankful for everything you bring to my understanding, Professor!


Sebastian Kripfganz replied

10 Feb 2023, 06:15
1) With global option model(diff), 1.2 and 1.3 are equivalent, 1.5 and 1.6 are equivalent. 1.7 does not exist.

2) Yes for 2.1-2.5.
2.6) If model(diff) was already specified, then specifying it again within iv() is redundant. It does not change anything.
2.7) This option does not exist.

3) Without separate option model(diff), the default is model(level). Then, 1.1 and 1.2 would be equivalent, 1.4 and 1.5 would be equivalent.

4) With separate option model(fod), none of the specifications are equivalent.
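
To make the cases above concrete, here is a minimal sketch of three of the iv() variants under a global model(diff) option. The data set (webuse abdata) and the treatment of k as a strictly exogenous variable are assumptions made for illustration only; they are not taken from this thread. The comments state which instruments each variant is expected to produce, following the answers above.

```stata
* Illustrative data set (an assumption, not the data discussed in this thread)
webuse abdata, clear

* Cases 1.5/1.6: iv(k) inherits model(diff) and uses k in levels,
* i.e. level instruments for the first-differenced model
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) gmm(w, lag(1 3)) iv(k) two vce(r)

* Cases 1.2/1.3: suboption diff transforms the instrument,
* i.e. differenced instruments for the first-differenced model
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) gmm(w, lag(1 3)) iv(k, diff) two vce(r)

* Case 1.1: model(level) inside iv() overrides the global model(diff),
* i.e. differenced instruments for the level model
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) gmm(w, lag(1 3)) iv(k, diff model(level)) two vce(r)
```

Comparing the instrument lists printed below the three outputs is a quick way to confirm which model each iv() set was assigned to.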
Zainab Mariam replied

10 Feb 2023, 05:26
Dear Professor Sebastian,

Thank you so much for your valuable response. What a great supervisor you are! I sincerely appreciate your assistance, Professor! I have the following questions, please!

1) If I have specified ‘model(diff)’ as a separate option in your xtdpdgmm command line, are the following iv() options equivalent or different?

1.1) iv(x, lag( ) diff model(level))

1.2) iv(x, lag( ) diff)

1.3) iv(x, lag( ) model(diff) diff)

1.4) iv(x, lag( ) model(level))

1.5) iv(x, lag( ))

1.6) iv(x, lag( ) model(diff))

1.7) iv(x, lag( ) level)

2) Regarding the meaning of the above iv() options, given I have specified ‘model(diff)’ as a separate option in your xtdpdgmm command line, is the following meaning of iv() options correct?

2.1) iv(x, lag( ) diff model(level)): produces differenced instruments for the level model?

2.2) iv(x, lag( ) diff): produces differenced instruments for the differenced model?

2.3) iv(x, lag( ) model(diff) diff): produces differenced instruments for the differenced model?

2.4) iv(x, lag( ) model(level)): produces level instruments for the level model?

2.5) iv(x, lag( )): produces level instruments for the differenced model?

2.6) iv(x, lag( ) model(diff)): I do not know the meaning of this iv() option, especially since 'model(diff)' has been already specified as a separate option in the command line. I think this iv() option means: produces level instruments for the differenced model? But, I do not think I can include 'model(diff)' twice in the command.

2.7) iv(x, lag( ) level): I do not know the meaning of this iv() option.

3) If I have not specified ‘model(diff)’ as a separate option in the xtdpdgmm command line, will the answers to the above questions (questions 1 and 2) be different? If so, how? Please!

4) If I have specified ‘model(FOD)’ as a separate option in the xtdpdgmm command line, will the answers to the questions above (questions 1 and 2) be different?

Thank you again for your support and effort, Professor! That made a real difference in my understanding.
Sebastian Kripfganz replied

10 Feb 2023, 03:12
1) You would just specify iv(dummy, model(fod)) without any further suboption for instrument transformation.

2) You would normally treat such an interaction the same way as any other variable. If one of the variables in the interaction term is endogenous, the interaction term itself should usually also be treated as endogenous.

3) You would use factor variable notation and specify i.cf in the list of independent variables plus iv(i.cf, model(level)) or iv(i.cf, model(diff) diff) or iv(i.cf, model(fod)), depending on whether you want to use only instruments for the transformed model.

4) This is just an illustration of what the gmm() option really does; it creates standard instruments interacted with time dummies. You would not normally do this yourself manually, but just use the gmm() option.

5) This is strictly speaking not a difference GMM estimator, because it also uses nonlinear moment conditions.
5.1) By default, option teffects always instruments time dummies in the level model, irrespective of what is specified in the model() option. This can be changed by adding option nolevel. Thus, the options model(diff) nolevel teffects would create time dummy instruments for the first-differenced model.
5.2) Option teffects always creates level instruments, irrespective of the model.

6.1) As the option nolevel is not specified, teffects creates instruments for the level model. You can easily see this yourself by looking at the list of instruments displayed below the regression output.
6.2) See 5.2).
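
The teffects behaviour described in 5.1) and 6.1) can be checked with a short sketch like the following. It assumes the Arellano-Bond example data (webuse abdata), which is an illustration choice and not the data discussed in this thread.

```stata
* Illustrative data set (an assumption, not the data discussed in this thread)
webuse abdata, clear

* Default: teffects adds time dummies and instruments them in the level model
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) gmm(w k, lag(1 3)) teffects two vce(r)

* With nolevel: the time dummy instruments are created for the
* first-differenced model instead
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) gmm(w k, lag(1 3)) teffects nolevel two vce(r)
```

In both cases, the list of instruments displayed below the regression output shows the model to which the time dummy instruments belong.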
Zainab Mariam replied

06 Feb 2023, 08:54
Dear Professor Sebastian,

Thank you very much for your beneficial reply. I do appreciate your cooperation, support and patience, professor! If I may follow up with your response, please!

1) Regarding post #508 point 5.1) “For the FOD-transformed equations, you do not need to transform the dummies.”. Sorry, I did not get what you mean by that.

2) My regression model includes the endogenous variable (L.x1) and it also includes the dummy variable (cf) {where this dummy variable cf takes the value of 1 for the 3 years 2008, 2009, 2010}. Also, my regression model includes an interaction between the endogenous variable (L.x1) with the dummy variable (cf). Where: L.x1 is the independent variable and it is endogenous and continuous.

Thus, how do I have to type/express this interaction between the endogenous variable (L.x1) with the dummy variable (cf) in the regression code using your command xtdpdgmm?

3) If I am not using the teffects option, then how do I have to include the time dummies explicitly in my regression model? How do I have to express/type the time dummies explicitly in my regression model code using your command xtdpdgmm? Suppose the research’s time period is 2000-2020.

4) Your code on slide 22 of your 2019 London Stata Conference presentation is: xtdpdgmm L(0/1).n w k, model(diff) iv(i.year#cL(2/4).n) iv(i.year#cL(1/3).w) iv(i.year#cL(0/2).k) nocons two vce(r)

Thus, what do you mean by ‘iv(i.year#cL(2/4).n) iv(i.year#cL(1/3).w) iv(i.year#cL(0/2).k)’? And why include that in the regression code?

5) The Difference GMM estimator is applied in your code on slide 80 of your 2019 London Stata Conference presentation where your code is: xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) gmm(w k, lag(1 3)) nl(noserial) teffects igmm vce(r)

Thus, I have the following questions, please!

5.1) When you use ‘teffects’, are you instrumenting the year dummies in the differenced model or in the level model?

5.2) Also, are you using the differenced instruments or the level instruments for the year dummies?

6) The FOD estimator is applied in your code on slide 95 of your 2019 London Stata Conference presentation where your code is: xtdpdgmm L(0/3).n L(0/3).(w k ys), model(fod) collapse gmm(n, lag(1 .)) gmm(w, lag(1 .)) gmm(k, lag(1 .)) gmm(ys, lag(1 .)) teffects two vce(r) overid

Thus, I have the following questions, please!

6.1) When you use ‘teffects’, are you instrumenting the year dummies in the differenced model or in the level model?

6.2) Also, are you using the differenced instruments or the level instruments for the year dummies?

Your patience, help and effort are highly appreciated, Professor!
Sebastian Kripfganz replied

06 Feb 2023, 03:03
1) In principle, the MMSC can be used for selecting between the difference and system GMM estimator, yes. If different criteria give you different answers, I am afraid then the decision is still up to you. You will then need to weigh the benefits and shortcomings of the two estimators. As mentioned earlier, a good compromise might be the difference GMM estimator plus nonlinear moment conditions (Ahn-Schmidt).

2) gmm(y, lag(2 .)) is equivalent to gmm(L.y, lag(1 .)). As long as you choose the correct lag orders, it does not matter.

3) Yes, this is a binary dummy variable.

4.1) You need to instrument them either in the differenced or the level model. For the differenced model, you would normally also specify those dummies in differenced form and for the level model in level form, in order to maximize the correlation of the instruments with the regressors.

4.2) Yes, whether this is still called a difference GMM estimator is a different question. It is neither the traditional Arellano-Bond difference GMM estimator due to the dummies in the level model, nor the traditional Blundell-Bond system GMM estimator due to the lack of instruments for the other regressors in the level model. I would just call it a GMM estimator and then explain how it is constructed. In my opinion, the terms "difference GMM" and "system GMM" are overused and often lead to confusion. It is often an excuse for not explicitly specifying how an estimator is exactly constructed.

5.1) For the FOD-transformed equations, you do not need to transform the dummies.

5.2) Yes, same as in 4.2).

6) It does not matter; see 2).

7) No, you would need to run separate tests for the system GMM estimator. (The serial correlation test would normally be expected to still pass [although it is not guaranteed to do so], but the overidentification test may reject.) The other way round, your thinking is generally correct [although in finite samples, these tests sometimes do funny things].
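
The equivalence stated in 2) can be verified with a small sketch such as the one below. The data set (webuse abdata) and the stored-estimate names inst_y and inst_Ly are assumptions chosen for illustration; they do not appear in this thread.

```stata
* Illustrative data set (an assumption, not the data discussed in this thread)
webuse abdata, clear

* Instruments built from the dependent variable itself, starting at lag 2
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 .)) gmm(w k, lag(1 3)) nocons two vce(r)
estimates store inst_y

* Instruments built from the lagged dependent variable, starting at lag 1
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(L.n, lag(1 .)) gmm(w k, lag(1 3)) nocons two vce(r)
estimates store inst_Ly

* Side-by-side coefficients and standard errors of the two specifications
estimates table inst_y inst_Ly, b se
```

If the lag orders are chosen as in 2), the two columns of the comparison table should coincide.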
Zainab Mariam replied

28 Jan 2023, 15:43
Dear Professor Sebastian,

I would like to express my gratitude to you for your valuable response and time. Your cooperation and support are priceless, Professor!

1) Can I use the model and moment selection criteria (MMSC) for the Difference GMM estimator and the System GMM estimator to decide which one of them is better than the other one? If so, what if none of these two estimators (the Difference GMM and the System GMM estimators) has lower values of all criteria {i.e., the Difference GMM estimator has lower values for both the Akaike (AIC) and the Bayesian (BIC), while the System GMM estimator has a lower value for the Hannan-Quinn (HQIC)}?

2) Regarding the first GMM brackets for a dynamic panel data model, to implement the Difference GMM estimator using your xtdpdgmm command, do I have to instrument the dependent variable y itself?
xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10, model(diff) collapse gmm(y, lag( ))

Or

Do I have to instrument the regressor L.y (the lagged dependent variable)?
xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10, model(diff) collapse gmm(L.y, lag( ))

3) My regression model includes the dummy variable cf that takes the value of 1 for the 3 years 2008, 2009, 2010. Is this dummy variable cf considered a binary dummy variable which takes only the values 1 or 0? It takes the value of 1 for the 3 years 2008, 2009, and 2010, while it takes the value of 0 for the years before 2008 and after 2010.

4) To implement the Difference GMM estimator using your command ‘xtdpdgmm’, I have the following questions, please!

4.1) Is it necessary/required to instrument the dummies {cf, year, industry, and country} in the differenced model? If so, do I have to use the differenced instruments or the level instruments for these dummies in the differenced model?

4.2) Can I instrument the dummies (cf, year, industry, and country dummies) in the level model even though the Difference GMM estimator is applied? If so, do I have to use the differenced instruments or the level instruments for these dummies in the level model?

5) To implement the FOD estimator using your command ‘xtdpdgmm’, I have the following questions, please!

5.1) Is it necessary/required to instrument the dummies (cf, year, industry, and country dummies) in the differenced model? If so, do I have to use the differenced instruments or the level instruments for these dummies in the differenced model?

5.2) Can I instrument the dummies (cf, year, industry, and country) in the level model even though the FOD estimator is applied? If so, do I have to use the differenced instruments or the level instruments for these dummies in the level model?

6) For unbalanced panel data, is it better to instrument the dependent variable y itself or to instrument the lagged dependent variable (i.e., the regressor L.y)?

7) If the serial correlation and overidentification tests corresponding to the Difference GMM estimator passed, does it indicate that the System GMM estimator can be applied and its corresponding tests of serial correlation and overidentification will pass?

Is the other way round correct i.e., if the serial correlation and overidentification tests corresponding to the System GMM estimator passed, does it mean that the Difference GMM estimator can be applied and its corresponding tests of serial correlation and overidentification will pass?

I am very grateful to you for all your patience, help and effort, Professor!
Sebastian Kripfganz replied

28 Jan 2023, 06:08
1.1) The test in the "Excluding" column is a Hansen test for a model without the respective instruments. Here, for the last row labeled "model(level)", this would be a model without any of the instruments specified for the level model. In essence, this becomes a Hansen test for the difference GMM estimator. Passing this test is a prerequisite for conducting the difference-in-Hansen test for the additional level instruments. Thus, you can then move on to the "Difference" column.

1.2) If the "Excluding" test does not pass, then the "Difference" test becomes meaningless because it compares the results with the additional level instruments to a misspecified benchmark model. In this case, you would need to think about changing the regression model or the instruments for the differenced model before you can evaluate the instruments for the level model.

2.1) If the "Difference" test passes, assuming the "Excluding" test was passed as well, then there is no evidence of a violation of the additional system GMM assumption; in other words, there is no evidence that the additional instruments for the level model are invalid. Thus, you can go ahead and interpret the system GMM regression results.

2.2) If the "Difference" test is rejected, again assuming the "Excluding" test was passed before, then there is evidence that the additional instruments for the level model are invalid. You might have to remove some or all of them to obtain a consistent estimator.

3) Eventually, the "Difference" test is the relevant test for the Blundell-Bond assumption.

4) With the difference GMM estimator, the difference-in-Hansen test can still be useful to evaluate the validity of specific instrument sets. This could for example help to decide whether variables should be classified as endogenous, predetermined, or exogenous; see the model selection section of my presentation.

5) The reliability of the difference GMM estimator depends on the true autoregressive coefficient, but this true coefficient is unknown.

6) First of all, ask yourself if there are any theoretical arguments in favor of or against the system GMM assumption. If there are no theoretical arguments against it, you can then use the difference-in-Hansen test.
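
To make points 1) to 4) concrete, here is a minimal sketch of the workflow. It assumes the Arellano-Bond example data set (webuse abdata) and an illustrative choice of instruments and lag ranges that is not taken from this thread. The classification of w and k as predetermined is only an assumption for the example. Please check the xtdpdgmm help file for the exact option behavior in your installed version.

```stata
* Illustrative sketch only: data set, variables, and lag ranges are assumptions.
* ssc install xtdpdgmm        // if the command is not yet installed
webuse abdata, clear

* System GMM: instruments for the differenced model plus differenced
* instruments for the level model. The overid option asks xtdpdgmm to also
* compute the tests for the reduced models that are used by
* "estat overid, difference".
xtdpdgmm L(0/1).n w k, model(diff) collapse ///
    gmm(n, lag(2 4)) gmm(w k, lag(1 3)) ///
    gmm(n, lag(1 1) diff model(level)) ///
    gmm(w k, lag(0 0) diff model(level)) ///
    teffects two vce(r) overid

* Hansen test for the full model, then the difference-in-Hansen table.
* In the row labeled "model(level)":
*   "Excluding"  = Hansen test without the level instruments (check first)
*   "Difference" = test of the additional level instruments (check second)
estat overid
estat overid, difference

* Difference GMM only: a separate gmm() option per variable, so that each
* instrument set gets its own row in the difference-in-Hansen table.
xtdpdgmm L(0/1).n w k, model(diff) collapse ///
    gmm(n, lag(2 4)) gmm(w, lag(1 3)) gmm(k, lag(1 3)) ///
    teffects two vce(r) overid
estat overid, difference
```

In the second call, changing the lag range of one variable (for example, lag(0 2) for an exogenous variable instead of lag(1 3) for a predetermined one) and re-running the test is one way to compare alternative classifications.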
Leave a comment:
Zainab Mariam replied

25 Jan 2023, 08:48
Dear Professor Sebastian,

I extend infinite thanks and gratitude for your valuable reply and time, professor! Please, if I may follow up with your response!

1) If we check the last row labelled “model(level)” in the “Excluding” column of the outcomes table of the difference-in-Hansen test, I have the following questions, please!

1.1) What does it mean if this test passes with a sufficiently high p-value? And what to do if this test passes?

1.2) What does it mean if this test does not pass because the p-value is small? And what to do if this test does not pass?

2) When we move on to the “Difference” column of the last row labelled “model(level)” in the outcomes table of the difference-in-Hansen test, I have the following questions, please!

2.1) What does it mean if the p-value is high? And what to do if this p-value is high?

2.2) What does it mean if the p-value is small? And what to do if this p-value is small?

3) Is the column headed “Excluding” the one which is responsible to show if the variables satisfy/violate the additional Blundell-Bond assumption (sufficient: mean stationarity)? Or is the column headed “Difference” the one which is responsible to show if the variables satisfy/violate the additional Blundell-Bond assumption (sufficient: mean stationarity)?

4) If the Difference GMM estimator is applied, do I still need to perform the Difference-in-Hansen test? If so, why? i.e., what is the implication of (the rationale behind) performing the Difference-in-Hansen test when the Difference GMM estimator is applied?

5) Regarding your post #504 point 3) “… A low estimate of the autoregressive coefficient based on the difference GMM estimator does not by itself provide confidence that the true coefficient is indeed low as well, precisely because a low estimate might be a consequence of strong bias when the true coefficient is large …”.

Thus, my question is: do you mean that the true coefficient is the lagged dependent variable’s coefficient which is obtained by running the independent variables included in the regression model using your command xtdpdgmm to apply the difference GMM estimator?

6) Regarding your post #504 point 3) “… If you are confident that the system GMM assumptions are satisfied…”.

Thus, my question is: How to check that the system GMM assumptions are satisfied?

I am very grateful to you for all your help and effort, and I do appreciate your cooperation, support and patience, professor!
Leave a comment:
Sebastian Kripfganz replied

25 Jan 2023, 05:08
1) I understand your confusion. This aspect can become quite technical when we look at the details. First of all, mean stationarity is a sufficient condition for the validity of the system GMM estimator, but not a necessary one. In the simplest possible AR(1) model with no independent variables, an autoregressive coefficient equal to 1 implies nonstationarity of the dependent variable. Neither the difference nor the system GMM estimator works in this extreme case. However, for values of the autoregressive coefficient close to 1 (but below 1, and therefore in the stationary region), the difference GMM estimator still breaks down due to weak instruments, while the system GMM estimator will perform much better; see Blundell and Bond (1998, Journal of Econometrics). With additional independent variables, things become a bit more complicated because the stationarity properties of the dependent variable also depend on the stationarity of the independent variables (and not just the value of the autoregressive parameter).

2) You could use the underid command to test for underidentification, which is closely related to weak instruments; see slides 43 and following of my presentation.

3) This is a tricky part. A low estimate of the autoregressive coefficient based on the difference GMM estimator does not by itself provide confidence that the true coefficient is indeed low as well, precisely because a low estimate might be a consequence of strong bias when the true coefficient is large. If you are confident that the system GMM assumptions are satisfied (which does not depend on the specific value of the autoregressive coefficient as long as it is smaller than 1), you could check the estimate from the system GMM estimator. Alternatively, you could use the nonlinear Ahn and Schmidt (1995, Journal of Econometrics) estimator, which also mitigates the weak-instruments problem but does not require the additional system GMM assumptions.
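
A minimal sketch of how points 2) and 3) could be put into practice, again assuming the abdata example data set and illustrative instrument choices that are not taken from this thread. The underid options shown are written from memory of the presentation slides and should be treated as an assumption; please verify them with "help underid" before relying on them.

```stata
* Illustrative sketch only: data set, variables, and lag ranges are assumptions.
* ssc install xtdpdgmm
* ssc install underid
webuse abdata, clear

* (a) Difference GMM, followed by an underidentification test.
xtdpdgmm L(0/1).n w k, model(diff) collapse ///
    gmm(n, lag(2 4)) gmm(w k, lag(1 3)) ///
    teffects two vce(r)
underid, underid kp sw noreport   // options are an assumption; see help underid

* (b) System GMM: adds differenced instruments for the level model.
xtdpdgmm L(0/1).n w k, model(diff) collapse ///
    gmm(n, lag(2 4)) gmm(w k, lag(1 3)) ///
    gmm(n, lag(1 1) diff model(level)) ///
    gmm(w k, lag(0 0) diff model(level)) ///
    teffects two vce(r)

* (c) Ahn-Schmidt: difference GMM instruments plus the nonlinear moment
*     conditions that are valid under absence of serial correlation.
xtdpdgmm L(0/1).n w k, model(diff) collapse ///
    gmm(n, lag(2 4)) gmm(w k, lag(1 3)) ///
    nl(noserial) teffects two vce(r)
```

Comparing the coefficient on L.n across (a), (b), and (c) is only informative, not conclusive: a much lower estimate from (a) than from (b) and (c) would be consistent with a weak-instruments problem in the difference GMM estimator, but it does not prove it.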
Leave a comment:
