When to use GMM, sequential estimation, lag selection, and reporting

Man Khanh Chau

Join Date: Feb 2026

Posts: 5
#1

10 Mar 2026, 04:18

Dear Statalist,

I have a few related questions about handling the lagged dependent variable in dynamic panel models using xtabond2 (two-step system GMM). I would greatly appreciate your insights, especially regarding standard practice in applied econometrics papers.
If economic theory or prior evidence clearly indicates that the dependent variable is persistent (i.e., the lagged dependent variable L.y is theoretically important and should be included), is it best practice to estimate the model directly using two-step system GMM with xtabond2 from the beginning? Or is it still common/recommended to first run simpler static models (pooled OLS, fixed effects, random effects) without the lagged dependent variable, even if persistence is expected?

Many papers seem to follow this sequence:
1. First estimate static panel models (pooled OLS / FE / RE) without L.y.
2. Then add L.y and switch to dynamic GMM (xtabond2) to address dynamic endogeneity and Nickell bias.

Is this sequential approach methodologically sound and widely accepted? Or is it problematic/inconsistent because static models ignore persistence from the start, and one should prefer dynamic GMM directly when autocorrelation in the dependent variable is anticipated?

In xtabond2 (system GMM), when instrumenting the lagged dependent variable and endogenous regressors (e.g., gmm(L.y ..., lag(2 .) collapse)), is there usually only one specific lag structure (one particular lag range) that makes the model "valid" (i.e., passes all the diagnostic tests), or can several lag structures be valid?

In the methodology section of a paper or thesis, how should one clearly describe the treatment of the lagged dependent variable?
What level of detail do referees typically expect?

Many thanks in advance for your guidance - these issues seem quite common but the "best practice" is not always crystal clear from reading papers.

Best regards,
Mẫn Khanh Châu
Manh Hoang Ba

Join Date: Aug 2023

Posts: 87
#2

21 Mar 2026, 22:11

Please note that I am not a reviewer, so the following are just some responses based on my personal experience:

1) If persistence is a clear property of the dependent variable, we should start with a model that includes a lag of the dependent variable as an explanatory variable. Otherwise, inference may be poor because of autocorrelated residuals (in the most optimistic case) or, more seriously, the estimates may be biased/inconsistent because of endogeneity arising from feedback from y to future values of the explanatory variables.
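To make the argument in point 1 concrete, here is a short sketch. The notation (\rho, \beta, \alpha_i, \varepsilon_{it}, u_{it}) is assumed for illustration and does not come from the question.

```latex
% Dynamic model with an individual effect (assumed notation):
\begin{align}
y_{it} &= \rho\, y_{i,t-1} + x_{it}'\beta + \alpha_i + \varepsilon_{it}, \qquad |\rho| < 1 \\
% A static model drops the lag, so the lag moves into the error term:
y_{it} &= x_{it}'\beta + \alpha_i + u_{it}, \qquad u_{it} = \rho\, y_{i,t-1} + \varepsilon_{it}
\end{align}
```

If \rho \neq 0, the static error u_{it} inherits the serial correlation of y_{i,t-1}, which is the "optimistic" case. If x_{it} also responds to past y, then x_{it} is correlated with u_{it}, and the static estimate of \beta is inconsistent.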

2) Regarding the analysis process: we can start by estimating a static model (POLS, FEM, REM) and then use DGMM/SGMM to deal with issues such as autocorrelation, heteroskedasticity, and endogeneity. However, if we are confident that the dependent variable is persistent and include its lag as an explanatory variable, POLS, FEM, and REM become biased (the Nickell bias, in the FEM case), so their results carry very limited useful information. I think we should skip them and use DGMM/SGMM directly.
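One limited use that is sometimes made of the dynamic POLS and FEM estimates (discussed in Roodman, 2009) is as rough bounds: the POLS coefficient on the lagged dependent variable tends to be biased upward and the FEM coefficient downward, so a credible GMM estimate would be expected to lie between them or close to that range. A minimal sketch of that check follows. The abdata example dataset, the regressors w and k, and the choice to treat them as strictly exogenous are assumptions made purely for illustration, not a recommended specification.

```stata
* Sketch only: dataset, variables, and exogeneity assumptions are illustrative
webuse abdata, clear
xtset id year

* Pooled OLS with the lag: coefficient on L.n tends to be biased upward
regress n L.n w k, vce(cluster id)
scalar b_ols = _b[L.n]

* Fixed effects with the lag: biased downward in short panels (Nickell bias)
xtreg n L.n w k, fe vce(cluster id)
scalar b_fe = _b[L.n]

* Two-step system GMM (user-written command: ssc install xtabond2)
xtabond2 n L.n w k, gmm(L.n, lag(1 .) collapse) iv(w k) twostep robust small
scalar b_gmm = _b[L.n]

* Compare the three coefficients on the lagged dependent variable
display "POLS: " b_ols "   FEM: " b_fe "   SGMM: " b_gmm
```

If the SGMM coefficient falls well outside the FEM-POLS range, that is usually a reason to revisit the instrument set rather than a result to report as is.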

3) There are many valid lag structures for constructing the instrument matrix. One can vary the lag structure to check the sensitivity and robustness of the results. Information that should be reported when using DGMM/SGMM includes: the classification of the explanatory variables (endogenous, exogenous, predetermined), the lag structure used in the instrument matrix, the number of instruments, the Hansen/Sargan test, the Arellano-Bond autocorrelation tests for the differenced errors, ... (see Roodman, 2009). However, most papers using this method present the findings vaguely, or even fail to report the number of instruments and the result of the Arellano-Bond AR(1) test. In this regard, I think Kiviet's (2020) classification procedure would be helpful.
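The sensitivity check and the reporting items in point 3 could be sketched as follows. The dataset (abdata), the regressors, the lag limits 2, 3, and 4, and the exogeneity of w and k are hypothetical choices for illustration. The e() names are the ones I recall from the xtabond2 help file, so it is worth confirming them with ereturn list after estimation.

```stata
* Sketch only: re-estimate with different upper lag limits and collect diagnostics
webuse abdata, clear
xtset id year

foreach maxlag in 2 3 4 {
    * gmm(L.n, lag(1 `maxlag')) uses lags 1..maxlag of L.n as instruments
    quietly xtabond2 n L.n w k, gmm(L.n, lag(1 `maxlag') collapse) ///
        iv(w k) twostep robust small

    * Items to report: instrument count vs. groups, AR tests, Hansen test
    display "lag(1 `maxlag'): instruments = " e(j) ", groups = " e(N_g) ///
        ", AR(1) p = " %5.3f e(ar1p) ", AR(2) p = " %5.3f e(ar2p) ///
        ", Hansen p = " %5.3f e(hansenp) ", b[L.n] = " %6.3f _b[L.n]
}
```

If the coefficient on the lagged dependent variable and the test results are broadly stable across these rows, that supports the chosen specification. Large swings suggest the results depend on an arbitrary instrument choice, which is worth stating in the paper.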

Kiviet, J. F. (2020). Microeconometric dynamic panel data methods: Model specification and selection issues. Econometrics and Statistics, 13, 16-45.
Roodman, D. (2009). How to do xtabond2: An introduction to difference and system GMM in Stata. The Stata Journal, 9(1), 86-136.

Manh Hoang-Ba