Dear Statalist,
I have a few related questions about handling the lagged dependent variable in dynamic panel models using xtabond2 (two-step system GMM). I would greatly appreciate your insights, especially regarding standard practice in applied econometrics papers.
- If economic theory or prior evidence clearly indicates that the dependent variable is persistent (i.e., the lagged dependent variable L.y is theoretically important and should be included), is it best practice to estimate the model directly using two-step system GMM with xtabond2 from the beginning? Or is it still common/recommended to first run simpler static models (pooled OLS, fixed effects, random effects) without the lagged dependent variable, even if persistence is expected?
- Many papers seem to follow this sequence:
  - First, estimate static panel models (pooled OLS / FE / RE) without L.y.
  - Then add L.y and switch to dynamic GMM (xtabond2) to address dynamic endogeneity and Nickell bias.
  Is this sequential approach methodologically sound and widely accepted? Or is it problematic/inconsistent because the static models ignore persistence from the start, so that one should go directly to dynamic GMM when autocorrelation in the dependent variable is anticipated?
- In xtabond2 (system GMM), when instrumenting the lagged dependent variable and endogenous regressors (e.g., gmm(L.y ..., lag(2 .) collapse)), is there usually only one specific lag structure (i.e., one particular lag range) that makes the model "valid" in the sense of passing all the diagnostic tests, or can several different lag structures do so?
- In the methodology section of a paper or thesis, how should one clearly describe the treatment of the lagged dependent variable? What level of detail do referees typically expect?
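To make the questions above concrete, the sequence and the lag choices could be sketched roughly as follows. This is only an illustration under stated assumptions: the variable names (id, year, y, x1, x2), the classification of x1 as endogenous and x2 as exogenous, and the lag ranges are all hypothetical placeholders, not an actual dataset or a recommended specification.

```stata
* Hypothetical panel: id = unit, year = time; y, x1, x2 are placeholder names.
* Run from a do-file (the /// line continuations do not work interactively).
xtset id year

* Step 1: static benchmarks without L.y
regress y x1 x2 i.year, vce(cluster id)
xtreg   y x1 x2 i.year, fe vce(cluster id)
xtreg   y x1 x2 i.year, re vce(cluster id)

* Step 2: two-step system GMM with L.y (xtabond2 is user-written: ssc install xtabond2).
* Assumption: x1 is treated as endogenous, x2 and the year dummies as exogenous.
* Loop over alternative maximum lag depths to compare the diagnostics.
forvalues maxlag = 2/4 {
    xtabond2 y L.y x1 x2 i.year,                      ///
        gmm(L.y, lag(1 `maxlag') collapse)            ///
        gmm(x1,  lag(2 `maxlag') collapse)            ///
        iv(x2 i.year, equation(level))                ///
        twostep robust small
    * Instrument count and test p-values as stored by xtabond2
    display "max lag `maxlag': instruments = " e(j)   ///
        ", Hansen p = " e(hansenp) ", AR(2) p = " e(ar2p)
}
```

If several of these lag ranges pass the Hansen and AR(2) tests with an instrument count below the number of groups, the third question amounts to asking which of them one should report and how to justify that choice.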
Best regards,
Mẫn Khanh Châu
