Help needed for xtabond2 codes and classification

Yui Chan

Join Date: Aug 2025

Posts: 1
#1

11 Aug 2025, 13:56

Greetings,

I'm new to Stata and have never coded before; I am now using it for data analysis.
I have a balanced panel with N > T and no missing data.
After trying reg and xtreg, I concluded that two-step xtabond2 is the better choice for estimating my model.
I'm trying to examine the determinants of dividends after COVID with the following model:
Div(t) = a1*Div(t-1) + a2*Div(t-2) + bj*Xj + cj*Xj*C + YEAR + u
where the Xj are explanatory variables (all expected to be significant before adding the interactions), C is a COVID dummy, and YEAR controls for year effects; all variables are ln-transformed.
The following code gave results closest to what theory predicts:
xtabond2 ln_DIV L(1/2).ln_DIV ln_ROA ln_LEV ln_ASSET ln_ROA_C ln_LEV_C ln_ASSET_C ib2019.YEAR, twostep robust small gmm(L.ln_DIV ln_ASSET , collapse) gmm(ln_ASSET, p lag(1 .) equation(level) collapse) iv(ln_ROA ln_LEV ib2019.YEAR, equation(level))

Here is another specification that gave significant results and also passed the tests (see the attached file). Something feels off about it, perhaps the placement of the variables in iv() or gmm(), but I am not sure:
xtabond2 ln_DIV L(1/2).ln_DIV ln_ROA ln_LEV ln_ASSET ln_ROA_C ln_LEV_C ln_ASSET_C ib2019.YEAR, twostep robust small gmm(L.ln_DIV , collapse) gmm(ln_ASSET , p lag(1 .) equation(level) collapse) iv(ln_ROA_C ln_LEV_C ln_ASSET_C ln_ROA ln_LEV ib2019.YEAR, equation(level))

I'm not sure whether the variables after the comma are in the right place. Another concern is that ln_ASSET is persistent, so I don't really know where to put it.

Edit: the specifications that give significant results fail the difference-in-Sargan test, and vice versa.

Please give me some advice, any suggested changes to the code, or an explanation of how the options after the comma work.
Many thanks in advance.

Yours faithfully,
Yui
Attached Files

Last edited by Yui Chan; 11 Aug 2025, 14:27.
Tags: Suggestion, syntax, xtabond2
Sebastian Kripfganz

Join Date: May 2014

Posts: 2606
#2

12 Aug 2025, 04:15

Main advice: You should not search for specifications that give you significant results. That's data mining, or p-hacking, or whatever you want to call it. It's not acceptable as good academic practice.

You should specify the model according to the underlying theory (e.g., your theory should tell you which variables might be exogenous, predetermined, or endogenous), possibly combined with specification tests. Ultimately, nobody here can really tell you where to put your variables, because that needs to come out of the theory behind your application. Before you play around too much with advanced estimation commands, it is crucial that you understand the underlying econometric theory.
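
For illustration only, here is a minimal sketch of how such a classification is conventionally translated into the options after the comma. The classification used here (ln_ROA as endogenous, ln_LEV as predetermined, the year dummies as strictly exogenous) is purely an assumption made for the example, not a recommendation for this application, and the interaction terms and ln_ASSET are left out to keep it short. The `///` line continuations assume the command is run from a do-file.

```stata
* Hypothetical classification, for illustration only:
*   L.ln_DIV     lagged dependent variable (predetermined by construction)
*   ln_ROA       assumed endogenous
*   ln_LEV       assumed predetermined
*   ib2019.YEAR  year dummies, assumed strictly exogenous
xtabond2 ln_DIV L(1/2).ln_DIV ln_ROA ln_LEV ib2019.YEAR, ///
    gmm(L.ln_DIV, lag(1 .) collapse)  /// lags 1 and deeper of L.ln_DIV
    gmm(ln_ROA,   lag(2 .) collapse)  /// endogenous: lags 2 and deeper
    gmm(ln_LEV,   lag(1 .) collapse)  /// predetermined: lags 1 and deeper
    iv(ib2019.YEAR, equation(level))  /// exogenous: instruments itself
    twostep robust small
```

The general pattern is that gmm() holds variables that need lagged instruments, with the starting lag in lag() reflecting whether the variable is treated as predetermined or endogenous, while iv() holds variables that are treated as strictly exogenous. Which variable goes where still has to follow from the theory.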

The following presentation might be a good starting point, with additional references therein:
Kripfganz, S. (2019). Generalized method of moments estimation of linear dynamic panel data models. Proceedings of the 2019 London Stata Conference.

https://www.kripfganz.de/stata/