xtabond2 vs xtdpdml

Prateek Bedi

Join Date: Sep 2018

Posts: 199
#1

23 Oct 2018, 06:22

Hi,

I am working on a dynamic panel model with ~700 cross-sectional units and 16 years of data. Since I am a beginner with Stata and dynamic panel econometrics, I have been studying the application of xtabond2, which I believe is suitable for my analysis (the extant literature also uses it). However, I recently came across this link: https://www3.nd.edu/~rwilliam/dynamic/ which proposes dynamic panel estimation by maximum likelihood instead of GMM. I have the following queries:

1. What advantages does xtdpdml have over xtabond2?
2. How do I choose between the two?
3. Are there any points of caution to keep in mind while using xtdpdml (I ask this keeping in view the limited content available on xtdpdml and that no paper in my domain has used it yet)?

Thanks!
Sebastian Kripfganz

Join Date: May 2014

Posts: 2587
#2

23 Oct 2018, 10:43

There is no short and easy answer to your question. It depends in particular on the model you have in mind. You should first specify the model; the choice of the estimator comes second. The model specification includes questions like:
Are your regressors strictly exogenous, predetermined, or endogenous?

If your regressors are endogenous, are their own lags valid instruments or do you need external instruments?

What does the error structure of your model look like? Does your model have a standard error components structure with a "fixed effect" and a serially uncorrelated idiosyncratic error term, or is the idiosyncratic error allowed to be correlated over time?

On Richard Williams' website, you can find several papers that describe the xtdpdml command and its underlying model assumptions. You should have a look at them first.

You can find more on GMM estimation of linear dynamic panel data models here:
XTDPDGMM: new Stata command for efficient GMM estimation of linear (dynamic) panel models with nonlinear moment conditions

https://www.kripfganz.de/stata/
Richard Williams

Join Date: Apr 2014

Posts: 4983
#3

23 Oct 2018, 11:25

The first paper using xtdpdml just came out in Summer 2017, so it hasn't had time to take over the world yet. But as Sebastian notes, you can read about it at

https://www3.nd.edu/~rwilliam/dynamic/

I will note that xtdpdml is for large N / small T problems. Somebody just asked me about a problem with T = 97 and N = 82. There is no way xtdpdml can estimate a model like that. If you reduce it to T = 10 you might have a chance.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
Prateek Bedi

Join Date: Sep 2018

Posts: 199
#4

24 Oct 2018, 04:02

Alright. Firstly, thanks a lot for your valuable response. So my model is like this:

Dependent Variable: CashHoldings
Independent Variables: FirmSize Leverage NetWorkingCapital Profitability MV/BV CashFlow Dividend CapitalExpenditure CashFlowVol FirmDiversification PromoterShares i.Year
Regarding regressors, I believe the model has a combination of exogenous, predetermined, and endogenous variables (I understand this is a theoretical call which I have to take with my supervisor).

For endogenous regressors, I believe their own lags can serve as instruments. However, is there a statistical way to test this, or is it also a theoretical call?

The model indeed has firm-level fixed effects. In fact, I also need to introduce time dummies to capture cross-sectionally invariant (i.e. macroeconomic) variables.

Regarding the idiosyncratic error term, it is allowed to be correlated over time.

Further, I had earlier used a fixed effects model (without the lagged dependent variable) and found heteroskedasticity (using xttest3) and serial correlation (using xtserial).

I request you to guide me further on model selection.

Thanks!
Prateek Bedi

Join Date: Sep 2018

Posts: 199
#5

24 Oct 2018, 04:07

Thanks a lot, Sir. I really appreciate your reply. Since my model has T > 10, I think it would be appropriate not to apply xtdpdml. However, I would like to congratulate and thank you for creating xtdpdml. I read the content on your website and found it useful.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2587
#6

24 Oct 2018, 07:09

Once you have specified your initial model and estimated it by GMM, you can test the validity of the instruments with a Hansen or Difference-in-Hansen overidentification test. Depending on the outcome of the test, you might have to rethink some of your model assumptions.

You can include time-fixed effects but be aware that there is a bug in xtabond2 that yields incorrect degrees of freedom and p-values for the overidentification tests if you specify the time dummies with the factor variable notation. You can avoid that problem with the xtdpdgmm command.

Allowing the idiosyncratic error term to be correlated over time is problematic because it implies that the lags of the dependent variable might not be valid instruments.

Heteroskedasticity is not a problem. Just use the two-step GMM estimator with robust standard errors.
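
As a minimal sketch of a two-step GMM run with robust standard errors, the do-file lines below use Stata's abdata example dataset rather than the cash-holdings data discussed in this thread. The variable choices, the lag limits, and the treatment of w and k as exogenous are illustrative assumptions only. xtabond2 is a community-contributed command and has to be installed first (for example with ssc install xtabond2).

```stata
* Illustrative only: abdata stands in for the real panel
webuse abdata, clear
xtset id year

* Two-step system GMM. Combined with twostep, the robust option
* requests Windmeijer-corrected standard errors.
* gmmstyle(): lags of L.n as GMM-type instruments (lag limits are an assumption)
* ivstyle():  w and k treated as exogenous (an assumption for this sketch)
xtabond2 n L.n w k, gmmstyle(L.n, laglimits(1 3) collapse) ///
    ivstyle(w k) twostep robust
```

xtabond2 prints the Hansen test and the difference-in-Hansen tests below the coefficient table, together with the Arellano-Bond AR(1) and AR(2) tests.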

https://www.kripfganz.de/stata/
Prateek Bedi

Join Date: Sep 2018

Posts: 199
#7

24 Oct 2018, 07:54

Thanks a lot, Sebastian Kripfganz. This is really helpful. Could you please explain the problem related to time dummies in xtabond2? How should I enter them in the model? I know that in FE, we enter them by adding i.Year to the command. What changes do I need to make?

Also, given the characteristics of my model, how do I choose between xtabond2 and xtdpdgmm?

Is there any way to deal with serial correlation in the idiosyncratic error term?

Thanks and Regards
Sebastian Kripfganz

Join Date: May 2014

Posts: 2587
#8

24 Oct 2018, 08:26

Coefficients that are displayed as omitted in the estimation output are still counted by xtabond2 as if they were estimated. This implies that the degrees of freedom for the overidentification tests are too small by the number of omitted coefficients. If you specify the dummies with the factor notation, xtabond2 automatically includes too many of them. You need to create new variables for the time dummies first and then specify them manually as regressors, making sure that none of them gets omitted. (Avoid the dummy trap!)
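
The manual approach could look roughly like the sketch below. It uses Stata's abdata example dataset, and the dummy names (yd1 to yd9) and the choice of which dummies to leave out are assumptions for this sample only; the right set depends on the years that actually enter your estimation, so check the output and confirm that no coefficient is shown as omitted.

```stata
* Illustrative only: abdata stands in for the real panel
webuse abdata, clear
xtset id year

* Create one dummy per year by hand instead of using i.year.
* In abdata this yields yd1 (1976) through yd9 (1984).
tabulate year, generate(yd)

* With L.n as a regressor, 1976 drops out of the estimation sample, so this
* sketch leaves out yd1 and uses yd2 (1977) as the base year next to the
* constant. This choice is an assumption; verify it against the output.
xtabond2 n L.n w k yd3-yd9, gmmstyle(L.n, laglimits(1 3) collapse) ///
    ivstyle(w k yd3-yd9) twostep robust
```

If the output still lists a dummy as omitted, remove it from both the regressor list and ivstyle() and re-run before reading the Hansen test.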

Alternatively, you can use the xtdpdgmm command with its teffects option, which automatically generates the correct number of time effects. In principle, you can do most things with xtdpdgmm that you could also do with xtabond2. The commands differ in some aspects that are not really relevant if you just want to run a conventional GMM estimation. If you specify the commands correctly, you should obtain the same results with both of them.
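
A minimal sketch of the teffects route, again on Stata's abdata example dataset: the lag ranges and the classification of w and k are illustrative assumptions, not recommendations for any particular model. xtdpdgmm is community-contributed and must be installed first (it is distributed via the author's site linked in this thread).

```stata
* Illustrative only: abdata stands in for the real panel
webuse abdata, clear
xtset id year

* Difference GMM with collapsed instruments; teffects adds the time dummies
* so that they do not have to be created or counted by hand
xtdpdgmm L(0/1).n w k, model(diff) collapse ///
    gmm(n, lag(2 4)) gmm(w k, lag(1 3)) ///
    teffects twostep vce(robust)

* Hansen overidentification test after estimation
estat overid
```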

If there is serial correlation in your error term, one approach would be to add additional lags of the dependent variable as regressors. Ideally, this might remove the serial correlation from the error term. Alternatively, you could start with deeper lags as instruments but this might lead to a weak instruments problem.
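
As a rough illustration of the first approach, the sketch below adds a second lag of the dependent variable and then tests for remaining serial correlation. It uses Stata's abdata example dataset, and the lag ranges are assumptions chosen only to keep the example small.

```stata
* Illustrative only: abdata stands in for the real panel
webuse abdata, clear
xtset id year

* L(0/2).n puts both L.n and L2.n on the right-hand side
xtdpdgmm L(0/2).n w k, model(diff) collapse ///
    gmm(n, lag(2 4)) gmm(w k, lag(1 3)) ///
    teffects twostep vce(robust)

* Arellano-Bond tests for serial correlation in the first-differenced errors
estat serial, ar(1/3)
```

For the second approach, the lag range of the instruments for the dependent variable would start later, for example gmm(n, lag(3 5)) instead of gmm(n, lag(2 4)).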

https://www.kripfganz.de/stata/
Prateek Bedi

Join Date: Sep 2018

Posts: 199
#9

25 Oct 2018, 04:23

Thanks a lot, Sebastian Kripfganz. I would also like to know about xtdpdsys and how it differs from xtabond2 and xtdpdgmm. Given the attributes of my model in #4, which of these three commands should I study in detail for implementation?

Thanks!
Sebastian Kripfganz

Join Date: May 2014

Posts: 2587
#10

25 Oct 2018, 04:42

xtdpdsys is Stata's official command for linear dynamic system-GMM panel estimation. Everything you can do with xtdpdsys can be done with xtabond2 or xtdpdgmm as well, but xtdpdsys has serious limitations. For example, you cannot collapse the instrument matrices, which is one of the easiest ways to avoid the problem of too many instruments. Furthermore, xtdpdsys does not compute a Hansen overidentification test. Because of these two shortcomings, I do not recommend using xtdpdsys.
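
To see what collapsing does to the instrument count, one could run the same specification twice, as in the sketch below. It uses Stata's abdata example dataset with xtabond2, and the specification itself is an illustrative assumption.

```stata
* Illustrative only: abdata stands in for the real panel
webuse abdata, clear
xtset id year

* Full GMM-style instrument set: separate instruments for each period and lag
xtabond2 n L.n w k, gmmstyle(L.n) ivstyle(w k) twostep robust

* Collapsed instrument set: one instrument column per lag distance
xtabond2 n L.n w k, gmmstyle(L.n, collapse) ivstyle(w k) twostep robust
```

Comparing the "Number of instruments" line in the two output headers shows how much smaller the collapsed set is.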

As I am the author of xtdpdgmm, you can imagine which command I would recommend but I am clearly biased on that matter.

https://www.kripfganz.de/stata/
Prateek Bedi

Join Date: Sep 2018

Posts: 199
#11

25 Oct 2018, 10:27

Thanks a lot Sebastian Kripfganz for your honest and helpful guidance!! Really appreciate!! And thanks also for xtdpdgmm.
John Sgr

Join Date: Sep 2020

Posts: 28
#12

14 Jun 2021, 04:36

Dear Sebastian,

Is it possible to use the xtdpdgmm package to test the mediating role of Z between Y and X while keeping the lagged endogenous variables in the equation? Also, I did not fully understand whether the xtdpdml package lets us define mediator variables as an SEM estimator would, or whether it just uses the computational approach of SEM for dynamic panel estimation. Any guess?
Sebastian Kripfganz

Join Date: May 2014

Posts: 2587
#13

14 Jun 2021, 04:46

Mediation analysis involves the estimation of multiple equations. Both xtdpdgmm and xtdpdml are for single-equation estimation. They do not have any functionality to explicitly define any mediator variables. You could possibly still estimate the equations separately with those two commands and then compute the desired effects manually.

https://www.kripfganz.de/stata/
John Sgr

Join Date: Sep 2020

Posts: 28
#14

15 Jun 2021, 04:00

I see, so it is not possible to run a bootstrap test after the xtdpdgmm command. What could be an alternative?