
  • I am afraid the new option introduced in post #689 was released prematurely. To properly compute (Windmeijer-corrected) standard errors and for some postestimation tests, knowing the first-step residuals is not enough. I have thus decided to replace the wmatrix(residuals) option by a new option called first(), which lets you specify the name of stored estimation results, so that xtdpdgmm can obtain all the information it needs. Please see the help file for details on the syntax of this new option.

    The previous example no longer works, but the following code serves the same purpose:
    Code:
    webuse abdata
    xtdpdgmm L(0/1).n w k, gmm(L.n w k, lag(1 4) collapse model(diff)) vce(robust)
    estimates store diffgmm
    xtdpdgmm L(0/1).n w k, gmm(L.n w k, lag(1 4) collapse model(diff)) iv(L.n w k, diff) first(diffgmm) vce(robust)
    To update xtdpdgmm to version 2.7.2, as always type:
    Code:
    net install xtdpdgmm, from(http://www.kripfganz.de/stata/) replace
    Last edited by Sebastian Kripfganz; 18 Jul 2025, 07:06.
    https://www.kripfganz.de/stata/



    • Dear Sebastian,

      thank you very much for your service and help to the community.

      If I may, I would like to ask three questions regarding the xtdpdgmm command and its relation to xtabond2. I am new to dynamic panel models as well as to Stata, so please excuse any fundamental misunderstanding my questions might reveal.

      1) I have a large panel data set with 700,000 to 850,000 observations, depending on the specification (highly unbalanced, sometimes with many gaps per individual). In this environment I find xtabond2 to be much quicker than xtdpdgmm. I often have to interrupt xtdpdgmm before it finishes, whereas xtabond2 usually completes in a reasonable amount of time. Is this behaviour expected, or am I overlooking something (e.g., some speed-up options) with xtdpdgmm?

      2) My data is survey data, and there are weights for each observation to make the sample representative. In xtabond2, there is an option to include survey weights, while in xtdpdgmm I do not see such functionality. Is this correct?

      3) I read that with unbalanced data, forward deviations might be a more appropriate model transformation, but that there might be issues with doing that in xtabond2. Did I understand that correctly and, if so, how severe is the problem?

      Thank you very much in advance and kind regards,
      Andreas



      • 1) Admittedly, xtdpdgmm can be slow with such large data sets. It is not optimized in this regard. The purpose of xtdpdgmm is to provide a lot of additional flexibility. However, this flexibility adds computational overhead that may not be needed in many situations. I am sure there are ways to program xtdpdgmm more efficiently, but that's not going to happen in the near future due to my time constraints.

        2) Weights are not (yet) available with xtdpdgmm. As above, I cannot make any promise about implementing them.

        3) I believe that there should not be any relevant issue specifically with forward-orthogonal deviations in the latest version of xtabond2 (version 3.7.2). You could just check with a small subset of your data set whether you can replicate the xtdpdgmm results with xtabond2. If you can, then you can be reassured and use xtabond2 on the large data set.

        EDIT: Make sure to check whether xtabond2 uses the appropriate lags for the instruments with forward-orthogonal deviations!
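
        A minimal sketch of how such a subset could be drawn, assuming the abdata example data set (panel identifier id, time variable year) stands in for the large data set; the seed and the 50% share are arbitrary illustrative choices. Whole panels are kept or dropped so that each unit's time series stays intact:
        Code:
        webuse abdata, clear
        set seed 12345
        // one uniform draw per panel, copied to all observations of that panel
        bysort id: generate double u = runiform() if _n == 1
        bysort id (u): replace u = u[1]
        // keep roughly half of the panels, then restore the panel sort order
        keep if u < 0.5
        drop u
        xtset id year
        // next: run the same specification with xtdpdgmm and xtabond2 and compare the output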
        Last edited by Sebastian Kripfganz; 16 Sep 2025, 06:56.


        • Dear Sebastian,

          thank you so much for your support, it helps tremendously.

          Regarding your point w.r.t. the lags of the instruments in the forward-deviations case in xtabond2, I ran the following example code:

          Code:
          webuse abdata, clear 
          
          // model 1
          xtabond2 L(0/1).n L(0/1).w L(0/1).k, gmm(n w k, lag(3 5)) nol orthogonal twostep robust 
          
          // model 2
          xtabond2 L(0/1).n L(0/1).w L(0/1).k, gmm(n w k, lag(2 4)) nol orthogonal twostep robust 
          
          // model 3
          xtdpdgmm L(0/1).n L(0/1).w L(0/1).k, gmm(n w k, l(2 4) m(fodev)) twostep vce(robust)
          Based on the typed commands, I would have expected models 2 and 3 to give identical results, but in fact models 1 and 3 give identical results (at least for the coefficients, which is what I checked). Is this what you referred to in your edit?

          Thank you again very much!!

          Best wishes,
          Andreas



          • Yes, this is the problem with xtabond2. If you know how to correctly specify the lags, everything is fine. However, the lag specification is counterintuitive because xtabond2 internally shifts the observations by 1 period when computing forward-orthogonal deviations.
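
            To illustrate the shift, here is a minimal sketch based on the example from the previous post, assuming both xtabond2 and xtdpdgmm are installed; the stored-result names fod_xtabond2 and fod_xtdpdgmm are arbitrary. It places the xtabond2 estimates with lag(3 5) next to the xtdpdgmm estimates with lag(2 4):
            Code:
            webuse abdata, clear
            // xtabond2: lags stated one period later because of the internal shift
            xtabond2 L(0/1).n L(0/1).w L(0/1).k, gmm(n w k, lag(3 5)) nol orthogonal twostep robust
            estimates store fod_xtabond2
            // xtdpdgmm: lags stated as intended
            xtdpdgmm L(0/1).n L(0/1).w L(0/1).k, gmm(n w k, l(2 4) m(fodev)) twostep vce(robust)
            estimates store fod_xtdpdgmm
            // side-by-side coefficient table
            estimates table fod_xtabond2 fod_xtdpdgmm, b(%10.6f)
            If the two commands label a coefficient differently, or only one of them reports it, the rows will not line up, but the values can still be compared by eye.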


            • Dear Prof. Kripfganz,

              I am working with firm-level panel data with large N and small/moderate T (T = 10-20) and a dynamic specification including a lagged dependent variable. Several regressors are plausibly predetermined or endogenous (rather than strictly exogenous), so I am using Arellano–Bond / Arellano–Bover–Blundell–Bond type GMM as the baseline.

              Could you please point me to recent methodological advances (say, post-2020) that you consider particularly important for this setting, especially work addressing (i) weak internal instruments under high persistence, (ii) instrument proliferation and instrument-selection/regularization, (iii) improved inference (finite-sample corrections, robust/cluster-robust variance, small-sample issues), (iv) credible diagnostics for identification strength in dynamic panel GMM, and (v) cross-sectional dependence/common shocks (beyond simply adding time effects) and how it affects estimation and inference in short dynamic panels?

              In terms of implementation, I would also be grateful if you could clarify which of these issues/diagnostics and recommended practices are directly supported within Stata’s xtdpdgmm (via its options and postestimation output), and which typically require additional routines or alternative commands.

              If there are specific papers, working papers, or survey-style references you would recommend (or practical guidance on current best practices for applied work), I would be very grateful.

              Thank you in advance for your time.

              Best regards,
              Filip



              • Dear Prof. Kripfganz,

                I ran the following command to find out the determinants of corporate cash holdings for an unbalanced dataset of 1696 firms over 16 years.

                Code:
                xtdpdgmmfe CashHolding3_w Leverage2_w Size1_w Liquidity3_w DividendDummy PromoterSharesin1_w GrowthPotential1_w CapitalExpenditure2_w OperatingCashflow_w Profitability1_w Cashflowvol1_w RD1_w, lags(1) endogenous(Leverage2_w Liquidity3_w DividendDummy PromoterSharesin1_w GrowthPotential1_w CapitalExpenditure2_w OperatingCashflow_w Profitability1_w Cashflowvol1_w RD1_w) exogenous(Size1_w) teffects twostep vce(cluster firm_id) collapse curtail(2) stationary nofootnote nonl small
                I then used the following command to obtain the predicted values of the dependent variable. My purpose is to determine whether a firm holds excess or deficit cash by comparing actual cash holdings with predicted cash holdings.

                Code:
                 predict PCash if e(sample)
                However, I would like to know whether “fitted values” are even appropriate in a dynamic panel system-GMM model. Dynamic panel estimation does not estimate individual fixed effects explicitly, uses instrumented lagged variables, combines equations in first differences and levels, and estimates parameters from moment conditions (not least squares). I am therefore not sure whether predicted values from a system-GMM estimator are meaningful. Although the 'predict' command will return some result, is it logically sound to use it to determine whether a firm holds excess or deficit cash?

                Thanks in advance.
