XTDPDGMM: new Stata command for efficient GMM estimation of linear (dynamic) panel models with nonlinear moment conditions

Nicu Sprincean

Join Date: Nov 2016
Posts: 47

#31

24 Sep 2018, 10:26

Thank you for your quick response.
I want to run a model with your command, but I get an error. My model is as follows:

Code:

xtdpdgmm VaR_w l.VaR_w l.ers_w l.size_w l.npl_tloan_w l.roa_w  l.equity_ta_w l.gdp_growth_w l.inflation_w l.rule_law_w l.bank_conc_w l.fin_inter_w l.fin_free_w emerg, gmmiv(l.VaR_w l.ers_w ,lagrange(2 5) collapse model(level)) iv(l.size_w l.npl_tloan_w l.roa_w l.equity_ta_w l.gdp_growth_w l.inflation_w l.rule_law_w l.bank_conc_w l.fin_inter_w l.fin_free_w emerg,model(difference)) twostep  vce(robust)

where ers (exchange rate stability) is my main regressor, which I suspect is endogenous. The error is as follows:

Code:

xtdpdgmm_wmat():  3200  conformability error
          xtdpdgmm_est():     -  function returned error
                 <istmt>:     -  function returned error
r(3200);

Thank you.

Sebastian Kripfganz

Join Date: May 2014

Posts: 2588
#32

24 Sep 2018, 14:44

The bug reported by Nicu Sprincean has been fixed. Many thanks again for reporting this problem. It could occur in unbalanced panels with unusual patterns of missing observations when there were zero observations for the first differences of some groups but a positive number of observations for the corresponding levels. The updated version 1.1.3 is currently only available from my personal website:

Code:

net install xtdpdgmm, from(http://www.kripfganz.de/stata/) replace

https://www.kripfganz.de/stata/
Aziz Alomran

Join Date: Nov 2018

Posts: 1
#33

07 Nov 2018, 09:02

Dear Sebastian Kripfganz,

Thank you for this helpful command. I would like to ask whether it is appropriate to use this command when my dependent variable is binary.

Thank you in advance.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2588
#34

07 Nov 2018, 09:30

In principle, you can use it with a binary dependent variable. The model then is called a linear probability model. Whether the usual GMM instruments for the lagged dependent variable are strong enough or might become weak is a different story that cannot be answered in a general way.
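To make the term concrete, here is a sketch of the linear probability model in this setting. The notation (\(\lambda\), \(\beta\), \(\alpha_i\), \(u_{it}\)) is assumed here for illustration only, and the second line additionally assumes that the error term has conditional mean zero given the regressors and the group-specific effect:

```latex
\begin{align*}
y_{it} &= \lambda y_{i,t-1} + x_{it}'\beta + \alpha_i + u_{it}, \qquad y_{it} \in \{0, 1\}, \\
\Pr(y_{it} = 1 \mid y_{i,t-1}, x_{it}, \alpha_i) &= E[y_{it} \mid y_{i,t-1}, x_{it}, \alpha_i] = \lambda y_{i,t-1} + x_{it}'\beta + \alpha_i .
\end{align*}
```

The coefficients are then read as changes in the probability, and nothing restricts the fitted values to the unit interval.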

Sebastian Kripfganz

Join Date: May 2014

Posts: 2588
#35

21 Mar 2019, 09:28

A significant update to the xtdpdgmm package is now available for installation from my website. (For the time being, this new version is not yet available from SSC.) At least Stata 13 is now required.

Code:

net install xtdpdgmm, from("http://www.kripfganz.de/stata/") replace

The new version 2.0.3 comes with several additions and some syntax modifications that allow more flexibility:
The iterated GMM estimator suggested by Hansen, Heaton, and Yaron (1996) can now be used by specifying option igmm.

The options nl(noserial) and nl(iid) replace the previous options noserial and iid. (Previous syntax continues to work but is no longer documented.) The new syntax comes with a few more changes:
The suboption collapse can be used to reduce the T-1 (or T-2) nonlinear moment conditions into a single moment condition by adding up the time-period specific moment conditions.

The suboption weight() can be used to control the weight of the nonlinear moment conditions relative to the linear moment conditions in the initial weighting matrix. For example, weight(0) implies that the nonlinear moment conditions are effectively ignored when computing the first-step estimator.

By default, nl(iid) now differs from the previous option iid. The reason is a bit technical and can be ignored by most users: Previously, iid invoked the nonlinear moment conditions \(E [\bar{u}_i \Delta u_{it}]=0\) from equation (11b) in Ahn and Schmidt (1995), plus the linear moment conditions from equation (11a). Now, nl(iid) multiplies these nonlinear moment conditions by a group-specific factor \(\sqrt{T_i}\). This scaling factor ensures that the moment conditions remain meaningful when the time-series dimension becomes large. (Otherwise, the sample average \(\bar{u}_i\) would converge to zero, while it retains a positive variance after multiplication with the scaling factor.) In addition, the scaling factor helps to prevent groups with fewer observations from being overweighted in unbalanced panels. The old behavior can be restored by specifying the suboption norescale. Such a scaling factor is not needed for nl(noserial), which invokes the nonlinear moment conditions from equation (4) in Ahn and Schmidt (1995).
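As a rough sketch of why the scaling matters, treat \(u_{it}\) as iid over time with mean zero and variance \(\sigma^2\). This is an assumption made purely for illustration, and \(\sigma^2\) is notation introduced only for this sketch:

```latex
\begin{align*}
\text{previous option iid:} \quad & E[\bar{u}_i \Delta u_{it}] = 0, & \operatorname{Var}(\bar{u}_i) &= \frac{\sigma^2}{T_i} \to 0 \text{ as } T_i \to \infty, \\
\text{new option nl(iid):} \quad & E[\sqrt{T_i}\, \bar{u}_i \Delta u_{it}] = 0, & \operatorname{Var}(\sqrt{T_i}\, \bar{u}_i) &= \sigma^2 .
\end{align*}
```

Under this assumption the rescaled average keeps the same variance for every group, whatever its number of observations \(T_i\).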

Models with nonlinear moment conditions are now estimated with the two-step estimator by default. To obtain the one-step estimator (which remains the default without nonlinear moment conditions), you now need to explicitly specify the new option onestep. (Note that there exists no optimal weighting matrix for the one-step estimator with nonlinear moment conditions.)

By default, models with only linear moment conditions are now estimated using the analytical solutions to the first-order conditions. To still use the numerical Gauss-Newton algorithm instead, the option noanalytic can be specified. The latter is implied whenever nonlinear moment conditions are specified.

The postestimation command predict, iv now displays a list of instruments used in the estimation and attaches meaningful labels to the generated instrumental variables. To only display the list of instruments without creating new variables, specify the option nogenerate. Furthermore, collinear instruments are now automatically removed already before the estimation to avoid a singular weighting matrix.

The postestimation command estat overid now presents two versions of the Sargan-Hansen overidentification test. The first version is computed in the conventional way based on the weighting matrix from the last estimation step. The second version updates the weighting matrix another time, i.e. it uses the weighting matrix computed with the second-step rather than first-step residuals after two-step estimation. For the two-step and iterated GMM estimator, both versions are asymptotically equivalent but their finite-sample properties might differ. For the one-step GMM estimator, both versions are asymptotically invalid unless the initial weighting matrix is already optimal. (Updating the weighting matrix is not sufficient because the coefficients are still based on the inefficient initial weighting matrix.)

The conventional one-step VCE can now be computed with an estimated variance parameter from the residuals in deviations from within-group means or forward-orthogonal deviations by using the new vce() suboption model(). Computation of the variance parameter from the first-differenced residuals is still possible. (The previous suboption difference is now redundant and no longer documented.) By default, the variance parameter is computed from the level residuals. However, in most situations it is recommended to use the robust VCE that does not rely on this variance parameter.

The default setting for the suboption model() of the iv(), gmmiv() and vce() options can now be changed with the global option model(). This option is only recommended for experienced users. To avoid accidental model misspecification, I recommend always explicitly specifying the suboption model() for the instrumental variables.

When creating the instruments for some transformed models, the deviations from within-group means and the forward-orthogonal deviations, a similar rescaling factor as the one mentioned above is used to retain the same variance of the error term as before the transformation (under an iid assumption). See for example Arellano and Bover (1995). This was already the case in earlier versions. In this new version, the suboption norescale of the iv() and gmmiv() options can be used to avoid this rescaling. The default setting can also be changed for all instruments altogether with the global option norescale. This can be safely ignored by most users.
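As a hedged sketch of how two of the new options might be used, the following reuses the psidextract example data that can be loaded with webuse. The model and instrument choices are illustrative assumptions only, not a recommended specification, and the calls assume that xtdpdgmm 2.0.3 or later is installed:

```stata
webuse psidextract, clear

* Iterated GMM estimator, requested with option igmm
xtdpdgmm L(0/1).lwage wks union fem ed blk, gmm(L.lwage, l(1 4) c m(d)) iv(wks union, d m(d)) iv(fem ed blk, m(l)) igmm vce(r)

* Nonlinear moment conditions requested with nl(noserial);
* according to the notes above, two-step estimation is then the default
xtdpdgmm L(0/1).lwage wks union fem ed blk, gmm(L.lwage, l(1 4) c m(d)) iv(wks union, d m(d)) iv(fem ed blk, m(l)) nl(noserial) vce(r)
```

Whether either specification is sensible for these data is a separate question. The sketch is only meant to show where the options go in the command line.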

For detailed descriptions and examples, please see

Code:

help xtdpdgmm
help xtdpdgmm postestimation

I am grateful to Mark Schaffer for his comments that motivated some of the changes.

Referenced literature:
Ahn, S. C., and P. Schmidt (1995). Efficient estimation of models for dynamic panel data. Journal of Econometrics 68, 5-27.

Arellano, M., and O. Bover (1995). Another look at the instrumental variable estimation of error-components models. Journal of Econometrics 68, 29-51.

Hansen, L. P., J. Heaton, and A. Yaron (1996). Finite-sample properties of some alternative GMM estimators. Journal of Business & Economic Statistics 14, 262-280.

Last edited by Sebastian Kripfganz; 21 Mar 2019, 09:35.

Sebastian Kripfganz

Join Date: May 2014

Posts: 2588
#36

23 Apr 2019, 07:42

I have updated the program to version 2.0.4, primarily to fix a bug that occurred when cluster-robust standard errors were requested.

Prateek Bedi

Join Date: Sep 2018

Posts: 199
#37

17 May 2019, 06:12

Hello,

I am trying to include a squared term of an independent variable while using xtdpdgmm. I used the following factor-variable notation to create a squared term for a variable named 'PO'.

Code:

c.PO#c.PO

Although this factor notation worked in other commands, it somehow does not seem to work with xtdpdgmm. Any help in this regard is highly appreciated.

Thanks
Sebastian Kripfganz

Join Date: May 2014

Posts: 2588
#38

17 May 2019, 06:18

xtdpdgmm indeed currently does not work with factor variables. But you can just use the xi: prefix command:

Code:

xi: xtdpdgmm ...

Alternatively, you could create the squared term as a new variable separately before calling xtdpdgmm. (Note that margins is not currently supported either after xtdpdgmm.)

Prateek Bedi

Join Date: Sep 2018

Posts: 199
#39

26 May 2019, 08:58

I tried using the

Code:

xi: xtdpdgmm ...

prefix command with factor notations. However, I got the following error.

cWPromoterSharesin1#c: operator invalid
r(198);

My objective is to run the following model using xtdpdgmm:

Code:

xtdpdgmm Cash L.Cash Size Leverage PromoterShares c.PromoterShares#c.PromoterShares, teffects twostep vce(cluster CompanyID) gmmiv(L.Cash lag(1 1) model(fodev)) gmmiv(Leverage Liquidity lag(1 4) collapse model(fodev)) iv(PromoterSharesin c.PromoterShares#c.PromoterShares, model(level))

Apart from creating a new separate variable, is there any other way to use the squared term of an independent variable, or to multiply two independent variables, while using the xtdpdgmm command?

Your help is highly appreciated.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2588
#40

26 May 2019, 09:59

xi: only works with indicator variables, not with interactions of continuous variables. Currently, there is no alternative to creating a new separate variable.
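A minimal sketch of that workaround, using the variable name PO from the earlier question and a small made-up data set (the values and the new name PO_sq are purely illustrative):

```stata
clear
input id year PO
1 2015 0.20
1 2016 0.35
2 2015 0.50
2 2016 0.10
end

* Create the squared term as an ordinary variable
generate double PO_sq = PO^2

* PO_sq can then be listed like any other regressor or instrument, e.g.
* xtdpdgmm ... PO PO_sq, ...
list id year PO PO_sq
```

A product of two different variables can be created in the same way with generate before the estimation command is called.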

Prateek Bedi

Join Date: Sep 2018

Posts: 199
#41

28 May 2019, 05:46

Thanks a lot for this clarification, Prof. Sebastian!!

Lars Kleinhuis

Join Date: May 2019
Posts: 1

#42

28 May 2019, 09:01

Hello,

I am currently working with the xtabond2 command and wish to extend my analysis using nonlinear moment conditions. However, I have trouble replicating my xtabond2 results with xtdpdgmm. The current setup is as follows:

Code:

xtabond2 GDPgr l1.lnGDPpc rem FDindex intrac lncap lnhcap lnwpop lntrade lngov infl t*, ///
        gmm(lnGDPpc rem FDindex intrac, lag(0 2) eq(both)) ///
        gmm(lncap lnhcap infl lntrade, la(1 2) eq(both)) ///
        iv(lnwpop lngov t*, equation(level)) ///
        twostep robust orthogonal

xtdpdgmm GDPgr L1.lnGDPpc rem FDindex intrac lncap lnhcap lnwpop lntrade lngov infl, ///
        gmm(lnGDPpc rem FDindex intrac, bodev la(0 2) mo(fodev)) ///
        gmm(lncap lnhcap infl lntrade, bodev la(1 2) mo(fodev)) ///
        iv(lnwpop lngov, mo(level)) ///
        teffects twostep vce(robust)

I suspect there is a mistake in the way I have specified the iv and gmm brackets. Hence, the question: How do I solve this mistake? Or, in other words: How do I replicate the results of the xtabond2 command?

Thank you kindly in advance!

Sebastian Kripfganz

Join Date: May 2014

Posts: 2588
#43

28 May 2019, 10:57

First of all, you have specified the eq(both) option with xtabond2 while you only specified the GMM-type instruments for the transformed model with xtdpdgmm. Also, you specified the backward-orthogonal deviations with xtdpdgmm but did not specify the corresponding gmm() suboption orthogonal with xtabond2.

The bigger problem is that forward-orthogonal deviations are implemented in xtabond2 in a quite problematic way, see my comment in another Statalist topic and further up in this topic.

The following two specifications should yield identical results (besides the intercept, which already hints at the fundamental problem):

Code:

xtabond2 GDPgr l1.lnGDPpc rem FDindex intrac lncap lnhcap lnwpop lntrade lngov infl, ///
        gmm(l1.lnGDPpc rem FDindex intrac, orthogonal lag(1 3) eq(diff)) ///
        gmm(lncap lnhcap infl lntrade, orthogonal lag(2 3) eq(diff)) ///
        twostep robust orthogonal

xtdpdgmm GDPgr L1.lnGDPpc rem FDindex intrac lncap lnhcap lnwpop lntrade lngov infl, ///
        gmm(l1.lnGDPpc rem FDindex intrac, bodev lag(0 2) mo(fodev)) ///
        gmm(lncap lnhcap infl lntrade, bodev lag(1 2) mo(fodev)) ///
        twostep vce(robust)

If you then try to add time dummies or instruments for the model in levels, the results will no longer match as a consequence of the problematic implementation in xtabond2 (and as far as I can see there is no way to replicate the xtabond2 results). I call it a bug in xtabond2 and I would strongly advise not to use the xtabond2 command with forward-orthogonal deviations.

As another remark, lag 0 of lnGDPpc is endogenous and not a valid instrument. You can use lag 0 of L1.lnGDPpc or start from lag 1 of lnGDPpc. (With xtabond2, it is even worse. Given how it is implemented, the first lag would still be endogenous [because it is actually not the first lag but lag 0, to maximize the confusion].)

Last edited by Sebastian Kripfganz; 28 May 2019, 11:00.


Sebastian Kripfganz

Join Date: May 2014
Posts: 2588

#44

19 Jun 2019, 13:55

Another update to version 2.1.0 with a significant improvement is now available for installation from my website:

Code:

net install xtdpdgmm, from("http://www.kripfganz.de/stata/") replace

The postestimation command estat overid has the new option difference that reports Sargan-Hansen difference statistics for a subset of the moment conditions. Unlike the already existing possibility to compute these difference statistics by taking the simple difference between the Sargan-Hansen test statistics from two separately estimated models, those computed with the new option are guaranteed to be nonnegative because all estimates are based on the same weighting matrix as the full model.

(The two ways of computing the Sargan-Hansen difference tests are asymptotically equivalent. Aside from the fact that the new version is guaranteed to be nonnegative, it is not a priori clear which of the two versions has a better finite-sample performance.)

Importantly: You need to first specify the new option overid with the xtdpdgmm command. Otherwise, the overidentification test statistics needed by estat overid, difference are not computed.

Immediately after the estimation results, xtdpdgmm now displays a list of instruments used in the estimation. To hide this list, specify the new option nofootnote.

Here is an example:

Code:

. webuse psidextract

. xtdpdgmm L(0/1).lwage wks union fem ed blk, gmm(L.lwage, l(1 4) c m(d)) iv(wks union, d m(d)) iv(fem ed blk, m(l)) two vce(r) overid

Generalized method of moments estimation

Fitting full model:

Step 1         f(b) =  .00162045
Step 2         f(b) =  .03866292

Fitting reduced model 2:

Step 1         f(b) =  .00291706

Fitting reduced model 3:

Step 1         f(b) =  .03866292

Group variable: id                           Number of obs         =      3570
Time variable: t                             Number of groups      =       595

Moment conditions:     linear =      10      Obs per group:    min =         6
                    nonlinear =       0                        avg =         6
                        total =      10                        max =         6

                                   (Std. Err. adjusted for 595 clusters in id)
------------------------------------------------------------------------------
             |              WC-Robust
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       lwage |
         L1. |   .9915635    .017254    57.47   0.000     .9577463    1.025381
             |
         wks |   .0024652   .0023225     1.06   0.288    -.0020869    .0070173
       union |   .0396529   .0302294     1.31   0.190    -.0195956    .0989014
         fem |   .0022083   .0118778     0.19   0.853    -.0210718    .0254885
          ed |   .0043581   .0018224     2.39   0.017     .0007863    .0079299
         blk |    -.01385   .0092622    -1.50   0.135    -.0320035    .0043035
       _cons |   -.034514   .1624417    -0.21   0.832    -.3528939    .2838658
------------------------------------------------------------------------------
Instruments corresponding to the linear moment conditions:
 1, model(diff):
   L1.L.lwage L2.L.lwage L3.L.lwage L4.L.lwage
 2, model(diff):
   D.wks D.union
 3, model(level):
   fem ed blk
 4, model(level):
   _cons

. estat overid

Sargan-Hansen test of the overidentifying restrictions
H0: overidentifying restrictions are valid

2-step moment functions, 2-step weighting matrix       chi2(3)     =   23.0044
                                                       Prob > chi2 =    0.0000

2-step moment functions, 3-step weighting matrix       chi2(3)     =   22.7594
                                                       Prob > chi2 =    0.0000

. estat overid, difference

Sargan-Hansen (difference) test of the overidentifying restrictions
H0: (additional) overidentifying restrictions are valid

2-step weighting matrix from full model

                  | Excluding                   | Difference                  
Moment conditions |       chi2     df         p |        chi2     df         p
------------------+-----------------------------+-----------------------------
   1, model(diff) |          .     -1         . |           .      .         .
   2, model(diff) |     1.7356      1    0.1877 |     21.2688      2    0.0000
  3, model(level) |    23.0044      3    0.0000 |     -0.0000      0         .
      model(diff) |          .     -3         . |           .      .         .

estat overid without any option displays the Sargan-Hansen test for the full model, as in previous versions. estat overid with option difference shows the Sargan-Hansen tests for the reduced models when a subset of the moment conditions is excluded and the corresponding difference tests to the full model. The first column indicates the excluded group of instruments. The labels correspond to those listed directly below the regression output. The results can be read as follows:

When excluding the first set of instruments (in this case the GMM-type instruments for the first-differenced model), the model is no longer identified. There is 1 instrument too few compared to the number of parameters. Hence, no test statistics are computed.
When excluding the second set of instruments (the standard instruments for the first-differenced model), the reduced model still has 1 overidentifying restriction and the Sargan-Hansen test does not reject its validity (p-value 0.1877). The Sargan-Hansen difference test rejects the validity of the 2 additional moment restrictions imposed by these instruments (p-value 0.0000). In this example, this could be an indication that the implicit assumption of strict exogeneity for the variables wks and union is not justified.
When excluding the third set of instruments (the standard instruments for the level model), the Sargan-Hansen test rejects the validity of the remaining 3 overidentifying restrictions (p-value 0.0000). This is not surprising given the conclusion from the previous test. Notice that the test statistic for the reduced model (23.0044) and the degrees of freedom (3) are identical to those for the full model (as reported by estat overid without option difference) and therefore the difference statistic is exactly zero, despite the fact that we removed 3 instruments from the model. This is not a mistake. The reason is that after removing these 3 instruments the coefficients of the 3 time-invariant regressors fem, ed, and blk are no longer identified. These 3 instruments are essential for their identification and therefore do not create overidentifying restrictions.
The final row considers a model in which all instruments for the first-differenced model are excluded (i.e. both the first and the second set of instruments). Again, the reduced model is underidentified. There are 3 fewer instruments than coefficients.
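
The degrees of freedom in the table can be reproduced by simple counting, assuming they equal the number of remaining instruments minus the number of coefficients that are still identified. The full model has 10 instruments (4 + 2 + 3 + 1, including the constant) and 7 coefficients:

```latex
\begin{align*}
\text{full model:} \quad & 10 - 7 = 3 \\
\text{excluding set 1:} \quad & (10 - 4) - 7 = -1 \\
\text{excluding set 2:} \quad & (10 - 2) - 7 = 1 \\
\text{excluding set 3:} \quad & (10 - 3) - (7 - 3) = 3 \\
\text{excluding sets 1 and 2:} \quad & (10 - 6) - 7 = -3
\end{align*}
```

In the third row, the 3 coefficients of fem, ed, and blk drop out together with their 3 instruments, which is why the degrees of freedom stay at 3 and the difference test has 0 degrees of freedom.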

I have chosen this example on purpose because xtabond2 does not recognize the identification problem when excluding the instruments for the time-invariant regressors:

Code:

. xtabond2 L(0/1).lwage wks union fem ed blk, gmm(L.lwage, l(1 4) c eq(d)) iv(wks union, eq(d)) iv(fem ed blk, eq(l)) two r
Favoring speed over space. To switch, type or click on mata: mata set matafavor space, perm.

Dynamic panel-data estimation, two-step system GMM
------------------------------------------------------------------------------
Group variable: id                              Number of obs      =      3570
Time variable : t                               Number of groups   =       595
Number of instruments = 10                      Obs per group: min =         6
Wald chi2(6)  =  15253.47                                      avg =      6.00
Prob > chi2   =     0.000                                      max =         6
------------------------------------------------------------------------------
             |              Corrected
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       lwage |
         L1. |   .9915635    .017254    57.47   0.000     .9577463    1.025381
             |
         wks |   .0024652   .0023225     1.06   0.288    -.0020869    .0070173
       union |   .0396529   .0302294     1.31   0.190    -.0195956    .0989014
         fem |   .0022083   .0118778     0.19   0.853    -.0210718    .0254885
          ed |   .0043581   .0018224     2.39   0.017     .0007863    .0079299
         blk |    -.01385   .0092622    -1.50   0.135    -.0320035    .0043035
       _cons |   -.034514   .1624417    -0.21   0.832    -.3528939    .2838658
------------------------------------------------------------------------------
Instruments for first differences equation
  Standard
    D.(wks union)
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(1/4).L.lwage collapsed
Instruments for levels equation
  Standard
    fem ed blk
    _cons
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z =  -4.74  Pr > z =  0.000
Arellano-Bond test for AR(2) in first differences: z =   2.53  Pr > z =  0.011
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(3)    =  21.00  Prob > chi2 =  0.000
  (Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(3)    =  23.00  Prob > chi2 =  0.000
  (Robust, but weakened by many instruments.)

Difference-in-Hansen tests of exogeneity of instrument subsets:
  iv(wks union, eq(diff))
    Hansen test excluding group:     chi2(1)    =   1.74  Prob > chi2 =  0.188
    Difference (null H = exogenous): chi2(2)    =  21.27  Prob > chi2 =  0.000
  iv(fem ed blk, eq(level))
    Hansen test excluding group:     chi2(0)    =  23.00  Prob > chi2 =      .
    Difference (null H = exogenous): chi2(3)    =   0.00  Prob > chi2 =  1.000

The test statistics are identical but the degrees of freedom for the final "Hansen test excluding group" and the corresponding difference test are incorrect.

Further discussion about GMM estimation and overidentification tests in the presence of time-invariant regressors can be found in the following paper:

Kripfganz, S., and C. Schwarz (2019). Estimation of linear dynamic panel data models with time-invariant regressors. Journal of Applied Econometrics 34 (4), 526-546.

Last edited by Sebastian Kripfganz; 19 Jun 2019, 14:06.

Sebastian Kripfganz

Join Date: May 2014

Posts: 2588
#45

20 Jul 2019, 13:18

An update to version 2.1.1 of the xtdpdgmm package is now available on my website.

As a new feature, it now provides the option small to apply a small-sample degrees-of-freedom correction to the standard errors. The correction factor is the same that is used elsewhere in the Stata universe.

This version also fixes a bug when an if-condition was specified in the xtdpdgmm command line. On that front, notice that the behavior of xtdpdgmm differs from xtabond2 or xtdpd. When you restrict the sample, say by specifying if year >= 1978, then xtdpdgmm effectively also excludes 1978 from the first-differenced model. The reason is that the first-differenced dependent variable for 1978 is a function of the level dependent variable for 1978 and 1977, but the latter is excluded by the if-condition. xtabond2 and xtdpd, however, use the first-differenced model for the initial year 1978 in their calculations.

Also notice that restricting the sample with an if-condition in the command line is generally not equivalent to reducing the sample with the keep or drop command. In the first case, lags of the variables are formed before the if-condition is applied. In the latter case, the observations to form these lags are no longer available. This is true for xtdpdgmm, xtabond2, and xtdpd. Both ways of restricting the sample could be reasonable in applied work and you need to carefully think about which information shall be available for the estimation to decide on the appropriate approach.
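
The difference between the two approaches can be seen with Stata's time-series operators alone. The following sketch uses a tiny made-up panel (the variable y and all values are hypothetical) and does not call any estimation command, so it only illustrates how the lags are formed, not how a particular estimator treats the first period:

```stata
clear
input id year y
1 1976 1
1 1977 2
1 1978 3
1 1979 4
end
xtset id year

* if-condition: the lag for 1978 is still formed from the 1977 observation
count if year >= 1978 & !missing(L.y)
local n_if = r(N)

* keep: the 1977 observation is gone, so the lag for 1978 is missing
keep if year >= 1978
count if !missing(L.y)
local n_keep = r(N)

display "usable lags with if: `n_if', after keep: `n_keep'"
```

With the if-condition both 1978 and 1979 have a nonmissing lag, while after keep only 1979 does.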
