  • #16
    You can use the estat overid postestimation command directly after running your xtdpdgmm regression. This will provide you with Hansen's J test for all overidentifying restrictions. If you wish to obtain difference-in-Hansen tests, you will have to run regressions for the restricted and unrestricted models, store the estimation results from the first regression under a name and then use estat overid name after the second regression.

    As an aside: You are using the option gmmiv(varlist, difference) which creates first differences of the instruments for the level equation. If that is what you intend, please just ignore this comment. If instead you aim to obtain instruments for the first-differenced equation, the correct syntax would be gmmiv(varlist, model(difference)).
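    For illustration, the two variants would look as follows (just a schematic sketch; x stands in for your list of instrumented regressors):
    Code:
    * first differences of the instruments for the level model (what your current syntax requests)
    xtdpdgmm roa l.roa x, gmmiv(x, difference) twostep vce(robust)
    * untransformed instruments for the first-differenced model
    xtdpdgmm roa l.roa x, gmmiv(x, model(difference)) twostep vce(robust)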
    https://twitter.com/Kripfganz

    • #17
      Dear Sebastian, thank you for the observation. I have made the changes as follows:

      Code:
      xtdpdgmm roa l.roa indp cduality bdiversity1 lbsize acindp acfexpr ncindp ccindp lshare size leverage rdsales cexp netsalesgrw lage d2006-d2015 d3-d8, noserial gmmiv(roa indp cduality bdiversity1 lbsize acindp acfexpr ncindp ccindp lshare size leverage rdsales cexp netsalesgrw, lag(3 4) model(difference) collapse) iv(d2006-d2015 d3-d8 lage) twostep vce(robust)

      • #18
        Dear Sebastian, please what is the right syntax for the difference-in-Hansen tests for the restricted and unrestricted models? I have gone through the Stata manual by Christopher F. Baum, but I still cannot find the right command to use. For instance, I do not know whether this would be the right way for the restricted model:
        Code:
        ivregress 2sls roa size leverage rdsales lage cexp netsalesgrw i.year i.country i.siccode (indp cduality bdiversity1 lbsize acindp acfexpr ccindp ncindp lshare = l.indp l.cduality l.bdiversity1 l.lbsize l.acindp l.acfexpr l.ccindp l.ncindp l.lshare)
        Code:
        predict u2, resid

        • #19
          If you want to use the postestimation command estat overid after xtdpdgmm, you need to estimate both models with xtdpdgmm. You then need to decide which subset of instruments you want to exclude for the testing purposes. For example, if you want to test the validity of the overidentifying restrictions implied by the nonlinear moment conditions, you would run the estimation once with and once without the noserial option, store the first estimation results under a name and then run estat overid name after the second estimation. Here is an example:
          Code:
          . webuse abdata
          . xtdpdgmm L(0/1).n w k, gmmiv(L.n, c m(d)) iv(w k, d m(d)) twostep vce(robust) noserial
          . estimates store gmm1
          . xtdpdgmm L(0/1).n w k, gmmiv(L.n, c m(d)) iv(w k, d m(d)) twostep vce(robust)
          . estat overid gmm1
          https://twitter.com/Kripfganz

          • #20
            Dear Sebastian, thank you, I now understand it better. I have another question regarding how many lags I have to specify for the dynamic completeness of my model. For instance, I ran:
            Code:
            xi: reg roa l.roa l2.roa l3.roa l4.roa  size leverage rdsales lage cexp netsalesgrw i.year i.country, cluster (id) robust
            i.year            _Iyear_2004-2015    (naturally coded; _Iyear_2004 omitted)
            i.country         _Icountry_1-8       (naturally coded; _Icountry_1 omitted)
            note: _Iyear_2005 omitted because of collinearity
            note: _Iyear_2006 omitted because of collinearity
            note: _Iyear_2007 omitted because of collinearity
            note: _Iyear_2015 omitted because of collinearity
            
            Linear regression                               Number of obs     =      4,311
                                                            F(24, 662)        =     103.29
                                                            Prob > F          =     0.0000
                                                            R-squared         =     0.4832
                                                            Root MSE          =     5.4524
            
                                               (Std. Err. adjusted for 663 clusters in id)
            ------------------------------------------------------------------------------
                         |               Robust
                     roa |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                     roa |
                     L1. |    .461267   .0291797    15.81   0.000     .4039711     .518563
                     L2. |   .0822153   .0234362     3.51   0.000     .0361971    .1282335
                     L3. |   .0609813   .0211912     2.88   0.004     .0193712    .1025914
                     L4. |   .0490574   .0211257     2.32   0.021     .0075759    .0905389
                         |
                    size |   .4479199   .0790612     5.67   0.000     .2926789    .6031609
                leverage |  -.0328898   .0056083    -5.86   0.000     -.043902   -.0218777
                 rdsales |  -.0598592   .0154392    -3.88   0.000     -.090175   -.0295435
                    lage |   .0957258    .172054     0.56   0.578    -.2421115    .4335631
                    cexp |  -.0063025   .0258475    -0.24   0.807    -.0570554    .0444504
             netsalesgrw |   .1076629   .0097653    11.03   0.000     .0884882    .1268375
             _Iyear_2005 |          0  (omitted)
             _Iyear_2006 |          0  (omitted)
             _Iyear_2007 |          0  (omitted)
             _Iyear_2008 |  -1.436325   .4090706    -3.51   0.000    -2.239557   -.6330932
             _Iyear_2009 |   .4156977   .3655749     1.14   0.256    -.3021283    1.133524
             _Iyear_2010 |   1.649389   .3221515     5.12   0.000     1.016828    2.281951
             _Iyear_2011 |   .6066941   .3311038     1.83   0.067    -.0434461    1.256834
             _Iyear_2012 |   .7450445   .3074329     2.42   0.016     .1413834    1.348706
             _Iyear_2013 |   .8284794   .3015524     2.75   0.006      .236365    1.420594
             _Iyear_2014 |   .9488653   .2828824     3.35   0.001     .3934104     1.50432
             _Iyear_2015 |          0  (omitted)
             _Icountry_2 |  -.1262503    .296474    -0.43   0.670    -.7083931    .4558925
             _Icountry_3 |  -1.051163   .3102917    -3.39   0.001    -1.660438   -.4418889
             _Icountry_4 |  -.1288008   .5198947    -0.25   0.804    -1.149642    .8920404
             _Icountry_5 |  -1.456031   .2511707    -5.80   0.000    -1.949218   -.9628439
             _Icountry_6 |  -1.326935   .4667966    -2.84   0.005    -2.243515   -.4103542
             _Icountry_7 |    -.11488    1.02714    -0.11   0.911    -2.131724    1.901964
             _Icountry_8 |  -.5622005   .3524436    -1.60   0.111    -1.254243    .1298416
                   _cons |  -1.615194   .9655349    -1.67   0.095    -3.511074    .2806858
            My question is: does that mean I have to specify the first four lags of the dependent variable for dynamic completeness and run xtdpdgmm as follows?

            Code:
             
             xtdpdgmm l(0/4).roa indp cduality bdiversity1 lbsize acindp acfexpr ncindp ccindp lshare size leverage rdsales cexp netsalesgrw lage d2006-d2015 d3-d8, noserial gmmiv(roa indp cduality bdiversity1 lbsize acindp acfexpr ncindp ccindp lshare size leverage rdsales cexp netsalesgrw, lag(4 .) model(difference) collapse) iv(d2006-d2015 d3-d8 lage) twostep vce(robust)

            • #21
              What you want to achieve is that the errors of your regression are serially uncorrelated. After running an xtdpdgmm regression, you can use the Arellano-Bond test for serial correlation in the first-differenced residuals. If you are rejecting the null hypothesis of no serial correlation for order 2 or higher, this indicates that your model is not dynamically complete. This test is obtained with the postestimation command estat serial. For details, please see
              Code:
              help xtdpdgmm postestimation
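              For instance, after an estimation like the one shown in #19, the test is obtained as follows (a minimal sketch using the abdata example from that post):
              Code:
              . webuse abdata
              . xtdpdgmm L(0/1).n w k, gmmiv(L.n, c m(d)) iv(w k, d m(d)) twostep vce(robust)
              . estat serial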
              https://twitter.com/Kripfganz

              • #22
                Yes, I understand what estat serial does. Regarding the dynamic completeness I am asking about, I am referring to the seminal work of Wintoki et al. (2012): Endogeneity and the dynamics of internal corporate governance. http://www.sciencedirect.com/science...04405X12000426

                On page 593, section 5.1, they discuss how many lags of performance are needed to ensure dynamic completeness of the model: they regressed performance on its own lags and the control variables and determined that, in their case, two lags of performance were relevant, as shown in Table 4 on page 594. They therefore estimated their model using:
                Code:
                xi: xtabond2 roa l.roa l2.roa..........etc
                --- two lags of performance to control for dynamic completeness, and in the GMM-style instruments they use lag(3 4)

                My question from before is: should I include four lags in my model as well for dynamic completeness, because in my initial table lag four is still significant? It was only at lag 5 that roa became insignificant.


                Code:
                 xi: reg roa l.roa l2.roa l3.roa l4.roa l5.roa  size leverage rdsales lage cexp netsalesgrw i.year i.country, cluster (id) robust
                i.year            _Iyear_2004-2015    (naturally coded; _Iyear_2004 omitted)
                i.country         _Icountry_1-8       (naturally coded; _Icountry_1 omitted)
                note: _Iyear_2005 omitted because of collinearity
                note: _Iyear_2006 omitted because of collinearity
                note: _Iyear_2007 omitted because of collinearity
                note: _Iyear_2008 omitted because of collinearity
                note: _Iyear_2013 omitted because of collinearity
                
                Linear regression                               Number of obs     =      3,703
                                                                F(24, 656)        =      77.11
                                                                Prob > F          =     0.0000
                                                                R-squared         =     0.5086
                                                                Root MSE          =     5.0871
                
                                                   (Std. Err. adjusted for 657 clusters in id)
                ------------------------------------------------------------------------------
                             |               Robust
                         roa |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                         roa |
                         L1. |   .4559014   .0290334    15.70   0.000     .3988918    .5129111
                         L2. |   .0741202   .0232036     3.19   0.001     .0285578    .1196825
                         L3. |    .067437   .0202817     3.33   0.001     .0276122    .1072618
                         L4. |   .0504379   .0196522     2.57   0.010     .0118491    .0890267
                         L5. |   .0148837   .0184981     0.80   0.421    -.0214389    .0512062
                             |
                        size |   .3857348   .0817339     4.72   0.000     .2252431    .5462265
                    leverage |  -.0258838   .0059378    -4.36   0.000    -.0375431   -.0142244
                     rdsales |  -.0461942   .0176924    -2.61   0.009    -.0809347   -.0114537
                        lage |   .1694536   .1843043     0.92   0.358     -.192444    .5313511
                        cexp |  -.0155643   .0285172    -0.55   0.585    -.0715603    .0404318
                 netsalesgrw |   .1058518   .0102332    10.34   0.000      .085758    .1259456
                 _Iyear_2005 |          0  (omitted)
                 _Iyear_2006 |          0  (omitted)
                 _Iyear_2007 |          0  (omitted)
                 _Iyear_2008 |          0  (omitted)
                 _Iyear_2009 |   -.480308   .3501509    -1.37   0.171     -1.16786    .2072436
                 _Iyear_2010 |   .7967455   .3160692     2.52   0.012     .1761161    1.417375
                 _Iyear_2011 |  -.2369184   .3192528    -0.74   0.458    -.8637989    .3899621
                 _Iyear_2012 |  -.0985054   .3259502    -0.30   0.763    -.7385369    .5415262
                 _Iyear_2013 |          0  (omitted)
                 _Iyear_2014 |   .1176404   .3000777     0.39   0.695    -.4715881    .7068689
                 _Iyear_2015 |  -.8743362   .2996419    -2.92   0.004    -1.462709   -.2859633
                 _Icountry_2 |  -.3152732   .2915884    -1.08   0.280    -.8878324    .2572859
                 _Icountry_3 |  -1.467921   .3030947    -4.84   0.000    -2.063074   -.8727683
                 _Icountry_4 |   -.378966   .5387304    -0.70   0.482     -1.43681     .678878
                 _Icountry_5 |  -1.565008   .2970455    -5.27   0.000    -2.148282   -.9817331
                 _Icountry_6 |  -1.251977   .4520147    -2.77   0.006    -2.139547   -.3644069
                 _Icountry_7 |  -.0683118   1.143688    -0.06   0.952    -2.314042    2.177418
                 _Icountry_8 |  -.3626713   .4779429    -0.76   0.448    -1.301154    .5758111
                       _cons |  -.7178755   .9767178    -0.73   0.463    -2.635746    1.199995
                ------------------------------------------------------------------------------
                
                So should my model be
                Code:
                 
                 xtdpdgmm l(0/4).roa indp cduality bdiversity1 lbsize acindp acfexpr ncindp ccindp lshare size leverage rdsales cexp netsalesgrw lage d2006-d2015 d3-d8, noserial gmmiv(roa indp cduality bdiversity1 lbsize acindp acfexpr ncindp ccindp lshare size leverage rdsales cexp netsalesgrw, lag(5 .) model(difference) collapse) iv(d2006-d2015 d3-d8 lage) twostep vce(robust)
                or

                Code:
                 
                 xtdpdgmm roa l.roa l2.roa l3.roa l4.roa indp cduality bdiversity1 lbsize acindp acfexpr ncindp ccindp lshare size leverage rdsales cexp netsalesgrw lage d2006-d2015 d3-d8, noserial gmmiv(roa indp cduality bdiversity1 lbsize acindp acfexpr ncindp ccindp lshare size leverage rdsales cexp netsalesgrw, lag(5 .) model(difference) collapse) iv(d2006-d2015 d3-d8 lage) twostep vce(robust)

                • #23
                  If your fourth lag in the GMM specification is statistically significant, then you probably want to keep it. This consideration is related to testing for serial correlation of the errors, because dropping such a significant lag is likely to create serially correlated errors.
                  https://twitter.com/Kripfganz

                  • #24
                    Hi,
                    I am trying to translate the syntax of an xtabond2 call into the syntax for xtdpdgmm. This is what I have for xtabond2:

                    Code:
                    xtabond2 n L.n L(0/1).(w k) yr*, gmm(L.(n w k)) iv(yr*, equation(level)) robust orthogonal
                    This is my best attempt at xtdpdgmm at the moment:

                    Code:
                    xtdpdgmm n L.n L(0/1).(w k) yr*, gmmiv(L.(n w k)) iv(yr*, equation(level)) vce(robust)
                    What I am not sure about is how to convey to xtdpdgmm the desire for forward-orthogonal deviations (option "orthogonal") and the use of year dummies only in the level equation ("equation(level)"). Do you have any advice?

                    Thank you!
                    Michael

                    • #25
                      Regarding your first question: It is not possible (to date) to request forward-orthogonal deviations with xtdpdgmm.

                      To answer your second question: To specify instruments for the level equation/model with xtdpdgmm, you can use the suboption model(level):
                      Code:
                      xtdpdgmm n L.n L(0/1).(w k) yr*, gmmiv(L.(n w k)) iv(yr*, model(level)) vce(robust)
                      Note that this is actually the default. Thus, your GMM-type instruments are also specified only for the model in levels (and these instruments are not automatically transformed into first differences). To obtain GMM-type instruments both for the model in levels and in first differences, you need to specify them separately as follows (and you also need to specify explicitly that you want to transform the instruments for the level equation with the difference suboption and that you want to restrict the lag length with the suboption lagrange()):
                      Code:
                      xtdpdgmm n L.n L(0/1).(w k) yr*, gmmiv(L.(n w k), model(difference)) gmmiv(L.(n w k), difference lagrange(0 0) model(level)) iv(yr*, model(level)) vce(robust)
                      Unfortunately, if you run this specification, you will notice that the optimization procedure does not converge. The problem is that the time dummies in a dynamic model create a perfect collinearity that is not flagged in the current version of the program. For the time being, you would need to specify the time dummies separately (excluding the first two of them to avoid the collinearity problem):
                      Code:
                      xtdpdgmm n L.n L(0/1).(w k) yr1978-yr1984, gmmiv(L.(n w k), model(difference)) gmmiv(L.(n w k), difference lagrange(0 0) model(level)) iv(yr1978-yr1984, model(level)) vce(robust)
                      (If you do not need the nonlinear moment conditions, the problem with the time dummies can be avoided with my program xtseqreg. The syntax is the same as for xtdpdgmm, and it even allows a teffects option to automatically add the correct number of time dummies with the correct instruments.)
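                      For the linear moment conditions, a rough xtseqreg analogue of the specification above might then look like this (a sketch only, assuming the gmmiv() suboptions carry over from xtdpdgmm exactly as described and that teffects replaces the manually specified year dummies):
                      Code:
                      . xtseqreg n L.n L(0/1).(w k), gmmiv(L.(n w k), model(difference)) gmmiv(L.(n w k), difference lagrange(0 0) model(level)) teffects vce(robust)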

                      A further warning: If you use xtabond2 with time dummies and some of them are omitted, then the reported p-values of the overidentification tests are invalid because xtabond2 computes the degrees of freedom incorrectly in this situation. You can obtain correct degrees of freedom with the postestimation command estat overid after either xtdpdgmm or xtseqreg. (Further note that the overidentification tests after the one-step estimator are computed differently in xtabond2 compared to xtdpdgmm / xtseqreg, which explains the small differences in the value of the test statistic. After two-step estimation, the test statistics are the same.)
                      https://twitter.com/Kripfganz

                      • #26
                        An update to version 1.0.2 of the xtdpdgmm command is available on my website. This update has the following new functionality:
                        1. The new option teffects automatically adds the correct number of time dummies and the respective instruments to the model. This avoids the collinearity / convergence problems that have been reported earlier if the user specified too many time dummies manually.
                        2. The new postestimation command estat mmsc reports Andrews and Lu (2001) model and moment selection criteria. These can be used, inter alia, to decide on the "optimal" number of lags for the GMM-type instruments.
                        Here is an example with time effects, where I use the estat mmsc command to compare two models with 4 and 5 lags of instruments for the lagged dependent variable, respectively:
                        Code:
                        . webuse abdata
                        
                        . xtdpdgmm L(0/1).n w k, teffects gmmiv(L.n, c lag(1 4) m(d)) iv(w k, d m(d)) twostep vce(robust) nolog
                        
                        Generalized method of moments estimation
                        
                        Group variable: id                           Number of obs         =       891
                        Time variable: year                          Number of groups      =       140
                        
                        Moment conditions:     linear =      14      Obs per group:    min =         6
                                            nonlinear =       0                        avg =  6.364286
                                                total =      14                        max =         8
                        
                                                             (Std. Err. adjusted for clustering on id)
                        ------------------------------------------------------------------------------
                                     |              WC-Robust
                                   n |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                        -------------+----------------------------------------------------------------
                                   n |
                                 L1. |   .4629979    .301439     1.54   0.125    -.1278116    1.053807
                                     |
                                   w |  -.3536815    .258456    -1.37   0.171     -.860246    .1528829
                                   k |   .2875594   .0816562     3.52   0.000     .1275162    .4476026
                                     |
                                year |
                               1978  |  -.0408135   .0230684    -1.77   0.077    -.0860268    .0043998
                               1979  |  -.0474038   .0236161    -2.01   0.045    -.0936906    -.001117
                               1980  |  -.0794993   .0217309    -3.66   0.000    -.1220912   -.0369074
                               1981  |  -.1353418   .0308362    -4.39   0.000    -.1957797   -.0749038
                               1982  |  -.1333819   .0610986    -2.18   0.029     -.253133   -.0136309
                               1983  |  -.1528373   .1059261    -1.44   0.149    -.3604486     .054774
                               1984  |  -.2980492   .2118981    -1.41   0.160    -.7133618    .1172634
                                     |
                               _cons |   1.870866   .6417907     2.92   0.004     .6129798    3.128753
                        ------------------------------------------------------------------------------
                        
                        . estimates store gmm1
                        
                        . xtdpdgmm L(0/1).n w k, teffects gmmiv(L.n, c lag(1 5) m(d)) iv(w k, d m(d)) twostep vce(robust) nolog
                        
                        Generalized method of moments estimation
                        
                        Group variable: id                           Number of obs         =       891
                        Time variable: year                          Number of groups      =       140
                        
                        Moment conditions:     linear =      15      Obs per group:    min =         6
                                            nonlinear =       0                        avg =  6.364286
                                                total =      15                        max =         8
                        
                                                             (Std. Err. adjusted for clustering on id)
                        ------------------------------------------------------------------------------
                                     |              WC-Robust
                                   n |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                        -------------+----------------------------------------------------------------
                                   n |
                                 L1. |   .7321746   .2165172     3.38   0.001     .3078087     1.15654
                                     |
                                   w |  -.5162636   .2129929    -2.42   0.015    -.9337219   -.0988053
                                   k |   .2273219   .0716357     3.17   0.002     .0869185    .3677254
                                     |
                                year |
                               1978  |  -.0249589   .0255414    -0.98   0.328    -.0750191    .0251013
                               1979  |  -.0330149   .0261475    -1.26   0.207     -.084263    .0182332
                               1980  |  -.0703346    .024821    -2.83   0.005    -.1189829   -.0216863
                               1981  |  -.1198316   .0334661    -3.58   0.000    -.1854241   -.0542392
                               1982  |  -.0818496   .0521411    -1.57   0.116    -.1840443     .020345
                               1983  |  -.0551394   .0831114    -0.66   0.507    -.2180347    .1077559
                               1984  |  -.0994415   .1432455    -0.69   0.488    -.3801976    .1813147
                                     |
                               _cons |   2.032031   .5757399     3.53   0.000     .9036018    3.160461
                        ------------------------------------------------------------------------------
                        
                        . estat mmsc gmm1
                        
                        Andrews-Lu model and moment selection criteria
                        
                               Model | ngroups          J  nmom  npar   MMSC-AIC   MMSC-BIC  MMSC-HQIC
                        -------------+----------------------------------------------------------------
                                   . |     140     3.6694    15    11    -4.3306   -16.0972    -9.2400
                                gmm1 |     140     1.6436    14    11    -4.3564   -13.1813    -8.0385
                        The output of the estat mmsc command is similar to that of the familiar estat ic command. The "optimal" model is the one with the lowest value of the respective criterion. Here, the MMSC-AIC (Akaike) criterion suggests the first model with 4 lags, while the MMSC-BIC (Bayesian) and the MMSC-HQIC (Hannan-Quinn) recommend the second model with 5 lags (listed first in the table, in the row starting with the dot).

                        ngroups denotes the number of groups (which should be the same across models for a meaningful comparison), nmom the number of moment conditions, npar the number of parameters, and J the Hansen J-statistic.

                        Specifying estat mmsc without an estimation name displays the criteria for the immediately preceding estimation results only. More than two models can be compared by simply adding further names of stored estimation results.
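                        For instance, to compare three candidate lag specifications in a single call (a sketch, assuming the 5-lag model above has been stored under the hypothetical name gmm2 and a third specification with 6 lags is estimated last):
                        Code:
                        . estimates store gmm2
                        . xtdpdgmm L(0/1).n w k, teffects gmmiv(L.n, c lag(1 6) m(d)) iv(w k, d m(d)) twostep vce(robust) nolog
                        . estat mmsc gmm1 gmm2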

                        Reference:
                        • Andrews, D. W. K., and B. Lu. (2001). Consistent model and moment selection procedures for GMM estimation with application to dynamic panel data models. Journal of Econometrics 101, 123-164.
                        https://twitter.com/Kripfganz

                        • #27
                          A substantially updated version of xtdpdgmm is now available on SSC and my personal website:
                          Code:
                          adoupdate xtdpdgmm, update
                          The new version 1.1.0 has the following major improvements:
                          1. Forward-orthogonal deviations are now supported. You can specify GMM-style or standard instruments for the model transformed with forward-orthogonal deviations by using the new suboption model(fodev). This transformation has been proposed by Arellano and Bover (1995) in the context of dynamic panel data models. In combination with this transformation, backward-orthogonal deviations of the instruments can be specified with the new suboption bodev. This has been suggested by Hayakawa (2009).
                          2. Similarly, deviations from within-group means are now supported. The respective new suboption is model(mdev).
                          3. The nonlinear (and linear) moment conditions suggested by Ahn and Schmidt (1995) under absence of serial correlation and homoskedasticity are now implemented. They can be specified with the new option iid (in contrast to the already existing option noserial that does not assume homoskedasticity).
                          4. The nonlinear moment conditions can now be collapsed into a single moment condition (similar to the collapsing of GMM-style instruments) with the new option collapse. This global option implies a collapsing of the GMM-style instruments as well unless the new suboption nocollapse is specified for the latter.
                          5. Cluster-robust standard errors at a different level than the panel identifier are now supported with the new option vce(cluster ...).
                          6. The lagrange(# #) suboption for the specification of lags of the instruments is now also available for the iv() option. The default lag specification depends on the model transformation and the instrument style. See the help file for details.
                          7. If all moment conditions are linear, it is now possible to speed up the estimation by using the analytical solutions with the new option analytic, instead of minimizing the GMM criterion function numerically.
                          Here is an example command line for an Arellano and Bover (1995) GMM estimation with forward-orthogonal deviations:
                          Code:
                          . webuse abdata
                          . xtdpdgmm L(0/1).n w k, gmmiv(L.n w k, lagrange(1 4) collapse model(fodev)) twostep vce(robust) analytic
                          The Hayakawa (2009) IV estimator could be implemented as follows:
                          Code:
                          . xtdpdgmm L(0/1).n w k, iv(L.n w k, bodev m(fodev)) vce(robust) analytic
                          Under the assumption of serially uncorrelated and homoskedastic errors, the Ahn and Schmidt (1995) GMM estimator with collapsed moment conditions is obtained in the following way:
                          Code:
                          . xtdpdgmm L(0/1).n w k, gmmiv(L.n w k, lagrange(1 4) model(difference)) iid collapse twostep vce(robust)
                          Further examples can be found in the help files.
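                          As a further illustration, cluster-robust standard errors at a level coarser than the panel identifier (the new vce(cluster ...) option) could be requested as follows (a sketch, assuming the industry identifier ind in the abdata dataset is a suitable cluster variable):
                          Code:
                          . xtdpdgmm L(0/1).n w k, gmmiv(L.n w k, lagrange(1 4) collapse model(fodev)) twostep vce(cluster ind)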

                          I shall make some important comments about the replicability of the xtdpdgmm results with the community-contributed xtabond2 command. The latter has an option for forward-orthogonal deviations (and backward-orthogonal deviations of GMM-style instruments) as well. In a simple case, the following two specifications yield identical results:
                          Code:
                          . xtdpdgmm L(0/1).n w k, gmmiv(L.n w k, bodev model(fodev)) twostep vce(robust)
                          . xtabond2 L(0/1).n w k, gmm(L.n w k, orthogonal eq(diff)) orthogonal twostep robust
                          Notice that the xtdpdgmm default of the lagrange(# #) option for GMM-style instruments with forward-orthogonal deviations is lagrange(0 .), while it is laglimits(1 .) for xtabond2 (where the dot means infinity). Despite the different initial lag, the results are the same. The reason can be found in the xtabond2 help file:
                          the software stores the orthogonal deviation of an observation one period late, so that, as with differencing, observations for period 1 are missing and, for an instrumenting variable w, w_i,t-1 enters the formula for the transformed observation stored at i,t. With this move, exactly the same lags of variables are valid as instruments under the two transformations.
                          This leads to a mismatch of time periods between the instruments and the dependent / independent variables. It works fine as long as we are using the default lag specifications as above. Suppose we want to limit the lags of the instruments to range from 0 to 2. (As opposed to first differencing, the contemporaneous lag of a predetermined variable remains a valid instrument under forward-orthogonal deviations.) With xtdpdgmm, this is achieved in a straightforward way by specifying the suboption lagrange(0 2). With xtabond2, as a consequence of the above quote from the help file, the necessary option would be laglimits(1 3) to achieve the same results:
                          Code:
                          . xtdpdgmm L(0/1).n w k, gmmiv(L.n w k, bodev lagrange(0 2) model(fodev)) twostep vce(robust)
                          . xtabond2 L(0/1).n w k, gmm(L.n w k, orthogonal laglimits(1 3) eq(diff)) orthogonal twostep robust
                          While this follows in principle from the documentation in the xtabond2 help file, it is not very intuitive and I would not be surprised if many users intuitively but incorrectly specified laglimits(0 2). This would actually start with the first lead of the variable, which is an invalid instrument! Here is another example with standard instruments:
                          Code:
                          . xtdpdgmm L(0/1).n w k, iv(L.n w k, model(fodev)) vce(robust)
                          . xtabond2 L(0/1).n w k, iv(L2.n L.w L.k, passthru eq(diff)) orthogonal robust
                          In order to achieve the correct instrument specification, with xtabond2 you would actually have to specify lags of the variables (i.e. the second lag of the lagged dependent variable) despite the fact that you want to use them in their contemporaneous form. (As an aside: xtabond2 does not allow for backward-orthogonal transformations of standard instruments while xtdpdgmm does; see the Hayakawa (2009) IV estimator mentioned earlier.)
                          To conclude: Be aware of this problem if you plan to use the xtabond2 command with forward-orthogonal deviations. The xtdpdgmm command provides an intuitive alternative following the principle: What you type is what you get!

                          You should also keep in mind that unlike with first differencing, taking lags of backward-orthogonal deviations is not the same as backward-orthogonal deviations of lags because of different sample size restrictions. The following specifications yield different results (equivalently if you replicate them with xtabond2):
                          Code:
                          . xtdpdgmm L(0/1).n w k, gmmiv(L.n w k, bodev lagrange(0 2) model(fodev)) twostep vce(robust)
                          . xtdpdgmm L(0/1).n w k, gmmiv(L2.n L.w L.k, bodev lagrange(-1 1) model(fodev)) twostep vce(robust)
                          . xtdpdgmm L(0/1).n w k, gmmiv(n F.w F.k, bodev lagrange(1 3) model(fodev)) twostep vce(robust)
                          While in principle all three specifications are valid, to implement the Hayakawa (2009) estimator you need to specify the first version, i.e. specify the variables in the instrument list the same way as they appear in the list of the independent variables.

                          References:
                          • Ahn, S. C., and P. Schmidt (1995). Efficient estimation of models for dynamic panel data. Journal of Econometrics 68, 5-27.
                          • Arellano, M., and O. Bover (1995). Another look at the instrumental variable estimation of error-components models. Journal of Econometrics 68, 29-51.
                          • Hayakawa, K. (2009). A simple efficient instrumental variable estimator for panel AR(p) models when both N and T are large. Econometric Theory 25, 873-890.
                          https://twitter.com/Kripfganz

                          • #28
                            Thanks to Kit Baum, another update to version 1.1.1 is now available on SSC (and on my personal website). This update substantially benefits from discussions with Mark Schaffer at the London Stata Conference this week.

                            The update fixes a bug regarding the default lag selection under some model specifications. There was no problem in previous versions if the lag range was explicitly specified with the lagrange() option (not calling for the default by using missing values).

                            In addition, the postestimation command predict now has the new option iv which generates the instruments used in the estimation (associated with linear moment conditions and excluding the constant term) as new Stata variables. All of these instruments are transformed in an appropriate way such that they become instruments for the model in levels. For example:
                            Code:
                            . webuse abdata
                            
                            . xtdpdgmm L(0/1).n w k, gmmiv(L.n w k, lagrange(1 4) collapse model(difference)) iv(L.n w k, difference) twostep vce(robust) analytic
                            
                            Generalized method of moments estimation
                            
                            Group variable: id                           Number of obs         =       891
                            Time variable: year                          Number of groups      =       140
                            
                            Moment conditions:     linear =      16      Obs per group:    min =         6
                                                nonlinear =       0                        avg =  6.364286
                                                    total =      16                        max =         8
                            
                                                               (Std. Err. adjusted for 140 clusters in id)
                            ------------------------------------------------------------------------------
                                         |              WC-Robust
                                       n |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                            -------------+----------------------------------------------------------------
                                       n |
                                     L1. |   .4695086   .1167909     4.02   0.000     .2406028    .6984145
                                         |
                                       w |  -1.287266   .2672931    -4.82   0.000    -1.811151   -.7633816
                                       k |   .2172426   .0913599     2.38   0.017     .0381805    .3963046
                                   _cons |   4.633068   .8721846     5.31   0.000     2.923618    6.342518
                            ------------------------------------------------------------------------------
                            
                            . quietly predict double iv*, iv
                            
                            . xtdpdgmm L(0/1).n w k, iv(iv*) twostep vce(robust) analytic
                            
                            Generalized method of moments estimation
                            
                            Group variable: id                           Number of obs         =       891
                            Time variable: year                          Number of groups      =       140
                            
                            Moment conditions:     linear =      16      Obs per group:    min =         6
                                                nonlinear =       0                        avg =  6.364286
                                                    total =      16                        max =         8
                            
                                                               (Std. Err. adjusted for 140 clusters in id)
                            ------------------------------------------------------------------------------
                                         |              WC-Robust
                                       n |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                            -------------+----------------------------------------------------------------
                                       n |
                                     L1. |   .4695086   .1167909     4.02   0.000     .2406028    .6984145
                                         |
                                       w |  -1.287266   .2672931    -4.82   0.000    -1.811151   -.7633816
                                       k |   .2172426   .0913599     2.38   0.017     .0381805    .3963046
                                   _cons |   4.633068   .8721846     5.31   0.000     2.923618    6.342518
                            ------------------------------------------------------------------------------
                            Notice that the initial estimation had GMM-type instruments specified for the first-differenced model and standard instruments for the level model, while the second estimation yields identical results when all the generated instruments are just specified as standard instruments for the level model. This equivalence generally requires that wmatrix(unadjusted) is used for the first-step weighting matrix, which is the default and usually recommended.
                            https://twitter.com/Kripfganz

                            • #29
                              Dear Sebastian Kripfganz,

                              I want to run a system GMM, i.e., include the first lag (or deeper lags) of the dependent variable. Does this lag, or the dependent variable itself, have to appear in the gmmstyle specification? Also, if I do not include it in the gmm specification, does its lag have to appear in the ivstyle specification?
                              Thank you.

                              • #30
                                From a technical and theoretical point of view, it is not mandatory to include the lagged dependent variable in either the gmmiv() or the iv() specification. All you need is a sufficient number of instruments that are sufficiently correlated with the regressors. That said, in almost all empirical applications you would actually include the lagged dependent variable in the set of instruments. Examples are given earlier in this topic and in the xtdpdgmm help file. The command does not automatically create those instruments for you; you always need to specify them in the way you want to include them.
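                                For concreteness, a typical specification that does include instruments derived from the lagged dependent variable might look like this (a sketch only, using the abdata example from earlier in this topic; the lag ranges are purely illustrative):
                                Code:
                                . webuse abdata
                                . xtdpdgmm L(0/1).n w k, gmmiv(L.n, lagrange(1 4) model(difference)) gmmiv(L.n, difference lagrange(0 0) model(level)) iv(w k, difference model(difference)) teffects twostep vce(robust)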
                                https://twitter.com/Kripfganz
