
  • Filip Novinc
    replied
    Dear Professor Kripfganz,
    Thank you for all the information provided here. I have thoroughly read all the posts in the xtdpdgmm thread and have studied dynamic panel data estimation in depth for my doctoral dissertation, but I still haven't figured out a couple of questions.
    I have a panel of N=500 firms from the Croatian manufacturing industry (T=12) and am trying to estimate the impact of unit labour costs (lnulc1) on exports (lnexport3). The other regressors are lagged exports (l.lnexport3), material costs (lnumc), differenced fixed capital assets (dlnreal_K), workers employed (lnL), and intangible fixed assets (lnrealintangible_K); the ln prefix means the variable is in natural logarithms. I assume they are all endogenous: there might be unobserved firm characteristics, such as the quality of management or ownership, that affect the behaviour of firms and their exports. I have several questions about my analysis and about xtdpdgmm in general:
    1. The correlation between (some) regressors and their differences is pretty small. For example, for the first available instrument when lnulc1 is treated as endogenous (this applies to longer lags as well): Corr(d.lnulc1, l2.lnulc1) = -0.1385; Corr(lnulc1, d.lnulc1) = 0.1241. Could I have a weak-instruments problem even if the difference-in-Hansen test is fine? Or, since all instruments jointly instrument all the regressors, do they lead to better estimates, so that I need not worry about these correlations?
    2. If the difference-in-Hansen test is unsatisfactory (p < 0.1) for only one variable, for example in levels, does that mean all the estimated coefficients are biased and inconsistent?
    3. Could you please help with the interpretation of the difference-in-Hansen test?
      Code:
      . xtdpdgmm lnexport3 l.lnexport3 lnulc1 lnL dlnreal_K lnumc l.lnrealintangible_K, ///
      >     model(diff) collapse ///
      >     gmm(lnexport3, lag(2 3)) ///
      >     gmm(lnulc1, lag(2 3)) ///
      >     gmm(lnL, lag(2 3)) ///
      >     gmm(dlnreal_K, lag(1 2)) ///
      >     gmm(lnumc, lag(1 2)) ///
      >     gmm(lnrealintangible_K, lag(1 2)) ///
      >     gmm(lnexport3, lag(1 1) diff model(level)) ///
      >     gmm(lnulc1, lag(1 1) diff model(level)) ///
      >     gmm(lnL, lag(1 1) diff model(level)) ///
      >     gmm(dlnreal_K, lag(0 0) diff model(level)) ///
      >     gmm(lnumc, lag(0 0) diff model(level)) ///
      >     gmm(lnrealintangible_K, lag(0 0) diff model(level)) ///
      >     teffects twostep vce(robust, dc) small overid
      	
      	Generalized method of moments estimation
      	
      	Fitting full model:
      	Step 1         f(b) =  .01217351
      	Step 2         f(b) =  .02229234
      	
      	Fitting reduced model 1:
      	Step 1         f(b) =  .01825629
      	
      	Fitting reduced model 2:
      	Step 1         f(b) =  .01806326
      	
      	Fitting reduced model 3:
      	Step 1         f(b) =  .02210642
      	
      	Fitting reduced model 4:
      	Step 1         f(b) =  .02227238
      	
      	Fitting reduced model 5:
      	Step 1         f(b) =  .01835181
      	
      	Fitting reduced model 6:
      	Step 1         f(b) =  .01389141
      	
      	Fitting reduced model 7:
      	Step 1         f(b) =  .02192478
      	
      	Fitting reduced model 8:
      	Step 1         f(b) =  .02228061
      	
      	Fitting reduced model 9:
      	Step 1         f(b) =   .0193838
      	
      	Fitting reduced model 10:
      	Step 1         f(b) =  .02228957
      	
      	Fitting reduced model 11:
      	Step 1         f(b) =  .01851922
      	
      	Fitting reduced model 12:
      	Step 1         f(b) =  .02220404
      	
      	Fitting reduced model 13:
      	Step 1         f(b) =  .00270867
      	
      	Group variable: id                           Number of obs         =      4402
      	Time variable: year                          Number of groups      =       529
      	
      	Moment conditions:     linear =      29      Obs per group:    min =         1
      	                   nonlinear =       0                        avg =  8.321361
      	                       total =      29                        max =        11
      	
      	                                        (Std. Err. adjusted for 529 clusters in id)
      	------------------------------------------------------------------------------------
      	                  |              DC-Robust
      	        lnexport3 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      	-------------------+----------------------------------------------------------------
      	        lnexport3 |
      	              L1. |   .5724923   .0602836     9.50   0.000     .4540672    .6909175
      	                  |
      	           lnulc1 |  -.6139368   .1540915    -3.98   0.000    -.9166446   -.3112291
      	              lnL |   .7445906   .1702559     4.37   0.000     .4101284    1.079053
      	        dlnreal_K |   .0816588   .0346772     2.35   0.019     .0135365     .149781
      	            lnumc |  -.4469151   .1617449    -2.76   0.006    -.7646577   -.1291726
      	                  |
      	lnrealintangible_K |
      	              L1. |  -.0093538   .0119297    -0.78   0.433    -.0327893    .0140817
      	                  |
      	             year |
      	            2010  |   .0766462   .0537471     1.43   0.154    -.0289382    .1822306
      	            2011  |   .1398977   .0476322     2.94   0.003     .0463258    .2334697
      	            2012  |    .103861   .0426544     2.43   0.015     .0200678    .1876543
      	            2013  |   .1476749   .0466095     3.17   0.002      .056112    .2392378
      	            2014  |   .2201735   .0483601     4.55   0.000     .1251717    .3151754
      	            2015  |   .2367494   .0529298     4.47   0.000     .1327706    .3407282
      	            2016  |   .2454128   .0546268     4.49   0.000     .1381002    .3527254
      	            2017  |   .2708834   .0549217     4.93   0.000     .1629914    .3787753
      	            2018  |   .2706555   .0580685     4.66   0.000     .1565819    .3847292
      	            2019  |   .2205907   .0591024     3.73   0.000     .1044859    .3366955
      	                  |
      	            _cons |  -3.575295   .8959527    -3.99   0.000    -5.335364   -1.815225
      	------------------------------------------------------------------------------------
      	Instruments corresponding to the linear moment conditions:
      	1, model(diff):
      	  L2.lnexport3 L3.lnexport3
      	2, model(diff):
      	  L2.lnulc1 L3.lnulc1
      	3, model(diff):
      	  L2.lnL L3.lnL
      	4, model(diff):
      	  L1.dlnreal_K L2.dlnreal_K
      	5, model(diff):
      	  L1.lnumc L2.lnumc
      	6, model(diff):
      	  L1.lnrealintangible_K L2.lnrealintangible_K
      	7, model(level):
      	  L1.D.lnexport3
      	8, model(level):
      	  L1.D.lnulc1
      	9, model(level):
      	  L1.D.lnL
      	10, model(level):
      	  D.dlnreal_K
      	11, model(level):
      	  D.lnumc
      	12, model(level):
      	  D.lnrealintangible_K
      	13, model(level):
      	  2010bn.year 2011.year 2012.year 2013.year 2014.year 2015.year 2016.year
      	  2017.year 2018.year 2019.year
      	14, model(level):
      	  _cons
      	
      	.
      	end of do-file
      	
      	. estat overid, difference
      	
      	Sargan-Hansen (difference) test of the overidentifying restrictions
      	H0: (additional) overidentifying restrictions are valid
      	
      	2-step weighting matrix from full model
      	
      	                 | Excluding                   | Difference                  
      	Moment conditions |       chi2     df         p |        chi2     df         p
      	------------------+-----------------------------+-----------------------------
      	  1, model(diff) |     9.6576     10    0.4710 |      2.1351      2    0.3439
      	  2, model(diff) |     9.5555     10    0.4803 |      2.2372      2    0.3267
      	  3, model(diff) |    11.6943     10    0.3060 |      0.0984      2    0.9520
      	  4, model(diff) |    11.7821     10    0.2999 |      0.0106      2    0.9947
      	  5, model(diff) |     9.7081     10    0.4665 |      2.0845      2    0.3527
      	  6, model(diff) |     7.3486     10    0.6922 |      4.4441      2    0.1084
      	 7, model(level) |    11.5982     11    0.3946 |      0.1944      1    0.6592
      	 8, model(level) |    11.7864     11    0.3799 |      0.0062      1    0.9372
      	 9, model(level) |    10.2540     11    0.5077 |      1.5386      1    0.2148
      	10, model(level) |    11.7912     11    0.3795 |      0.0015      1    0.9695
      	11, model(level) |     9.7967     11    0.5488 |      1.9960      1    0.1577
      	12, model(level) |    11.7459     11    0.3830 |      0.0467      1    0.8289
      	13, model(level) |     1.4329      2    0.4885 |     10.3598     10    0.4095
      If I got it right, all the model(level) p-values (rows 7-13) in the Excluding column have to be higher than (let's say) 0.1 to conclude that the difference GMM estimator is fine, and we may then try system GMM. Next, in the Difference column, we look at both the model(diff) and model(level) rows, and all of them need to be fine in order to say that system GMM is okay and we can proceed with the analysis. Is my understanding correct?
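      As a side note on reading the table: each Difference statistic is simply the full-model Hansen statistic minus the corresponding Excluding statistic, and for the small degrees of freedom appearing here the chi-squared p-values have closed forms. A minimal sketch of that arithmetic (Python used purely as a calculator; the numbers are copied from the table above):

```python
import math

# Each row satisfies: Excluding chi2 + Difference chi2 = full-model Hansen J.
# Rows 1 and 4 of the table above both recover the same full-model statistic:
full_from_row1 = 9.6576 + 2.1351    # row 1: Excluding + Difference
full_from_row4 = 11.7821 + 0.0106   # row 4: Excluding + Difference

def chi2_sf(x, df):
    """Chi-squared survival function (p-value) for the small dfs in the table."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    raise ValueError("only df 1 and 2 handled in this sketch")

p_row1 = chi2_sf(2.1351, 2)   # table reports p = 0.3439 for row 1's Difference
p_row7 = chi2_sf(0.1944, 1)   # table reports p = 0.6592 for row 7's Difference
```

      Both rows recover the same full-model J of about 11.79, which is a quick consistency check on any difference-in-Hansen output.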
    4. Is there a way to extract the results of the Hansen, difference-in-Hansen, underidentification, and AR(2) tests from multiple estimations into the same table after applying xtdpdgmm?
    5. When I run continuously updating GMM with the following code, an error is returned. Could you please help? I am running Stata 14.0 and xtdpdgmm is up to date (I just checked):
      Code:
      . xtdpdgmm lnexport3 l.lnexport3 lnulc1 lnL dlnreal_K lnumc l.lnintangible_K, ///
      >     model(diff) collapse ///
      >     gmm(lnexport3, lag(2 5)) ///
      >     gmm(lnulc1, lag(2 5)) ///
      >     gmm(lnL, lag(2 5)) ///
      >     gmm(dlnreal_K, lag(2 5)) ///
      >     gmm(lnumc, lag(2 5)) ///
      >     gmm(lnintangible_K, lag(2 5)) ///
      >     gmm(lnexport3, lag(1 2) diff model(level)) ///
      >     gmm(lnulc1, lag(1 2) diff model(level)) ///
      >     gmm(lnL, lag(1 2) diff model(level)) ///
      >     gmm(dlnreal_K, lag(1 2) diff model(level)) ///
      >     gmm(lnumc, lag(1 2) diff model(level)) ///
      >     gmm(lnintangible_K, lag(1 2) diff model(level)) ///
      >     teffects cu vce(r) overid
      	
      	Generalized method of moments estimation
      	         asarray_keys():  3301  subscript invalid
      	     xtdpdgmm_opt::iv():     -  function returned error
      	   xtdpdgmm_opt::init():     -  function returned error
      	             xtdpdgmm():     -  function returned error
      	                <istmt>:     -  function returned error
    6. When I try Jochmans (2020) portmanteau test (after SYS-GMM estimation from the previous point) I get the following error:

      Code:
      	. estat serialpm
      	    xtdpdgmm_serialpm():  3200  conformability error
      	                <istmt>:     -  function returned error
      	r(3200);
    Bun, M. J. G., & Windmeijer, F. (2010). The weak instrument problem of the system GMM estimator in dynamic panel data models. The Econometrics Journal, 13(1), 95-126.
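    The weak-instrument concern in question 1 can be illustrated with a toy simulation (a hypothetical sketch, not the poster's data): in a stationary AR(1) process without covariates, the correlation between a lagged level and the first difference equals -sqrt((1 - rho)/2), so it shrinks toward zero as persistence rho rises.

```python
import random

def instrument_corr(rho, n=20000, seed=7):
    """Simulated correlation between a lagged level y_{t-1} and the first
    difference y_t - y_{t-1} in a stationary AR(1) without covariates.
    Theory gives -sqrt((1 - rho) / 2) for this stylized setting."""
    random.seed(seed)
    levels, diffs = [], []
    for _ in range(n):
        y0 = random.gauss(0, 1) / (1 - rho**2) ** 0.5  # stationary draw
        y1 = rho * y0 + random.gauss(0, 1)
        levels.append(y0)           # the level "instrument"
        diffs.append(y1 - y0)       # the differenced "regressor"
    m_l, m_d = sum(levels) / n, sum(diffs) / n
    cov = sum((a - m_l) * (b - m_d) for a, b in zip(levels, diffs)) / n
    var_l = sum((a - m_l) ** 2 for a in levels) / n
    var_d = sum((b - m_d) ** 2 for b in diffs) / n
    return cov / (var_l * var_d) ** 0.5
```

    With rho = 0.9 the correlation is already down to about -0.22, and a correlation of about -0.14 (as reported above for lnulc1) corresponds to rho near 0.96 in this stylized setting, so a small instrument-regressor correlation can simply be symptomatic of high persistence.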


    10. Slide 94 of your 2019 presentation says: "If there are concerns about the imprecisely estimated optimal weighting matrix, the one-step GMM estimator with robust standard errors might be used instead." How can I tell whether the optimal weighting matrix is imprecisely estimated?

    11. The Blundell-Bond assumption essentially states that deviations from long-run means must not be correlated with the fixed effects for system GMM to be valid. First, can I check the assumption with a graph that puts dl.lnexport3 on the x-axis and the residuals on the y-axis: if there is no correlation between the two (for, let's say, the first few periods t < 5, since my T = 12), is the Blundell-Bond assumption fine (lnexport3 is the left-hand-side variable in my model)? Roodman (2009) makes such graphs on page 146, but states: "In sum, for the particular case where individuals have a common starting point, the validity of system GMM is equivalent to all having achieved mean stationarity by the study period." Does that mean these graphs show the violation/validity of the Blundell-Bond assumption only if the individuals (firms) have a common starting point, i.e. we look at the same time period for all individuals? Lastly, do the residuals have to be logged, since lnexport3 is the natural logarithm of exports? Roodman, D. (2009). A note on the theme of too many instruments. Oxford Bulletin of Economics and Statistics, 71(1), 135-158.
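    The role of the starting point can be made concrete with a small simulation (again a hypothetical sketch, not the poster's data): with a mean-stationary start, the first difference of y is uncorrelated with the fixed effect, which is exactly the condition the level instruments need; with a common start of zero, far from the individual long-run means, the correlation is large.

```python
import random

def _corr(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5

def first_diff_vs_effect(stationary_start, rho=0.6, n=20000, seed=3):
    """corr(alpha_i, d.y_i1) in a simulated AR(1) panel with fixed effects
    alpha_i; the Blundell-Bond condition requires this to be zero."""
    random.seed(seed)
    alphas, dys = [], []
    for _ in range(n):
        a = random.gauss(0, 1)
        if stationary_start:
            # start at the individual long-run mean plus stationary noise
            y0 = a / (1 - rho) + random.gauss(0, 1) / (1 - rho**2) ** 0.5
        else:
            y0 = 0.0   # common starting point, far from the long-run means
        y1 = a + rho * y0 + random.gauss(0, 1)
        alphas.append(a)
        dys.append(y1 - y0)
    return _corr(alphas, dys)
```

    Under the stationary start the simulated correlation is essentially zero, while under the common zero start it is around 0.7, which is the pattern Roodman's page-146 graphs are designed to reveal.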
    Last edited by Filip Novinc; 25 Apr 2023, 03:15.



  • Sebastian Kripfganz
    replied
    You can use the community-contributed ivreg2 command. Slides 39 to 42 of my 2019 London Stata Conference presentation provide an example of how to replicate xtdpdgmm results with ivreg2. You can then simply amend the latter command to obtain Driscoll-Kraay standard errors (option dkraay()).



  • Mugabil Isayev
    replied
    Dear Sebastian Kripfganz,

    Thank you for the response. Is there any command in Stata where I can incorporate Driscoll-Kraay SEs into GMM?



  • Sebastian Kripfganz
    replied
    These two commands do not compute Driscoll-Kraay standard errors, sorry.



  • Mugabil Isayev
    replied
    Dear Sebastian Kripfganz,

    I want to simultaneously take into account cross-sectional dependence and endogeneity. Is it possible to incorporate Driscoll-Kraay standard errors into xtdpdgmm or xtabond2?

    Thanks in advance.



  • Sebastian Kripfganz
    replied
    The first justification would be that you have endogenous regressors but no suitable external instruments; therefore, you are using internal instruments (lagged transformed regressors).

    Compared to the difference-GMM estimator, the justification for the system-GMM estimator would still be that it is more efficient because of the extra instruments it uses. The validity of these extra instruments of course needs to be justified, typically with a difference-in-Hansen test comparing the system-GMM to the difference-GMM estimator. In a nutshell, the arguments are very similar to those for a dynamic model.

    The static model is a special case of the dynamic model without the lagged dependent variable. What is good for the more general dynamic model cannot be bad for the restricted static model.



  • Sarah Magd
    replied
    Originally posted by Sebastian Kripfganz View Post
    1. Yes.
    2. 2SLS is generally inefficient when using panel data. In any case, "2SLS" is not very informative; you would need to be clear about the instruments you are using. It then becomes a question of whether the instruments used in your sys-GMM estimator are beneficial compared to the instruments used in your 2SLS approach. In the first place, you would need to check whether they might require different assumptions for validity.
    3. Reporting serial correlation tests is generally still useful even in static models. For one, they might tell you whether a dynamic model could be reasonable (to account for any serial correlation detected in the static model). If you have predetermined/endogenous variables, serial correlation can still invalidate the instruments in a static model.
    Dear Prof. Sebastian Kripfganz
    When we use sys-GMM with a static model, how should I justify the selection of this estimator? Do I need to do any further steps or compare it with other estimators?
    Or should I use it as a robustness check?
    I am just confused about how I should justify the use of sys-GMM to estimate a static model.



  • Sebastian Kripfganz
    replied
    1.1-1.7) I cannot/should not answer those questions without seeing your command line, the instruments used, and the regression output. Ultimately, a holistic approach should be used.

    1.8-1.9) Assuming that ind and md are time-invariant, the respective instruments for model(md) will be dropped. In that regard, specification A is more reasonable. Other than that, the specifications can be valid.

    2) The difference-in-Hansen test can be seen as a test comparing two estimators, one with the instruments under investigation and one without them. You would never consider an estimator without instruments for the time dummies when such time dummies are included as regressors, because then necessary instruments are missing, which leads to weak identification/underidentification of the model. Therefore, the comparison estimator is rubbish and the test is not meaningful. In other words, there is no point testing the validity of time dummy instruments. These are always valid.

    3) The test in those rows for the time dummies should always be ignored. It is not meaningful; see 2).

    Maybe I should clarify one aspect: When testing the Blundell-Bond assumption, we need to compare two estimators that both use instruments for the time dummies. We should therefore (for the purpose of this test) use the time dummy instruments in the transformed model, not the level model! Once we have done the test, we can then possibly change the specification by including the time dummy instruments for the level model instead.


    Please understand that I won't be able to continue my detailed answers at this frequency here on Statalist due to other obligations. Such detailed help would normally require a (paid) consultancy agreement.



  • Zainab Mariam
    replied
    Dear Professor Sebastian,

    Many thanks for your time and hard work. Thank you very much for helping out when I needed it. The support that you show is extremely appreciated, Professor!

    Question 1) Suppose the following outcomes of the difference-in-Hansen test.

    Table 1

                          | Excluding                   | Difference
        Moment conditions |    chi2     df         p   |    chi2     df         p
        ------------------+----------------------------+-----------------------------
          1, model(fodev) |  38.4506    17    0.0021   |   5.6365     3    0.0580
          2, model(fodev) |  32.2911    17    0.0138   |  11.7960     3    0.0081
          3, model(fodev) |  38.2406    17    0.0023   |   5.8465     3    0.1193
          4, model(fodev) |  43.0624    17    0.0005   |   1.0247     3    0.7953
          5, model(fodev) |  42.8693    17    0.0005   |   1.2178     3    0.7487
          6, model(fodev) |  42.9069    17    0.0005   |   1.1803     3    0.7577
          7, model(fodev) |  38.2817    17    0.0022   |   5.8054     3    0.0775
          8, model(fodev) |  39.2517    17    0.0016   |   4.8354     3    0.0689
          9, model(fodev) |  39.2432    17    0.0017   |   4.8439     3    0.1836
         10, model(fodev) |  34.1913    17    0.0079   |   9.8958     3    0.0195
         11, model(fodev) |  41.1743    19    0.0023   |   2.9128     1    0.0879
         12, model(mdev)  |  34.6609    19    0.0153   |   9.4262     1    0.0021
         13, model(level) |  29.6163    19    0.1070   |   3.0659     1    0.1087
         14, model(level) |   0.3539     1    0.1517   |  43.7332    19    0.0510
             model(fodev) |        .   -11         .   |        .     .         .

    Table 2 (since the row/line labelled “model(level)” in the difference-in-Hansen test is needed, I kept its findings and deleted the rest).
                          | Excluding                   | Difference
        Moment conditions |    chi2     df         p   |    chi2     df         p
        ------------------+----------------------------+-----------------------------
          (rows 1-13 deleted)
         14, model(level) |  12.0861     7    0.0978   |   4.1102     2    0.0881
             model(diff)  |   0.0000     0         .   |  16.1962     9    0.0629
             model(level) |   8.0920     6    0.0448   |   8.1042     3    0.0447

    Table 3
                          | Excluding                   | Difference
        Moment conditions |    chi2     df         p   |    chi2     df         p
        ------------------+----------------------------+-----------------------------
          (rows 1-13 deleted)
         14, model(level) |  12.0861     7    0.0397   |   4.1102     2    0.1364
             model(diff)  |   0.0000     0         .   |  16.1962     9    0.0629
             model(level) |   8.0920     6    0.0398   |   8.1042     3    0.1298

    Table 4
                          | Excluding                   | Difference
        Moment conditions |    chi2     df         p   |    chi2     df         p
        ------------------+----------------------------+-----------------------------
          (rows 1-13 deleted)
         14, model(level) |  12.0861     7    0.1231   |   4.1102     2    0.0786
             model(diff)  |   0.0000     0         .   |  16.1962     9    0.0629
             model(level) |   8.0920     6    0.1214   |   8.1042     3    0.0489

    Table 5
                          | Excluding                   | Difference
        Moment conditions |    chi2     df         p   |    chi2     df         p
        ------------------+----------------------------+-----------------------------
          (rows 1-13 deleted)
         14, model(level) |  12.0861     7    0.0779   |   4.1102     2    0.1281
             model(diff)  |   0.0000     0         .   |  16.1962     9    0.0629
             model(level) |   8.0920     6    0.1095   |   8.1042     3    0.1087

    I kindly ask you the following questions:

    1.1) Which table(s) of the above findings of the difference-in-Hansen test indicate that the Difference GMM estimator is fine? Which table(s) of the above findings of the difference-in-Hansen test indicate that the instruments for the Difference GMM estimator are valid?

    1.2) Which table(s) of the above findings of the difference-in-Hansen test indicate that the variables satisfy/violate the additional Blundell-Bond assumption (sufficient: mean stationarity)?

    1.3) Which table(s) of the above findings of the difference-in-Hansen test indicate that I can instrument the variables in the level model?

    1.4) Which table(s) of the above findings of the difference-in-Hansen test indicate that I can apply the System GMM estimator?

    1.5) Which table(s) of the above outcomes of the difference-in-Hansen test indicate that the Difference GMM estimator is superior to the System GMM estimator?

    1.6) Which table(s) of the above outcomes of the difference-in-Hansen test indicate that the System GMM estimator is superior to the Difference GMM estimator? Which of the above findings of the difference-in-Hansen test indicate that the additional instruments for the level model are valid?

    1.7) What do the above findings of the difference-in-Hansen test indicate regarding the variables classification whether exogenous, predetermined, or endogenous?

    1.8) In case I can apply the System GMM estimator, are the following codes correct to apply the System GMM estimator using your xtdpdgmm command?

    A) In this code, I specified ‘model(fod)’ as a separate option in the xtdpdgmm command line, I instrument all the variables (except the dummies) for the differenced model, I instrument all the variables (including the dummies) for the level model, I put ‘model(level)’ in the iv() option for the dummies as follows.

    xtdpdgmm L(0/1).y L(0/1).x1 L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9 x10 i.ind mn cf cf*L.x1, ///
        model(fod) collapse ///
        gmm(y, lag(1 3)) gmm(L.x1, lag(1 3)) ///
        gmm(L.x2, lag(0 2)) gmm(L.x3, lag(0 2)) gmm(L.x4, lag(0 2)) ///
        gmm(L.x5, lag(0 2)) gmm(L.x6, lag(0 2)) gmm(L.x7, lag(0 2)) ///
        gmm(L.x8, lag(0 2)) gmm(L.x9, lag(0 2)) gmm(x10, lag(0 2)) ///
        gmm(x10, lag(0 0) model(md)) gmm(cf*L.x1, lag(1 3)) ///
        gmm(y, lag(1 1) diff model(level)) gmm(L.x1, lag(1 1) diff model(level)) ///
        gmm(L.x2, lag(0 0) diff model(level)) gmm(L.x3, lag(0 0) diff model(level)) ///
        gmm(L.x4, lag(0 0) diff model(level)) gmm(L.x5, lag(0 0) diff model(level)) ///
        gmm(L.x6, lag(0 0) diff model(level)) gmm(L.x7, lag(0 0) diff model(level)) ///
        gmm(L.x8, lag(0 0) diff model(level)) gmm(L.x9, lag(0 0) diff model(level)) ///
        gmm(x10, lag(0 0) diff model(level)) gmm(cf*L.x1, lag(1 1) diff model(level)) ///
        iv(i.ind, model(level)) iv(mn, model(level)) iv(cf, model(level)) ///
        two small vce(r) overid

    B) In this code, I specified ‘model(fod)’ as a separate option in the xtdpdgmm command line, I instrument all the variables (except the dummies) for the differenced model, I instrument all the variables (including the dummies) for the level model, I put ‘model(md)’ in the iv() option for the dummies as follows.

    xtdpdgmm L(0/1).y L(0/1).x1 L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9 x10 i.ind mn cf cf*L.x1, ///
        model(fod) collapse ///
        gmm(y, lag(1 3)) gmm(L.x1, lag(1 3)) ///
        gmm(L.x2, lag(0 2)) gmm(L.x3, lag(0 2)) gmm(L.x4, lag(0 2)) ///
        gmm(L.x5, lag(0 2)) gmm(L.x6, lag(0 2)) gmm(L.x7, lag(0 2)) ///
        gmm(L.x8, lag(0 2)) gmm(L.x9, lag(0 2)) gmm(x10, lag(0 2)) ///
        gmm(x10, lag(0 0) model(md)) gmm(cf*L.x1, lag(1 3)) ///
        gmm(y, lag(1 1) diff model(level)) gmm(L.x1, lag(1 1) diff model(level)) ///
        gmm(L.x2, lag(0 0) diff model(level)) gmm(L.x3, lag(0 0) diff model(level)) ///
        gmm(L.x4, lag(0 0) diff model(level)) gmm(L.x5, lag(0 0) diff model(level)) ///
        gmm(L.x6, lag(0 0) diff model(level)) gmm(L.x7, lag(0 0) diff model(level)) ///
        gmm(L.x8, lag(0 0) diff model(level)) gmm(L.x9, lag(0 0) diff model(level)) ///
        gmm(x10, lag(0 0) diff model(level)) gmm(cf*L.x1, lag(1 1) diff model(level)) ///
        iv(i.ind, model(md)) iv(mn, model(md)) iv(cf, model(md)) ///
        two small vce(r) overid

    Where:
    y is the dependent variable;
    L.y is the lagged dependent variable as a regressor (L.y is predetermined);
    L.x1 is the independent variable (L.x1 is endogenous);
    The control variables L.x2, L.x3, L.x4, L.x5, L.x6, L.x7, L.x8, L.x9 are predetermined;
    The control variable x10 (firm age) is exogenous;
    ind is industry dummies;
    mn is country dummies;
    cf is a dummy variable that takes the value of 1 for the 3 years 2008, 2009, and 2010;
    cf*L.x1 is an interaction between the dummy variable cf and the independent variable L.x1.

    1.9) If the previous codes are incorrect to apply the System GMM estimator, what do I have to add, delete, amend in the previous codes to apply the System GMM estimator using your xtdpdgmm command?

    Are there other codes which are more appropriate to apply the System GMM estimator using your xtdpdgmm command?

    Question 2) Regarding post #514 “Here, it seems that the only instruments specified for the level model are the time dummies. In this case, the difference-in-Hansen test for them is not meaningful. We cannot leave out the instruments for the time dummies. Because the time dummies are the only instruments for the level model, the Blundell-Bond assumption does not apply here.”.

    Sorry, I did not get what you mean by that. I kindly ask you please to explain what you mean.

    Question 3) Regarding post #520 point 1.1) “You would need to be more specific about which of these difference-in-Hansen tests you are referring to. Each row in the table is a separate test. For example, on page 96, the test in the row labelled "5, model(level)" is not very useful, because removing the instruments for the time dummies (while keeping the time dummies as regressors) does not make much sense and may result in serious weak-instruments problems or even underidentification.”.

    Sorry for not being specific there. In the difference-in-Hansen test tables on pages 96, 109, 113, and 123, I mean the rows labelled “5, model(level)”, “8, model(level)”, “8, model(level)”, and “8, model(level)”, respectively. According to the list of instruments below the regression output, those rows correspond to the time dummies. Thus, does that mean the difference-in-Hansen tests on pages 96, 109, 113, and 123 are not meaningful? Does that mean the Blundell-Bond assumption does not apply in those tables on pages 96, 109, 113, and 123?

    Your input is so valuable. The work you do is very important and so appreciated. I am very grateful to you for all your patience, help and effort, Professor!



  • Sebastian Kripfganz
    replied
    Zainab Mariam

    1.1) You would need to be more specific about which of these difference-in-Hansen tests you are referring to. Each row in the table is a separate test. For example, on page 96, the test in the row labelled "5, model(level)" is not very useful, because removing the instruments for the time dummies (while keeping the time dummies as regressors) does not make much sense and may result in serious weak-instruments problems or even underidentification.

    1.2) I am not sure I understand the question. There is no such minimum number of variables.

    2) Yes to all of them.

    3) Yes to all of them.

    4) The respective labels for the rows of the test output tell you which instruments are being tested; compare with the list of instruments below the regression output.

    5.1) Not necessarily. A small estimate of the lagged dependent variable's coefficient might potentially be a consequence of a large bias of the difference GMM estimator if the true coefficient is close to 1. Thus, it is difficult to learn about potential problems of the estimator from actual coefficient estimates. You could again use the difference GMM estimator with additional nonlinear moment conditions, which is less prone to problems under high persistence. If that estimator yields a large estimate of the autoregressive coefficient, this might indicate potential problems of the difference GMM estimator (without such nonlinear moment conditions). And then you can also compare the two estimates; if they are very different, the estimator without the nonlinear moment conditions probably has problematic properties.

    5.2) That's the typical application, yes.

    5.3) Correct.

    5.4) You can still check for correct variable classification with the system GMM estimator. (Essentially, everything you can do with the difference GMM estimator, you can also do with the system GMM estimator). However, this should generally be done first with the difference GMM estimator (ideally also using nonlinear moment conditions), to avoid contamination of the tests with a potential invalidity of the mean stationarity assumption.

    5.5) That's the typical application, yes.



  • Sebastian Kripfganz
    replied
    Originally posted by Sarah Magd View Post
    I specify my model in a static way (i.e., without including the lagged dependent variable as a regressor).

    1. Can we still use the sys-GMM to estimate this static regression?
    2. How should I justify the use of the sys-GMM to estimate this static regression? (i.e., is it more efficient or robust than the 2SLS regression?)
    3. Do I still need to report the Arellano-Bond statistics?
    1. Yes.
    2. 2SLS is generally inefficient when using panel data. In any case, "2SLS" is not very informative; you would need to be clear about the instruments you are using. It then becomes a question of whether the instruments used in your sys-GMM estimator are beneficial compared to the instruments used in your 2SLS approach. In the first place, you would need to check whether they might require different assumptions for validity.
    3. Reporting serial correlation tests is generally still useful even in static models. For one, they might tell you whether a dynamic model could be reasonable (to account for any serial correlation detected in the static model). If you have predetermined/endogenous variables, serial correlation can still invalidate the instruments in a static model.



  • Sarah Magd
    replied
    Dear Prof. @Kripfganz,

    I specify my model in a static way (i.e., without including the lagged dependent variable as a regressor).

    1. Can we still use the sys-GMM to estimate this static regression?
    2. How should I justify the use of the sys-GMM to estimate this static regression? (i.e., is it more efficient or robust than the 2SLS regression?)
    3. Do I still need to report the Arellano-Bond statistics?




  • Zainab Mariam
    replied
    Dear Professor Sebastian,

    I am so thankful for what you did. You are so helpful. I do appreciate the way you are teaching and supporting me. Your assistance means a lot to me, Professor!

    1) Regarding post #514 “Here, it seems that the only instruments specified for the level model are the time dummies. In this case, the difference-in-Hansen test for them is not meaningful. We cannot leave out the instruments for the time dummies. Because the time dummies are the only instruments for the level model, the Blundell-Bond assumption does not apply here.”.

    Thus, I have the following questions, please!

    1.1) Does that mean the difference-in-Hansen test on pages 96, 109, 113, and 123 is not meaningful? Does that mean the Blundell-Bond assumption does not apply there in those tables on pages 96, 109, 113, and 123?

    1.2) At least how many variables do I have to instrument for the level model in order for the difference-in-Hansen test to be meaningful and for the Blundell-Bond assumption to be applied?

    2) Regarding the meaning of the following iv() options, given I have not specified ‘model(diff)’ as a separate option in your xtdpdgmm command line, is the following meaning of iv() options correct?

    2.1) iv(x, lag( ) diff model(level)): produces differenced instruments for the level model?

    2.2) iv(x, lag( ) diff): produces differenced instruments for the level model?

    2.3) iv(x, lag( ) model(diff) diff): produces differenced instruments for the differenced model?

    2.4) iv(x, lag( ) model(level)): produces level instruments for the level model?

    2.5) iv(x, lag( )): produces level instruments for the level model?

    2.6) iv(x, lag( ) model(diff)): produces level instruments for the differenced model?

    3) Regarding the meaning of the following iv() options, given I have specified ‘model(FOD)’ as a separate option in the xtdpdgmm command line, is the following meaning of iv() options correct?

    3.1) iv(x, lag( ) diff model(level)): produces differenced instruments for the level model?

    3.2) iv(x, lag( ) diff): produces differenced instruments for the FOD model?

    3.3) iv(x, lag( ) model(diff) diff): produces differenced instruments for the differenced model?

    3.4) iv(x, lag( ) model(level)): produces level instruments for the level model?

    3.5) iv(x, lag( )): produces level instruments for the FOD model?

    3.6) iv(x, lag( ) model(diff)): produces level instruments for the differenced model?

    4) Regarding post #506 point 6) “With the difference GMM estimator, the difference-in-Hansen test can still be useful to evaluate the validity of specific instrument sets.…”.

    Thus, which instrument sets specifically the difference-in-Hansen test with the Difference GMM estimator can evaluate their validity?

    5) To check my understanding, please, correct me if I am wrong!

    5.1) Does the coefficient of L.y (the lagged dependent variable) obtained from the Difference GMM estimator indicate whether the dependent variable (y) is persistent, i.e. close to a random walk? If the coefficient of L.y is close to 1, does that indicate that the Difference GMM estimator performs poorly because of weak instruments?

    5.2) Does the difference-in-Hansen test with the Difference GMM estimator check the classification of variables?

    5.3) The difference-in-Hansen test with the Difference GMM estimator cannot check the additional Blundell-Bond assumption (sufficient condition: mean stationarity)?

    5.4) The difference-in-Hansen test with the System GMM estimator cannot check the classification of variables?

    5.5) Does the difference-in-Hansen test with the System GMM estimator check the additional Blundell-Bond assumption (sufficient condition: mean stationarity)?

    Many thanks for doing what you do! Your patience, help and effort are greatly appreciated, Professor!



  • Sebastian Kripfganz
    replied
    1.1) The approach can also be used with the system GMM estimator if you are confident that the additional assumption for validity of the instruments in the level model is satisfied.

    1.2) With the difference GMM estimator alone, you cannot test the additional assumption for the system GMM estimator.

    2.1) You do not lose much with the nonlinear estimator compared to the difference GMM estimator. So, yes, it is often preferable to use the nonlinear estimator.

    2.2-2.8) All of these specifications are valid. Some of them are unusual/unconventional, e.g. 2.4).

    2.9) Here, the option gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(1 3)) would only be valid if all of these variables were strictly exogenous.

    2.10) Somewhere earlier in this thread I gave examples for different estimators, including the Ahn-Schmidt estimator.

    2.11) model(mdev) should only be specified for strictly exogenous variables. Otherwise, both model(fod) and model(diff) are fine. (Just remember that in general the admissible lags differ for the two models; e.g. for an endogenous variable, the first admissible lag is 1 with model(fod) but 2 with model(diff).)
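    To illustrate the lag shift (hypothetical variables y and x, with x endogenous; each specification starts at its first admissible lag):

    Code:
    . * forward-orthogonal deviations: instruments admissible from one lag earlier
    . xtdpdgmm L(0/1).y x, model(fod) collapse gmm(y, lag(1 3)) gmm(x, lag(1 3)) nl(noserial) two vce(robust)
    . * first differences: the same instruments start one lag deeper
    . xtdpdgmm L(0/1).y x, model(diff) collapse gmm(y, lag(2 4)) gmm(x, lag(2 4)) nl(noserial) two vce(robust)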

    3) If you do not have multiple instrument sets for model(level), then the difference-in-Hansen test does not perform a separate test for it.

    4) This might be because the industry dummies are time-invariant. Such variables can only be specified for model(level).



  • Zainab Mariam
    replied
    Dear Professor Sebastian,

    Even though I may not say it all the time, I do appreciate all that you do, Professor! I do not know what to say. Much obliged!

    1) Regarding post #506 point 6) “With the difference GMM estimator, the difference-in-Hansen test can still be useful to evaluate the validity of specific instrument sets. This could for example help to decide whether variables should be classified as endogenous, predetermined, or exogenous; see the model selection section of my presentation.”.

    Thus, I have the following questions, please!

    1.1) Does it mean the difference-in-Hansen test cannot help to decide whether variables should be classified as endogenous, predetermined, or exogenous if we apply the difference-in-Hansen test with the System GMM estimator?

    1.2) Does it mean the difference-in-Hansen test cannot help to check if the variables satisfy the additional Blundell-Bond assumption (sufficient: mean stationarity) if we apply the difference-in-Hansen test with the Difference GMM estimator? Does it mean the difference-in-Hansen test cannot help to check if I can instrument the variables in the level model when we apply the difference-in-Hansen test with the Difference GMM estimator? Does it mean the difference-in-Hansen test cannot help to check if I can apply the System GMM estimator when we apply the difference-in-Hansen test with the Difference GMM estimator?

    2) Regarding post #508 point 1) “In principle, the MMSC can be used for selecting between the difference and system GMM estimator, yes. If different criteria give you different answers, I am afraid then the decision is still up to you. You will then need to weigh the benefits and shortcomings of the two estimators. As mentioned earlier, a good compromise might be the difference GMM estimator plus nonlinear moment conditions (Ahn-Schmidt).”. And regarding post #504 point 3) “… Alternatively, you could use the nonlinear Ahn and Schmidt (1995, Journal of Econometrics) estimator, which also mitigates the weak-instruments problem but does not require the additional system GMM assumptions.”.

    Thus, I have the following questions, please!

    2.1) Does it mean it is better to apply the nonlinear Ahn and Schmidt estimator? If so, are the following codes correct?

    2.2) In this code, I specified ‘model(fod)’ as a separate option in the xtdpdgmm command line, I put ‘model(md)’ in the iv() option for the dummies.

    xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind, model(md)) iv(mn, model(md)) iv(cf, model(md)) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

    2.3) In this code, I put ‘model(level)’ in the iv() option for the dummies.

    xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind, model(level)) iv(mn, model(level)) iv(cf, model(level)) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

    2.4) In this code, I put ‘diff’ in the iv() option for the dummies.

    xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind, diff) iv(mn, diff) iv(cf, diff) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

    2.5) In this code, I put ‘diff model(diff)’ in the iv() option for the dummies.

    xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind, diff model(diff)) iv(mn, diff model(diff)) iv(cf, diff model(diff)) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

    2.6) In this code, I put ‘model(diff)’ in the iv() option for the dummies.

    xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind, model(diff)) iv(mn, model(diff)) iv(cf, model(diff)) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

    2.7) In this code, I put ‘diff model(level)’ in the iv() option for the dummies.

    xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind, diff model(level)) iv(mn, diff model(level)) iv(cf, diff model(level)) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

    2.8) In this code, I did not put anything in the iv() option for the dummies.

    xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind) iv(mn) iv(cf) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

    2.9) In this code, I specified ‘model(mdev)’ as a separate option in the xtdpdgmm command line, I put ‘model(diff)’ in the gmm() for the endogenous variables (y, L.x1), I put ‘norescale’ in the iv() for the exogenous variable (x10), I put ‘model(md)’ in the iv() option for the dummies, and I did not put any option in gmm() for the predetermined variables (L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9).

    xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(mdev) collapse gmm(y, lag(2 4) model(diff)) gmm(L.x1, lag(2 4) model(diff)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(1 3)) iv(x10, norescale) gmm(x10, model(md) lag(0 0)) iv(i.ind, model(md)) iv(mn, model(md)) iv(cf, model(md)) gmm(cf*L.x1, lag(2 4) model(diff)) nl(noserial) teffects two small vce(robust, dc) overid

    2.10) If none of the previous codes is correct, what is the correct code I have to use in order to implement the nonlinear Ahn and Schmidt estimator using your xtdpdgmm command? Where: y is the dependent variable; L.y is the lagged dependent variable as a regressor (L.y is predetermined); L.x1 is the independent variable (L.x1 is endogenous); The control variables L.x2, L.x3, L.x4, L.x5, L.x6, L.x7, L.x8, L.x9 are predetermined; The control variable x10 (firm age) is exogenous; ind is industry dummies; mn is country dummies; cf is a dummy variable that takes the value of 1 for the 3 years 2008, 2009, and 2010; cf*L.x1 is an interaction between the dummy variable cf and the independent variable L.x1.

    2.11) To apply the nonlinear Ahn and Schmidt estimator, is it better to specify ‘model(fod)’ or ‘model(diff)’ or ‘model(mdev)’ as a separate option in the xtdpdgmm command line?

    3) What if the difference-in-Hansen test output does not contain a “model(level)” row in the last line of the table? What does that indicate?

    4) Is it normal for all the industry dummies to be omitted if I put ‘md’ in the iv() option for the industry dummies and do not specify ‘teffects’ in the xtdpdgmm command line?

    Also, is it normal for more than one industry dummy to be omitted if I put ‘md’ in the iv() option for the industry dummies even when ‘teffects’ is specified? Which iv() options lead to the dummies being omitted?

    Sorry to keep asking you my questions, but I would not have understood this without your assistance. Please accept my deepest gratitude. Your patience, help and effort are greatly appreciated, Professor! Thank you very much for all you do.

