
  • Zainab Mariam
    replied
    Dear Professor Sebastian,

    Many thanks for your swift and valuable response. Your cooperation and support are priceless. Indeed, saying “thank you very much” is not enough. I am very grateful to you for all your help and effort, Professor!

    1) To use your xtdpdgmmfe command to apply the Chudik-Pesaran (2022) estimator to unbalanced dynamic panel data with at least one endogenous regressor, I have the following questions, please!

    1.1) Are the following codes correct?

    A) xtdpdgmmfe y L2.y L(1/2).x1 L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9 x10 Industry1 Industry2 Industry3 Industry4 Industry5 Industry6 Industry7 Industry8 mn cf cf*L.x1, exogenous(x10 Industry1 Industry2 Industry3 Industry4 Industry5 Industry6 Industry7 Industry8 mn cf) predetermined(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9 L2.y) endogenous(L(1/2).x1 cf*L.x1) initdev collapse teffects igmm vce(robust, dc) center

    B) xtdpdgmmfe y L2.y L(1/2).x1 L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9 x10 i.ind mn cf cf*L.x1, exogenous(x10 i.ind mn cf) predetermined(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9 L2.y) endogenous(L(1/2).x1 cf*L.x1) initdev collapse teffects igmm vce(robust, dc) center

    1.2) If none of the previous codes is correct, what is the correct code I have to use to implement the Chudik-Pesaran (2022) estimator with your xtdpdgmmfe command?
    Are there other codes that are more appropriate for applying the Chudik-Pesaran (2022) estimator with your xtdpdgmmfe command? Here: y is the dependent variable; L2.y is the second lag of the dependent variable, included as a regressor (L2.y is predetermined); L.x1 is the main independent variable (L.x1 is endogenous); L2.x1 is the first lag of L.x1; the control variables L.x2, L.x3, L.x4, L.x5, L.x6, L.x7, L.x8, and L.x9 are predetermined; the control variable x10 (firm age) is exogenous; ind denotes industry dummies; mn denotes country dummies; cf is a dummy variable equal to 1 for the three years 2008, 2009, and 2010; and cf*L.x1 is the interaction between cf and L.x1.

    1.3) To use your xtdpdgmmfe command, do I have to type the ‘exogenous’ and ‘endogenous’ options myself, open their parentheses, and fill them in?

    2) When using your xtdpdgmmfe command, I have the following questions, please!

    2.1) Can I specify the dummy variables (such as industry dummies, country dummies, …) as exogenous variables and put them in the brackets of the ‘exogenous’ option?

    2.2) Does the xtdpdgmmfe command instrument the dummies (industry dummies, country dummies, …) in the differenced model or in the level model?

    2.3) Does the xtdpdgmmfe command use the differenced instruments or the level instruments for the dummies (industry dummies, country dummies, …)?

    3) We can modify the xtdpdgmm command line to try several specifications. Can we do the same with the xtdpdgmmfe command? If so, how?

    4) When applying the two-step system GMM estimator, your xtdpdgmmfe command uses model(diff). How can I amend the xtdpdgmmfe command so that it uses model(fod) instead of model(diff) for the two-step system GMM estimator?

    5) When applying the two-step System GMM estimator using your xtdpdgmmfe command, can I include the option ‘orthogonal’ in the code?

    6) To apply the Hayakawa, Qi, and Breitung (2019) estimator to unbalanced dynamic panel data with at least one endogenous regressor, I have the following questions, please!

    6.1) Are the results of the Hayakawa, Qi, and Breitung (2019) estimator identical regardless of whether I use your xtdpdgmm command or your xtdpdgmmfe command?

    6.2) To use your xtdpdgmmfe command to apply the Hayakawa, Qi, and Breitung (2019) estimator, will I lose an additional observation for each firm?

    6.3) When using your xtdpdgmmfe command to apply the Hayakawa, Qi, and Breitung (2019) estimator, do I have to specify the curtail() option separately for the exogenous variables, the predetermined variables, and the endogenous variables? That is, do I have to specify curtail() three times when I have three variable classifications?

    6.4) To apply the Hayakawa, Qi, and Breitung (2019) estimator using your xtdpdgmmfe command, are the following codes correct?

    A) xtdpdgmmfe y L2.y L(1/2).x1 L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9 x10 i.ind mn cf cf*L.x1, exogenous(x10 i.ind mn cf) predetermined(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9 L2.y) endogenous(L(1/2).x1 cf*L.x1) initdev collapse curtail(1) orthogonal nonl teffects onestep

    B) xtdpdgmmfe y L2.y L(1/2).x1 L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9 x10 i.ind mn cf cf*L.x1, exogenous(x10 i.ind mn cf) curtail(0) predetermined(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9 L2.y) curtail(1) endogenous(L(1/2).x1 cf*L.x1) curtail(2) initdev collapse orthogonal nonl teffects onestep

    6.5) If the previous codes are incorrect for applying the Hayakawa, Qi, and Breitung (2019) estimator using your xtdpdgmmfe command, what do I have to add, delete, or amend in them?

    Are there other codes that are more appropriate for applying the Hayakawa, Qi, and Breitung (2019) estimator using your xtdpdgmmfe command? The variable definitions are the same as in question 1.2 above.

    7) If one of the control variables is measured in natural logarithms, do we consider that control variable endogenous or predetermined?

    8) The main independent variable of my regression model is L.x1 (L.x1 is endogenous). Also, my regression model includes L2.x1 (L2.x1 is the first lag of the independent variable L.x1) as a regressor. Therefore, to use your xtdpdgmmfe command, do I have to specify L2.x1 in the ‘endogenous’ option or in the ‘predetermined’ option?

    Your patience, support and effort are highly appreciated, Professor!



  • Sebastian Kripfganz
    replied
    1) Please see post #450 for examples of different estimators, including the Ahn-Schmidt GMM estimator.

    2) Computational speed depends on many factors, including your hardware and the size of the data set. Because the Ahn-Schmidt GMM estimator is nonlinear and requires iterative optimization, it naturally takes longer than linear estimators. You should normally not include Blundell-Bond type instruments for the level model; including them can create multicollinearity problems, which in turn can make it difficult for the numerical algorithm to converge. (See the sketch at the end of this post.)

    3) This dummy variable is time varying; it changes its value from 0 to 1 and back to 0 at certain points in time.

    4.1) In the absence of serial correlation in the errors, the instruments for the lagged dependent variable can start with its first lag in the FOD model (see the sketch at the end of this post).
    4.1.A) Yes; see 4.1).
    4.1.B) Yes; the lagged dependent variable is uncorrelated with the FOD-transformed error term (and so is the second lag). Hence, they qualify as instruments.
    4.1.C) No; see above.

    4.2) For difference GMM, the first suitable lag of the dependent variable as an instrument is lag 2.
    4.2.A) Yes.
    4.2.B) No.

    5.1) If L.x1 is endogenous, then L2.x1 is a valid instrument in the FOD model.
    5.1.A) Yes.
    5.1.B) No.

    6) I do not understand this question. In general, differencing the instruments (for the differenced model) still yields valid (and non-redundant) instruments.

    7) There is no clear criterion. T is small when the Nickell bias due to the correlation of the lagged dependent variable with the fixed effects is "too big". This depends (among other things) on the (true but unknown) persistence of the dependent variable: the higher the persistence, the larger the bias, and the larger T would need to be to no longer be considered small. It is even less obvious when N should be considered small. If you make seemingly innocuous changes to your estimator (e.g. changing the maximum lag order for the instruments from, say, 4 to 5) and your estimates change substantially, this is typically a sign that your N is small. With large N, such changes should hardly matter.
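
    For illustration only, here is a minimal sketch of points 2), 4.1), 4.2), and 5.1) in xtdpdgmm syntax. The variable names (n, w, k, x1) and lag ranges are placeholders borrowed from examples elsewhere in this thread, not a recommendation for your data:

    * Point 2): Ahn-Schmidt nonlinear moment conditions via nl(noserial),
    * without adding Blundell-Bond level instruments.
    xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) gmm(w k, lag(1 3)) nl(noserial) two vce(robust) overid

    * Points 4.1) and 4.2): instruments for the lagged dependent variable can
    * start at lag 1 in the FOD model, but only at lag 2 in the differenced model.
    xtdpdgmm L(0/1).n w k, model(fodev) collapse gmm(n, lag(1 4)) gmm(w k, lag(1 3)) two vce(robust) overid
    xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) gmm(w k, lag(1 3)) two vce(robust) overid

    * Point 5.1): for an endogenous regressor entered as L.x1, the first valid
    * instrument in the FOD model is L2.x1, e.g. gmm(L.x1, lag(1 3)).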



  • Zainab Mariam
    replied
    Dear Professor Sebastian,

    Thanks again for your time and hard work! Your assistance is greatly appreciated; I could not have pulled this off without your support, Professor! I still have the following questions, please!

    1) Regarding your post #516, point 2.10): “Somewhere earlier in this thread I gave examples for different estimators, including the Ahn-Schmidt estimator.”
    I searched for the example you gave for the nonlinear Ahn-Schmidt estimator but could not work out which one you meant. Could you please point me to the example you referred to?

    2) Is it normal for the nonlinear Ahn-Schmidt estimator to take a long time to run in Stata?

    Is there any way to speed up the nonlinear Ahn-Schmidt estimator?

    3) The dummy variable ‘cf’ takes the value of 1 for the three years 2008, 2009, and 2010, and the value of 0 for the years before 2008 and after 2010. Is this dummy variable considered time-variant or time-invariant?

    4) My regression model is a dynamic panel data model. It also includes L2.y as a regressor (L2.y is the second lag of the dependent variable y). Thus, I have the following questions, please!

    4.1) For the FOD estimator using your xtdpdgmm command, which lag should the instruments for the dependent variable y start from?

    4.1.A) Is it right to use L.y as an instrument for the dependent variable y, given that L.y is already included in the regression model as a regressor?

    4.1.B) Is it right to use L2.y as an instrument for the dependent variable y, given that L2.y is already included in the regression model as a regressor?

    4.1.C) Do I have to start with L3.y as instruments for the dependent variable y, given that L.y and L2.y are already included in the regression model as regressors?

    4.2) For the Difference GMM estimator, which lag should the instruments for the dependent variable y start from?

    4.2.A) Is it right to use L2.y as an instrument for the dependent variable y, given that L2.y is already included in the regression model as a regressor?

    4.2.B) Do I have to start with L3.y as an instrument for the dependent variable y, given that L2.y is already included in the regression model as a regressor?

    5) The main independent variable of my regression model is L.x1 (L.x1 is endogenous). Also, my regression model includes L2.x1 as a regressor. Therefore, I have the following questions, please!

    5.1) To apply the FOD estimator using your xtdpdgmm command, which lag should the instruments for the independent variable L.x1 start from?

    5.1.A) Is it right to use L2.x1 as an instrument for the independent variable L.x1, given that L2.x1 is already included in the regression model as a regressor?

    5.1.B) Do I have to start with L3.x1 as an instrument for the independent variable L.x1, given that L2.x1 is already included in the regression model as a regressor?

    6) If the differenced instruments are used for the differenced model, will these differenced instruments be omitted?

    7) One usually sees the cases {T small, N small}, {T small, N large}, {T large, N small}, and {T large, N large}. What is the criterion for deciding whether T and N are small or large?

    The work you do is great and so appreciated. Many thanks once again for all that you do, Professor!



  • Sebastian Kripfganz
    replied
    If you have variables that are completely exogenous (which includes uncorrelatedness with the unobserved group-specific effects), then you can use them as standard instruments for the level model. This way, the correlation of the instruments with the regressors is maximized. This typically applies to time dummies and other dummy variables (e.g. industry dummies).
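
    As an illustrative sketch only (the variable names and lag ranges are placeholders, not taken from this post), such dummies could be entered through iv() with model(level), for example:

    * Strictly exogenous dummies as standard instruments for the level model;
    * i.ind stands in for industry dummies, and teffects adds the time dummies.
    xtdpdgmm L(0/1).n w k i.ind, model(diff) collapse gmm(n, lag(2 4)) gmm(w k, lag(1 3)) iv(i.ind, model(level)) teffects two vce(robust) overid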



  • Mugi Jang
    replied
    Dear Professor Sebastian Kripfganz,
    When we implement system GMM, in which equation do we have to include the standard instrumental variables: the level equation, the difference equation, or both?
    In the case of time dummies, xtdpdgmm automatically inserts them into the level equation with the teffects option. Which strategy is the best one?
    In my case, the estimation results differ depending on the choice of equation. So what is the right approach?
    Thanks a lot!
    Last edited by Mugi Jang; 25 May 2023, 01:33.



  • Sebastian Kripfganz
    replied
    That's correct. Without the lagged dependent variable, it becomes a static model, similar to xtivreg, fe. You normally still need the same tests: the Hansen test for the validity of the overidentifying restrictions; and even the Arellano-Bond test for serial correlation, because the GMM-type instruments with lagged variables typically still rely on the assumption of no serial correlation in the untransformed errors.
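
    A minimal sketch of this static case (illustrative variable names and lag choices, not a prescription):

    * Static specification without the lagged dependent variable, with GMM-type
    * instruments for the remaining regressors, followed by the usual tests.
    xtdpdgmm n w k, model(diff) collapse gmm(w k, lag(1 3)) two vce(robust) overid
    estat overid
    estat serial, ar(1/2)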



  • Mugi Jang
    replied
    Dear Professor Sebastian Kripfganz,
    Roodman says that xtabond2 does not require the lagged dependent variable to appear on the right-hand side (Roodman, 2009, p. 127). Does the model then change from dynamic to static, and can we interpret the results in the same way as xtivreg, fe, while applying GMM-type instruments for the other independent variables? In this case, what kind of tests does the model have to pass?



  • Sebastian Kripfganz
    replied
    I recommend having a look at the tests implemented in the underid command; see slides 43 to 47 of my 2019 London Stata Conference presentation.
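
    For instance, a hedged sketch (assuming the community-contributed underid package is installed and used as a postestimation command after xtdpdgmm, as described on those slides; variable names and lag ranges are placeholders):

    * Estimate a GMM model, then run the underidentification tests.
    xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) gmm(w k, lag(1 3)) two vce(robust) overid
    underid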



  • Mugi Jang
    replied
    Dear Professor Sebastian Kripfganz,
    In difference or system GMM, in addition to the Sargan-Hansen test of the overidentifying restrictions, is there any relevance test for the GMM-type or standard instruments?



  • Sebastian Kripfganz
    replied
    Indeed, large sample sizes make it easier to detect violations of the model assumptions. Unfortunately, in your case the difference-in-Hansen tests are not helpful for finding a particular violation. I am afraid further guidance needs to come from economic theory. For example, there might be relevant omitted variables, or it might be worth considering whether the timing of the effects is appropriate (i.e. whether contemporaneous regressors might be more appropriate than lagged regressors, or whether further lags might be relevant).



  • Mugi Jang
    replied
    Dear Sebastian Kripfganz

    I got the following results.
    The results pass the autocorrelation test but not the Hansen test, and they also do not pass any of the Sargan-Hansen (difference) tests of the overidentifying restrictions.
    You said in an earlier post that if the number of observations is very large, then the model is very sensitive to misspecification.
    Do I have to disregard this model and data, or is there any remedy for the rejection of the Sargan-Hansen test?
    Thanks a lot in advance.


    xtdpdgmm L(0/1).dlntfp L(1/1).dlnrnd L(1/1).dinvest L(1/1).dlnrcapital i.type, ///
        model(diff) gmmiv(L.dlntfp, lag(1 .)) gmmiv(L.dlnrnd, lag(1 .)) ///
        gmmiv(L.dinvest, lag(1 .)) gmmiv(L.dlnrcapital, lag(1 .)) ///
        gmmiv(L.dlntfp, model(l) lag(0 0) diff) gmmiv(L.dlnrnd, model(l) lag(0 0) diff) ///
        gmmiv(L.dinvest, model(l) lag(0 0) diff) gmmiv(L.dlnrcapital, model(l) lag(0 0) diff) ///
        iv(i.type, model(l)) vce(robust) teffects overid two


    Group variable: firmid Number of obs = 29827
    Time variable: year Number of groups = 4267

    Moment conditions: linear = 118 Obs per group: min = 4
    nonlinear = 0 avg = 6.990157
    total = 118 max = 7

    (Std. err. adjusted for 4,267 clusters in firmid)
    ------------------------------------------------------------------------------
    | WC-Robust
    dlntfp | Coefficient std. err. z P>|z| [95% conf. interval]
    -------------+----------------------------------------------------------------
    dlntfp |
    L1. | .4948371 .0166118 29.79 0.000 .4622785 .5273956
    |
    dlnrnd |
    L1. | .004618 .0017585 2.63 0.009 .0011713 .0080647
    |
    dinvest |
    L1. | -3.41e-07 9.33e-06 -0.04 0.971 -.0000186 .000018
    |
    dlnrcapital |
    L1. | -.0173935 .0027628 -6.30 0.000 -.0228086 -.0119785
    |
    _Itype_2 | -.0070249 .0048173 -1.46 0.145 -.0164667 .0024168
    _Itype_3 | -.0506892 .0107092 -4.73 0.000 -.0716788 -.0296995
    _Itype_4 | -.0626915 .0113263 -5.54 0.000 -.0848906 -.0404924
    |
    year |
    2016 | -.0126684 .0015735 -8.05 0.000 -.0157524 -.0095843
    2017 | -.0137031 .0016652 -8.23 0.000 -.0169667 -.0104394
    2018 | -.0061274 .0015763 -3.89 0.000 -.0092169 -.0030379
    2019 | -.017485 .0016935 -10.32 0.000 -.0208042 -.0141659
    2020 | -.0159248 .0017539 -9.08 0.000 -.0193625 -.0124872
    2021 | -.0141207 .001876 -7.53 0.000 -.0177975 -.0104439
    |
    _cons | .0527505 .010136 5.20 0.000 .0328844 .0726167
    ------------------------------------------------------------------------------
    Instruments corresponding to the linear moment conditions:
    1, model(diff):
    2016:L1.L.dlntfp 2017:L1.L.dlntfp 2018:L1.L.dlntfp 2019:L1.L.dlntfp
    2020:L1.L.dlntfp 2021:L1.L.dlntfp 2017:L2.L.dlntfp 2018:L2.L.dlntfp
    2019:L2.L.dlntfp 2020:L2.L.dlntfp 2021:L2.L.dlntfp 2018:L3.L.dlntfp
    2019:L3.L.dlntfp 2020:L3.L.dlntfp 2021:L3.L.dlntfp 2019:L4.L.dlntfp
    2020:L4.L.dlntfp 2021:L4.L.dlntfp 2020:L5.L.dlntfp 2021:L5.L.dlntfp
    2021:L6.L.dlntfp
    2, model(diff):
    2016:L1.L.dlnrnd 2017:L1.L.dlnrnd 2018:L1.L.dlnrnd 2019:L1.L.dlnrnd
    2020:L1.L.dlnrnd 2021:L1.L.dlnrnd 2017:L2.L.dlnrnd 2018:L2.L.dlnrnd
    2019:L2.L.dlnrnd 2020:L2.L.dlnrnd 2021:L2.L.dlnrnd 2018:L3.L.dlnrnd
    2019:L3.L.dlnrnd 2020:L3.L.dlnrnd 2021:L3.L.dlnrnd 2019:L4.L.dlnrnd
    2020:L4.L.dlnrnd 2021:L4.L.dlnrnd 2020:L5.L.dlnrnd 2021:L5.L.dlnrnd
    2021:L6.L.dlnrnd
    3, model(diff):
    2016:L1.L.dinvest 2017:L1.L.dinvest 2018:L1.L.dinvest 2019:L1.L.dinvest
    2020:L1.L.dinvest 2021:L1.L.dinvest 2017:L2.L.dinvest 2018:L2.L.dinvest
    2019:L2.L.dinvest 2020:L2.L.dinvest 2021:L2.L.dinvest 2018:L3.L.dinvest
    2019:L3.L.dinvest 2020:L3.L.dinvest 2021:L3.L.dinvest 2019:L4.L.dinvest
    2020:L4.L.dinvest 2021:L4.L.dinvest 2020:L5.L.dinvest 2021:L5.L.dinvest
    2021:L6.L.dinvest
    4, model(diff):
    2016:L1.L.dlnrcapital 2017:L1.L.dlnrcapital 2018:L1.L.dlnrcapital
    2019:L1.L.dlnrcapital 2020:L1.L.dlnrcapital 2021:L1.L.dlnrcapital
    2017:L2.L.dlnrcapital 2018:L2.L.dlnrcapital 2019:L2.L.dlnrcapital
    2020:L2.L.dlnrcapital 2021:L2.L.dlnrcapital 2018:L3.L.dlnrcapital
    2019:L3.L.dlnrcapital 2020:L3.L.dlnrcapital 2021:L3.L.dlnrcapital
    2019:L4.L.dlnrcapital 2020:L4.L.dlnrcapital 2021:L4.L.dlnrcapital
    2020:L5.L.dlnrcapital 2021:L5.L.dlnrcapital 2021:L6.L.dlnrcapital
    5, model(level):
    2016:D.L.dlntfp 2017:D.L.dlntfp 2018:D.L.dlntfp 2019:D.L.dlntfp
    2020:D.L.dlntfp 2021:D.L.dlntfp
    6, model(level):
    2016:D.L.dlnrnd 2017:D.L.dlnrnd 2018:D.L.dlnrnd 2019:D.L.dlnrnd
    2020:D.L.dlnrnd 2021:D.L.dlnrnd
    7, model(level):
    2016:D.L.dinvest 2017:D.L.dinvest 2018:D.L.dinvest 2019:D.L.dinvest
    2020:D.L.dinvest 2021:D.L.dinvest
    8, model(level):
    2016:D.L.dlnrcapital 2017:D.L.dlnrcapital 2018:D.L.dlnrcapital
    2019:D.L.dlnrcapital 2020:D.L.dlnrcapital 2021:D.L.dlnrcapital
    9, model(level):
    _Itype_2 _Itype_3 _Itype_4
    10, model(level):
    2016bn.year 2017.year 2018.year 2019.year 2020.year 2021.year
    11, model(level):
    _cons

    . estat serial, ar(1/3)

    Arellano-Bond test for autocorrelation of the first-differenced residuals
    H0: no autocorrelation of order 1 z = -20.4888 Prob > |z| = 0.0000
    H0: no autocorrelation of order 2 z = 1.3926 Prob > |z| = 0.1638
    H0: no autocorrelation of order 3 z = 1.2584 Prob > |z| = 0.2083

    . estat overid

    Sargan-Hansen test of the overidentifying restrictions
    H0: overidentifying restrictions are valid

    2-step moment functions, 2-step weighting matrix chi2(104) = 279.9075
    Prob > chi2 = 0.0000

    2-step moment functions, 3-step weighting matrix chi2(104) = 277.7027
    Prob > chi2 = 0.0000

    . estat overid, difference

    Sargan-Hansen (difference) test of the overidentifying restrictions
    H0: (additional) overidentifying restrictions are valid

    2-step weighting matrix from full model

    | Excluding | Difference
    Moment conditions | chi2 df p | chi2 df p
    ------------------+-----------------------------+-----------------------------
    1, model(diff) | 146.8224 83 0.0000 | 133.0851 21 0.0000
    2, model(diff) | 239.7621 83 0.0000 | 40.1453 21 0.0071
    3, model(diff) | 258.7313 83 0.0000 | 21.1761 21 0.4482
    4, model(diff) | 224.2280 83 0.0000 | 55.6795 21 0.0001
    5, model(level) | 254.1427 98 0.0000 | 25.7648 6 0.0002
    6, model(level) | 265.6027 98 0.0000 | 14.3047 6 0.0264
    7, model(level) | 273.3105 98 0.0000 | 6.5969 6 0.3597
    8, model(level) | 266.5003 98 0.0000 | 13.4072 6 0.0370
    9, model(level) | 272.5903 101 0.0000 | 7.3171 3 0.0624
    10, model(level) | 271.3848 98 0.0000 | 8.5227 6 0.2023
    model(diff) | 75.4602 20 0.0000 | 204.4472 84 0.0000
    model(level) | 194.3691 71 0.0000 | 85.5384 33 0.0000



    .



  • Sebastian Kripfganz
    replied
    The test statistic produced by your estat overid onelevel command is calculated as the difference of the individual Hansen test statistics from the two models, as you already observed. The right comparison in the output of estat overid, difference would be the very last row, where the difference test statistic is shown as 8.1042. These numbers do not coincide exactly, because for the estat overid, difference table the reduced model (without the level moment conditions) was computed using a restricted weighting matrix, taking the weighting matrix of the full model and leaving out the rows and columns for the level model. This weighting matrix differs from the one computed for the first model. The two test statistics are asymptotically equivalent, but not identical in finite samples.

    The benefit of using the restricted weighting matrix is that it does not require estimating both models, and that the test statistic is guaranteed to be non-negative. The advantage of comparing the Hansen tests from two separately estimated models is that it gives you full flexibility about which moment conditions you want to test without getting lost in the larger table produced by estat overid, difference.
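
    In command form, the two-model comparison described in the second paragraph looks like the following sketch (it simply mirrors the commands quoted in the post below):

    * Nested model: no level instruments for w and k.
    xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) gmm(w k, lag(1 3)) gmm(n, lag(1 1) diff model(level)) two vce(robust) overid
    estimates store onelevel
    * Full (nesting) model: adds the level instruments for w and k.
    xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) gmm(w k, lag(1 3)) gmm(n, lag(1 1) diff model(level)) gmm(w k, lag(0 0) diff model(level)) two vce(robust) overid
    * Difference of the two Hansen statistics, using the stored nested model.
    estat overid onelevel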



  • Mugi Jang
    replied
    Dear Professor Sebastian Kripfganz,
    You recommended that I try the manual difference-in-Hansen test outlined on slide 49 of your 2019 London Stata Conference presentation, so I tried the following.

    *comparison of one and two level equation
    1.nested model
    xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) gmm(w k, lag(1 3)) gmm(n, lag(1 1)diff model(level)) two vce(r) overid // gmm(w k, lag(0 0) diff model(level)) dropped
    estat overid
    estimates store onelevel

    2.nesting full model
    xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) gmm(w k, lag(1 3)) gmm(n, lag(1 1) diff model(level)) gmm(w k, lag(0 0) diff model(level)) two vce(r) overid
    estat overid
    estat overid,difference
    estat overid onelevel

    From model 1 (the nested model), the output is as follows:
    1, model(diff):
    L2.n L3.n L4.n
    2, model(diff):
    L1.w L2.w L3.w L1.k L2.k L3.k
    3, model(level):
    L1.D.n

    . estat overid

    Sargan-Hansen test of the overidentifying restrictions
    H0: overidentifying restrictions are valid
    2-step moment functions, 2-step weighting matrix chi2(7) = 8.2384
    Prob > chi2 = 0.3120
    2-step moment functions, 3-step weighting matrix chi2(7) = 6.6284
    Prob > chi2 = 0.4686


    From model 2 (the nesting full model), the output is as follows:

    1, model(diff):
    L2.n L3.n L4.n
    2, model(diff):
    L1.w L2.w L3.w L1.k L2.k L3.k
    3, model(level):
    L1.D.n
    4, model(level):
    D.w D.k
    5, model(level):
    _cons

    . estat overid
    Sargan-Hansen test of the overidentifying restrictions
    H0: overidentifying restrictions are valid

    2-step moment functions, 2-step weighting matrix chi2(9) = 16.1962
    Prob > chi2 = 0.0629

    2-step moment functions, 3-step weighting matrix chi2(9) = 13.8077
    Prob > chi2 = 0.1293

    estat overid,difference
    Sargan-Hansen (difference) test of the overidentifying restrictions
    H0: (additional) overidentifying restrictions are valid

    2-step weighting matrix from full model

    | Excluding | Difference
    Moment conditions | chi2 df p | chi2 df p
    ------------------+-----------------------------+-----------------------------
    1, model(diff) | 14.6666 6 0.0230 | 1.5296 3 0.6754
    2, model(diff) | 4.0234 3 0.2590 | 12.1728 6 0.0582
    3, model(level) | 15.8404 8 0.0447 | 0.3558 1 0.5509
    4, model(level) | 12.0861 7 0.0978 | 4.1102 2 0.1281
    model(diff) | 0.0000 0 . | 16.1962 9 0.0629
    model(level) | 8.0920 6 0.2314 | 8.1042 3 0.0439


    Finally, I used the xtdpdgmm postestimation command estat overid to compute the difference between the two nested models' overidentification test statistics.


    estat overid onelevel

    Sargan-Hansen difference test of the overidentifying restrictions
    H0: additional overidentifying restrictions are valid
    2-step moment functions, 2-step weighting matrix chi2(2) = 7.9578
    Prob > chi2 = 0.0187
    2-step moment functions, 3-step weighting matrix chi2(2) = 7.1793
    Prob > chi2 = 0.0276


    Here are my questions:
    Why does the value of the last Sargan-Hansen difference test (right above) not match the Sargan-Hansen (difference) test of the overidentifying restrictions from the nesting full model?

    7.9578 != 4.1102

    But it does match the difference between the two models' Sargan-Hansen tests of the overidentifying restrictions:

    16.1962 - 8.2384 = 7.9578

    Why can I not find this value (7.9578) in the nesting full model's "Excluding | Difference" output table?
    Also, after excluding moment conditions 4, the chi2 value for the nesting model in the "Excluding | Difference" output table is 12.0861, not 8.2384.

    Thanks a lot in advance.



  • Sebastian Kripfganz
    replied
    Filip Novinc
    1. If the difference-in-Hansen test fails for one set of instruments, it might be reasonable to just drop this set of instruments and continue with all of the other instruments. You might want to test for potential underidentification problems after dropping a set of instruments using the underid postestimation command.
    2. Which tests you report depends on the message you want to convey. A good strategy might be to report the overall Hansen test and a single difference-in-Hansen test for all level instruments jointly. If you want to be more specific about particular sets of instruments, you can also report the individual difference-in-Hansen tests.
    3. (a) is a robustness check against potential problems with estimating the optimal weighting matrix; this could be useful. (b) It is unclear to me what kind of robustness you are assessing here; this depends on your research question. (c) I am not sure what you would achieve with these comparisons. (d) It might be insightful to check the robustness to different lag lengths for the instruments.

    Mugi Jang
    1. Note that with model(fod), the contemporaneous variables (lag 0) already qualify as instruments (for exogenous and predetermined variables); see the sketch at the end of this post. The test results here all look fine.
    2. It is hard to say what is going on here. You might want to try keeping the instruments for the indicator variables for the first-differenced model, at least for testing purposes. You could also try the manual difference-in-Hansen test outlined on slide 49 of my 2019 London Stata Conference presentation; make sure that the smaller model is nested in the larger one!
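
    For the first point, a minimal illustrative sketch (placeholder variable names and lag ranges, not a prescription):

    * With model(fodev), lag 0 of a predetermined regressor already qualifies as
    * an instrument; compare gmm(w k, lag(0 3)) here with lag(1 3) under model(diff).
    xtdpdgmm L(0/1).n w k, model(fodev) collapse gmm(n, lag(1 4)) gmm(w k, lag(0 3)) two vce(robust) overid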



  • Mugi Jang
    replied
    Dear Professor Kripfganz,
    I am trying to measure the rate of return to R&D, so I built a dynamic panel data model:
    dlntfp (firm performance) = L.dlntfp + L.dlnrnd (firm R&D investment) + L.dinvest (firm physical investment) + L.dlnrcapital (firm capital stock) + i.type_num (firm type)
    I first estimated the model with model(fod) collapse nolevel, and in a second stage I added the level equation. At the second stage, the "Excluding" difference-in-Hansen test values changed from insignificant to significant. Why does this happen?
    If the results of difference GMM do not pass any one of the tests (the Arellano-Bond test for autocorrelation of the first-differenced residuals, the Sargan-Hansen test of the overidentifying restrictions, and the Sargan-Hansen (difference) test of the overidentifying restrictions), then can't we move to system GMM?
    Thanks a lot!



    1. only model(fod) not level equation

    xi: xtdpdgmm L(0/1).dlntfp L(1/1).dlnrnd L(1/1).dinvest L(1/1).dlnrcapital i.type_num, ///
        model(fod) collapse iv(i.type) gmmiv(L.dlntfp, lag(1 3)) gmmiv(L.dlnrnd, lag(1 3)) ///
        gmmiv(L.dinvest, lag(1 3)) gmmiv(L.dlnrcapital, lag(1 3)) vce(robust) teffects overid two
    i.type_num _Itype_num_1-4 (naturally coded; _Itype_num_1 omitted)
    i.type _Itypea1-4 (_Itypea1 for type==대기업 omitted)

    Generalized method of moments estimation


    Group variable: firmid Number of obs = 5690
    Time variable: year Number of groups = 815

    Moment conditions: linear = 22 Obs per group: min = 4
    nonlinear = 0 avg = 6.981595
    total = 22 max = 7

    (Std. err. adjusted for 815 clusters in firmid)
    ------------------------------------------------------------------------------
    | WC-Robust
    dlntfp | Coefficient std. err. z P>|z| [95% conf. interval]
    -------------+----------------------------------------------------------------
    dlntfp |
    L1. | .2366397 .1872553 1.26 0.206 -.130374 .6036533
    |
    dlnrnd |
    L1. | -.0956772 .0571342 -1.67 0.094 -.2076581 .0163037
    |
    dinvest |
    L1. | .0007188 .0010524 0.68 0.495 -.0013438 .0027815
    |
    dlnrcapital |
    L1. | -.0430342 .0297595 -1.45 0.148 -.1013618 .0152934
    |
    _Itype_num_2 | .0423446 .1142619 0.37 0.711 -.1816045 .2662937
    _Itype_num_3 | .0406287 .0307536 1.32 0.186 -.0196472 .1009046
    _Itype_num_4 | .087836 .0908781 0.97 0.334 -.0902819 .2659538
    |
    year |
    2016 | -.0333049 .0093917 -3.55 0.000 -.0517122 -.0148976
    2017 | -.0495779 .0079012 -6.27 0.000 -.065064 -.0340919
    2018 | -.0027143 .0064672 -0.42 0.675 -.0153898 .0099612
    2019 | -.0495071 .0129707 -3.82 0.000 -.0749292 -.024085
    2020 | .0116514 .009511 1.23 0.221 -.0069897 .0302926
    2021 | -.054457 .0174896 -3.11 0.002 -.0887359 -.0201781
    |
    _cons | -.0319817 .1018606 -0.31 0.754 -.2316247 .1676614
    ------------------------------------------------------------------------------
    Instruments corresponding to the linear moment conditions:
    1, model(fodev):
    L1.L.dlntfp L2.L.dlntfp L3.L.dlntfp
    2, model(fodev):
    L1.L.dlnrnd L2.L.dlnrnd L3.L.dlnrnd
    3, model(fodev):
    L1.L.dinvest L2.L.dinvest L3.L.dinvest
    4, model(fodev):
    L1.L.dlnrcapital L2.L.dlnrcapital L3.L.dlnrcapital
    5, model(fodev):
    _Itypea2 _Itypea3 _Itypea4
    6, model(level):
    2016bn.year 2017.year 2018.year 2019.year 2020.year 2021.year
    7, model(level):
    _cons

    . estat serial, ar(1/3)

    Arellano-Bond test for autocorrelation of the first-differenced residuals
    H0: no autocorrelation of order 1 z = -2.6375 Prob > |z| = 0.0084
    H0: no autocorrelation of order 2 z = -0.5151 Prob > |z| = 0.6065
    H0: no autocorrelation of order 3 z = 0.4108 Prob > |z| = 0.6812

    . estat overid

    Sargan-Hansen test of the overidentifying restrictions
    H0: overidentifying restrictions are valid

    2-step moment functions, 2-step weighting matrix chi2(8) = 9.4444
    Prob > chi2 = 0.3062

    2-step moment functions, 3-step weighting matrix chi2(8) = 11.7550
    Prob > chi2 = 0.1625

    . estat overid, difference

    Sargan-Hansen (difference) test of the overidentifying restrictions
    H0: (additional) overidentifying restrictions are valid

    2-step weighting matrix from full model

    | Excluding | Difference
    Moment conditions | chi2 df p | chi2 df p
    ------------------+-----------------------------+-----------------------------
    1, model(fodev) | 6.0013 5 0.3061 | 3.4431 3 0.3282
    2, model(fodev) | 7.4374 5 0.1901 | 2.0070 3 0.5709
    3, model(fodev) | 6.1205 5 0.2947 | 3.3239 3 0.3443
    4, model(fodev) | 5.8152 5 0.3246 | 3.6292 3 0.3044
    5, model(fodev) | 8.3917 5 0.1359 | 1.0527 3 0.7885
    6, model(level) | 4.0922 2 0.1292 | 5.3523 6 0.4995
    model(fodev) | . -7 . | . . .




    2. adding sys-gmm

    xtdpdgmm L(0/1).dlntfp L(1/1).dlnrnd L(1/1).dinvest L(1/1).dlnrcapital i.type_num, ///
        model(fod) collapse gmmiv(L.dlntfp, lag(1 .)) gmmiv(L.dlnrnd, lag(1 .)) ///
        gmmiv(L.dinvest, lag(1 .)) gmmiv(L.dlnrcapital, lag(1 .)) ///
        gmmiv(L.dlntfp, model(l) lag(0 0)) gmmiv(L.dlnrnd, model(l) lag(0 0)) ///
        gmmiv(L.dinvest, model(l) lag(0 0)) gmmiv(L.dlnrcapital, model(l) lag(0 0)) ///
        iv(i.type, model(l)) vce(robust) teffects overid
    i.type_num _Itype_num_1-4 (naturally coded; _Itype_num_1 omitted)
    i.type _Itypea1-4 (_Itypea1 for type==대기업 omitted)

    Generalized method of moments estimation



    Group variable: firmid Number of obs = 5690
    Time variable: year Number of groups = 815

    Moment conditions: linear = 34 Obs per group: min = 4
    nonlinear = 0 avg = 6.981595
    total = 34 max = 7

    (Std. err. adjusted for 815 clusters in firmid)
    ------------------------------------------------------------------------------
    | Robust
    dlntfp | Coefficient std. err. z P>|z| [95% conf. interval]
    -------------+----------------------------------------------------------------
    dlntfp |
    L1. | .6732203 .0246499 27.31 0.000 .6249074 .7215332
    |
    dlnrnd |
    L1. | .0140929 .0018071 7.80 0.000 .0105511 .0176346
    |
    dinvest |
    L1. | 5.76e-06 7.07e-06 0.82 0.415 -8.10e-06 .0000196
    |
    dlnrcapital |
    L1. | -.0123889 .0017524 -7.07 0.000 -.0158236 -.0089542
    |
    _Itype_num_2 | -.0007039 .0083043 -0.08 0.932 -.0169801 .0155723
    _Itype_num_3 | .0167453 .0080707 2.07 0.038 .000927 .0325635
    _Itype_num_4 | .0119466 .0081141 1.47 0.141 -.0039568 .0278499
    |
    year |
    2016 | -.0458018 .005804 -7.89 0.000 -.0571774 -.0344262
    2017 | -.0513295 .0055171 -9.30 0.000 -.0621428 -.0405162
    2018 | .0014926 .0048898 0.31 0.760 -.0080912 .0110764
    2019 | -.0680147 .0056909 -11.95 0.000 -.0791686 -.0568607
    2020 | .0193935 .0058651 3.31 0.001 .0078981 .0308889
    2021 | -.0765423 .0055467 -13.80 0.000 -.0874137 -.0656709
    |
    _cons | .0185271 .0081254 2.28 0.023 .0026016 .0344526
    ------------------------------------------------------------------------------
    Instruments corresponding to the linear moment conditions:
    1, model(fodev):
    L1.L.dlntfp L2.L.dlntfp L3.L.dlntfp L4.L.dlntfp L5.L.dlntfp
    2, model(fodev):
    L1.L.dlnrnd L2.L.dlnrnd L3.L.dlnrnd L4.L.dlnrnd L5.L.dlnrnd
    3, model(fodev):
    L1.L.dinvest L2.L.dinvest L3.L.dinvest L4.L.dinvest L5.L.dinvest
    4, model(fodev):
    L1.L.dlnrcapital L2.L.dlnrcapital L3.L.dlnrcapital L4.L.dlnrcapital
    L5.L.dlnrcapital
    5, model(level):
    L.dlntfp
    6, model(level):
    L.dlnrnd
    7, model(level):
    L.dinvest
    8, model(level):
    L.dlnrcapital
    9, model(level):
    _Itypea2 _Itypea3 _Itypea4
    10, model(level):
    2016bn.year 2017.year 2018.year 2019.year 2020.year 2021.year
    11, model(level):
    _cons

    . estat serial, ar(1/3)

    Arellano-Bond test for autocorrelation of the first-differenced residuals
    H0: no autocorrelation of order 1 z = -10.9716 Prob > |z| = 0.0000
    H0: no autocorrelation of order 2 z = 0.4605 Prob > |z| = 0.6451
    H0: no autocorrelation of order 3 z = 1.2161 Prob > |z| = 0.2239

    . estat overid

    Sargan-Hansen test of the overidentifying restrictions
    H0: overidentifying restrictions are valid

    1-step moment functions, 1-step weighting matrix chi2(20) = 68.5536
    note: * Prob > chi2 = 0.0000

    1-step moment functions, 2-step weighting matrix chi2(20) = 55.3697
    note: * Prob > chi2 = 0.0000

    * asymptotically invalid if the one-step weighting matrix is not optimal

    . estat overid, difference

    Sargan-Hansen (difference) test of the overidentifying restrictions
    H0: (additional) overidentifying restrictions are valid

    1-step weighting matrix from full model
    note: asymptotically invalid if the one-step weighting matrix is not optimal

    | Excluding | Difference
    Moment conditions | chi2 df p | chi2 df p
    ------------------+-----------------------------+-----------------------------
    1, model(fodev) | 56.1563 15 0.0000 | 12.3972 5 0.0297
    2, model(fodev) | 58.8869 15 0.0000 | 9.6666 5 0.0853
    3, model(fodev) | 53.7500 15 0.0000 | 14.8036 5 0.0112
    4, model(fodev) | 62.2399 15 0.0000 | 6.3137 5 0.2769
    5, model(level) | 55.4676 19 0.0000 | 13.0860 1 0.0003
    6, model(level) | 62.7131 19 0.0000 | 5.8405 1 0.0157
    7, model(level) | 61.7584 19 0.0000 | 6.7951 1 0.0091
    8, model(level) | 64.4376 19 0.0000 | 4.1159 1 0.0425
    9, model(level) | 59.0935 17 0.0000 | 9.4600 3 0.0238
    10, model(level) | 56.8654 14 0.0000 | 11.6881 6 0.0693
    model(fodev) | 0.3769 0 . | 68.1766 20 0.0000
    model(level) | 11.5289 7 0.1172 | 57.0247 13 0.0000

