  • Sebastian Kripfganz
    replied
    1. The Blundell/Bond system GMM estimator extends the Arellano/Bond difference GMM estimator by adding further moment conditions (i.e. instruments). If some of the instruments for the difference GMM estimator are invalid, they will still be invalid if you add further instruments. With xtdpdgmm you could use the overid option and then the estat overid, difference postestimation command after the system GMM estimation. The last line in the test output that starts with model(level) can be used to make the desired assessment. If the test in the column headed "Excluded" does not reject the null hypothesis, then the difference GMM estimator is fine and you can use the column headed "Difference" to test the additional instruments used for the system GMM estimator. If the test in the column headed "Excluded" rejects the null hypothesis, then the difference GMM estimator is misspecified and the corresponding "Difference" test becomes useless.
    2. Given homoskedasticity and no serial correlation of the idiosyncratic error term \(e_{it}\), this is a simple algebraic relationship: \(Corr(\Delta e_{it}, \Delta e_{i,t-1}) = Corr(e_{it}-e_{i,t-1}, e_{i,t-1}-e_{i,t-2}) = -Var(e_{it}) / Var(\Delta e_{it}) = -Var(e_{it}) / (2 Var(e_{it})) = -1/2\). Similarly, all higher-order correlations are zero because of the non-overlapping time periods in the numerator.
    3. There is no mapping of specific instruments to specific regressors. All instruments instrument all regressors. It is reasonable to believe that lags of a specific regressor have particularly strong predictive power for that specific regressor but that does not exclude the possibility that they may also have predictive power for other regressors. In fact, if a regressor is a predictor of the dependent variable, then it is reasonable to believe that the lags of such a regressor are also good predictors for the lagged dependent variable.
    4. If you assume that a variable is endogenous, you could use lag(2 .) as instruments if the model is correctly specified. If the difference-in-Hansen test rejects those instruments, then this is evidence that there is still some misspecification present. This could be omitted variables such as omitted dynamics in the form of lags of the regressors, or omitted interaction terms.
    5. In the terminology of (strictly) exogenous, predetermined, and endogenous regressors, all instruments (lags) that are valid for a predetermined variable are also valid for a strictly exogenous variable, but not the other way round.
    6. You want to start your specification search with a model that is correctly specified such that the estimation is consistent (although possibly inefficient). Otherwise, your difference-in-Hansen test might compare two misspecified models with each other which would not be a meaningful comparison; see point 1 above. The more lags of the regressors you include in the regressor list, the less likely it is that there will still be serial correlation in the error term which might invalidate some of the instruments.
    7. This is a suggestion for a model specification algorithm. Essentially, the idea is to start with a possibly overspecified model (that yields consistent estimation) and then to remove some of the lagged regressors if their coefficients are statistically insignificant and the model specification tests still do not reject the model after you have removed those regressors. Jan Kiviet promotes a conservative view on the use of p-values, i.e. to use p-value thresholds that are much higher than 0.05 to make sure that you are on the safe side.
    8. Instead of just testing for the significance of a single coefficient, you could also use joint significance tests for multiple coefficients in your specification search.
    9. I would say that there are at least 2 situations where a one-step estimator is justified: (i) if you are using the difference GMM estimator with the added homoskedasticity assumption such that the one-step weighting matrix is already optimal (which is a strong assumption, and instead of imposing it you might just run the two-step estimator to let the data speak for itself); (ii) if your estimation sample is relatively small, because the efficient estimation of the optimal weighting matrix requires a large number of groups. Both the one-step and the two-step estimator are consistent, but in general the two-step estimator is efficient while the one-step estimator may not be. However, keep in mind that efficiency is an asymptotic concept. When your sample is very small, the finite-sample properties might be very different and the estimation of the optimal weighting matrix might lack robustness.
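
The algebra in point 2 above can be written out step by step. This is only a sketch under the stated assumptions: \(e_{it}\) is homoskedastic and serially uncorrelated, and \(\sigma^2\) is notation introduced here for its variance.

```latex
\begin{align*}
\operatorname{Cov}(\Delta e_{it}, \Delta e_{i,t-1})
  &= \operatorname{Cov}(e_{it} - e_{i,t-1},\; e_{i,t-1} - e_{i,t-2})
   = -\operatorname{Var}(e_{i,t-1}) = -\sigma^2, \\
\operatorname{Var}(\Delta e_{it})
  &= \operatorname{Var}(e_{it}) + \operatorname{Var}(e_{i,t-1}) = 2\sigma^2, \\
\operatorname{Corr}(\Delta e_{it}, \Delta e_{i,t-1})
  &= \frac{-\sigma^2}{2\sigma^2} = -\frac{1}{2}, \\
\operatorname{Cov}(\Delta e_{it}, \Delta e_{i,t-s})
  &= \operatorname{Cov}(e_{it} - e_{i,t-1},\; e_{i,t-s} - e_{i,t-s-1}) = 0
   \quad \text{for } s \ge 2.
\end{align*}
```

Only the term \(e_{i,t-1}\) appears in both first differences, which is why a single variance survives in the first line. For \(s \ge 2\) the two differences share no time period, so every cross term is zero.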


  • Prateek Bedi
    replied
    Hi,

    In order to estimate dynamic panels accurately, I read the paper titled "Microeconometric dynamic panel data methods: Model specification and selection issues" by Jan F. Kiviet. Concerning this paper, I have the following doubts:

    1. The author repeatedly states in his paper that as long as the Arellano–Bond results are unsatisfactory, applying Blundell–Bond does not make sense. So how does one choose between Arellano-Bond difference GMM and Blundell-Bond system GMM? Is there a criterion for this? In this regard, the author also talks about the concepts of effect stationarity and effect non-stationarity. What do these concepts imply?

    2. The author states: "When the errors of the level equation are serially uncorrelated indeed, those of the first-differenced equation have negative first-order serial correlation of moving average form, with a first-order serial correlation coefficient −0.5 and zero second and higher-order serial correlation coefficients". How is this exact figure of -0.5 derived? Also, how is the author so sure about zero second and higher-order serial correlation coefficients? Is there a mathematical proof for the same?

    3. The author states: "lags of exogenous regressors will establish strong and valid instruments for any non-exogenous regressors, especially for regressors affected by immediate or lagged feedbacks from the dependent variable, in particular the lagged dependent regressor variables themselves." However, I thought a particular variable's lags/lead can serve as instruments for the same variable only. How come lags of exogenous variables serve as valid instruments of non-exogenous regressors?

    4. The author states: "Anyhow, if at least twice lagged regressors turn out to be invalid instruments this implies that the regression equation has not yet been specified adequately and requires additional explanatories". I could not understand the author's point here. Is he saying that if lag(1 2) turn out to be invalid instruments (as indicated by the difference-in-Hansen test), we should include more lags of the variable as regressors in the model?

    5. The author states: "an exogenous regressor is predetermined, but a predetermined regressors is usually not exogenous". I could not understand how an exogenous regressor is predetermined?

    6. The author writes: "This finding instigates to start our model specification search by including at least one lag of all regressors, because validity of internal instruments constructed from lagged not strictly exogenous regressors requires white-noise disturbances, and obtaining white noise disturbances is promoted by using sufficiently large orders of all lag polynomials." So should we include at least one lag of all independent variables as regressor?

    7. The author states: "one could move on to stage 4, or first verify whether any of the coefficients for the longest lag of a variable x(m) or of yi,t has a t-value below 0.5, say, or a p-value above 0.6 or 0.7, say. If so, impose the least significant one of them to be zero, re-estimate the model, and repeat the same procedure until the coefficients of all longest lags have absolute t -values (well) above 0.5, and the m1, m2, J and incJ tests still produce satisfactory results." I could not understand what the author is trying to convey here.

    8. The author states: "Useful additional evidence can be produced by also testing the joint significance of groups of single coefficient restrictions already imposed on the MSM and verifying whether the p -value is high indeed. Such joint significance tests can also be obtained by using the “test” option." Again, I could not understand the author's viewpoint here.

    9. I suppose we should always use the two-step estimator. Is this correct?

    Thanks and Regards


  • Edgar Kausel
    replied
    Sebastian Kripfganz

    That was it. I was using version 2.2.0. Thanks!

    Ed


  • Sebastian Kripfganz
    replied
    r(2000) is a "no observations" error. One reason might be that you did not properly xtset your data. For example, if your time periods are more than 1 time unit (e.g. year) apart, then you need to specify this with the delta() option of xtset. Another reason might be that you have many gaps (missing values) in your data set such that you do not have 3 consecutive time periods. Can you share with us the output from the following command?
    Code:
    xtdescribe
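
As a minimal sketch of the xtset check described above: the variable name year and the spacing of 2 are assumptions for illustration only (id is the cluster variable from the question), so substitute your own panel and time variables and the actual gap between your time periods.

```stata
* Assumed names: id is the panel identifier, year is the time variable.
* delta(2) is an assumed spacing: it declares that consecutive periods
* are 2 time units apart, so L.dv refers to the observation 2 years earlier.
xtset id year, delta(2)

* Show the participation pattern, which reveals gaps in the time series
xtdescribe
```

If the time variable is declared with the default spacing of 1 while the data are actually 2 units apart, every lag is missing and the estimation sample is empty, which would produce r(2000).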


  • Prashant Gupta
    replied
    Hi Sebastian and Statalisters,

    I am using the xtdpdgmm command to run system GMM, but I am getting error r(2000). It says: "You have requested some statistical calculation and there are no observations on which to perform it. Perhaps you specified if or in and inadvertently filtered all the data."

    N is 45 and T is 10.

    The command is xtdpdgmm dv l.dv iv1 iv2 iv3 iv4 , twostep vce(cluster id) teffects gmmiv(l.dv, lag(1 2) collapse model(fodev)) gmmiv(iv1 , lag(1 2) collapse model(fodev)) gmmiv(iv2 , lag(1 2) collapse model(fodev)) gmmiv(iv3 , lag(1 2) collapse model(fodev)) gmmiv(iv4 , lag(0 0) collapse model(level)) nofootnote

    Please help.


  • Sebastian Kripfganz
    replied
    Edgar Kausel
    There was a bug in estat serial that (I thought) I fixed with the latest update. Could you please tell me which version of xtdpdgmm you are using? You can find your version by typing the following in Stata's command window:
    Code:
    which xtdpdgmm
    If you do not have version 2.2.7, please update to the latest version which should hopefully solve your problem:
    Code:
    adoupdate xtdpdgmm, update


  • Edgar Kausel
    replied
    Hi Sebastian,
    I'm having problems when conducting the Arellano-Bond test for autocorrelation.

    First, I go with:

    Code:
    xtdpdgmm lead_zjsat_6items  L.lead_jsat_6items i.lead_vol1##c.log_leaving i.wave , gmmiv(L.lead_jsat_6items , collapse) iv(i.wave) vce(robust) overid
    
    Group variable: id                           Number of obs         =     91850
    Time variable: wave                          Number of groups      =     15287
    
    Moment conditions:     linear =      28      Obs per group:    min =         1
                        nonlinear =       0                        avg =  6.008373
                            total =      28                        max =        14
    
                                        (Std. Err. adjusted for 15,287 clusters in id)
    ----------------------------------------------------------------------------------
                     |               Robust
    lead_zjsat_6it~s |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -----------------+----------------------------------------------------------------
    lead_jsat_6items |
                 L1. |   .3542155   .1051563     3.37   0.001     .1481129    .5603181
                     |
         1.lead_vol1 |  -7.993102   4.104824    -1.95   0.052    -16.03841    .0522057
         log_leaving |  -.7183377    .343703    -2.09   0.037    -1.391983   -.0446922
                     |
           lead_vol1#|
       c.log_leaving |
                  1  |   3.885735   .7894465     4.92   0.000     2.338448    5.433021
                     |
                wave |
                  1  |          0  (empty)
                  2  |   .0678619    .049292     1.38   0.169    -.0287486    .1644724
                  3  |   .0074792   .0487522     0.15   0.878    -.0880733    .1030317
                  4  |          0  (omitted)
                  5  |   .0847805   .0469901     1.80   0.071    -.0073183    .1768793
                  6  |   .0679861   .0464155     1.46   0.143    -.0229866    .1589588
                  7  |   .0552715   .0479401     1.15   0.249    -.0386893    .1492324
                  8  |   .1307863   .1192254     1.10   0.273    -.1028912    .3644638
                  9  |   .0752563    .076418     0.98   0.325    -.0745202    .2250328
                 10  |   .0612896   .0672975     0.91   0.362    -.0706111    .1931904
                 11  |   .0365584   .0620174     0.59   0.556    -.0849935    .1581103
                 12  |    .102438   .0816722     1.25   0.210    -.0576366    .2625126
                 13  |   .1125361   .0718052     1.57   0.117    -.0281995    .2532717
                 14  |   .0727399   .0759155     0.96   0.338    -.0760518    .2215315
                 15  |   .1186052   .0606086     1.96   0.050    -.0001854    .2373958
                 16  |          0  (empty)
                     |
               _cons |   -1.87803   1.056312    -1.78   0.075    -3.948363    .1923031
    ----------------------------------------------------------------------------------
    Instruments corresponding to the linear moment conditions:
     1, model(level):
       L.lead_jsat_6items L1.L.lead_jsat_6items L2.L.lead_jsat_6items
       L3.L.lead_jsat_6items L4.L.lead_jsat_6items L5.L.lead_jsat_6items
       L6.L.lead_jsat_6items L7.L.lead_jsat_6items L8.L.lead_jsat_6items
       L9.L.lead_jsat_6items L10.L.lead_jsat_6items L11.L.lead_jsat_6items
       L12.L.lead_jsat_6items L13.L.lead_jsat_6items
     2, model(level):
       3bn.wave 4.wave 5.wave 6.wave 7.wave 8.wave 9.wave 10.wave 11.wave 12.wave
       13.wave 14.wave 15.wave
     3, model(level):
       _cons
    But when I try the test I get:

    Code:
    estat serial, ar(1/3)
    
    Arellano-Bond test for autocorrelation of the first-differenced residuals
    D.0b:  operator invalid
    r(198);
    Am I doing something wrong?

    Thanks.
    Ed



  • Sebastian Kripfganz
    replied
    The R² for IV/2SLS/GMM regressions is of limited to no use. See for example the following Stata FAQ:
    https://www.stata.com/support/faqs/s...least-squares/
    "the R2 really has no statistical meaning in the context of 2SLS/IV"
    For the random-effects model, please see the Remarks and Examples section in the Stata Manual entry for xtreg.


  • Prateek Bedi
    replied
    Hi,

    I would like to understand why R-Squared is not calculated/reported in:

    1. GMM regressions?
    2. System-GMM regressions?
    3. Dynamic Panel Regressions?

    Further, is it meaningful to interpret R-squared in instrumental-variable regressions, such as that reported by ivreg2? Also, is it meaningful to interpret R-squared in a random-effects model, such as that reported by xtreg, re?

    Thanks!


  • Sebastian Kripfganz
    replied
    Whether the coefficient of the lagged dependent variable is statistically significant should usually not determine whether you accept the model. Such an approach would come close to p-hacking.

    Among your three models, only the third would raise immediate concerns based on the given information. If there is higher-order serial correlation as indicated by the Arellano-Bond test, this would cause some of the instruments to be invalid. This could possibly be addressed by adding further lags of the dependent variable and the regressors to the model as regressors (not instruments).
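
A rough sketch of what adding lags as regressors (rather than as instruments) could look like. The variable names y and x and the lag ranges are hypothetical and only illustrate the idea; the data are assumed to be xtset already.

```stata
* Hypothetical variables: y (dependent variable), x (a predetermined regressor).
* L(0/2).y makes y the dependent variable and adds L.y and L2.y as regressors;
* L(0/1).x adds x and its first lag as regressors.
* The instruments are specified separately inside gmm().
xtdpdgmm L(0/2).y L(0/1).x, model(diff) collapse gmm(y, lag(2 4)) gmm(x, lag(1 3)) twostep vce(robust)

* Re-check the specification tests after changing the regressor list
estat serial, ar(1/3)
estat overid
```

The point of the comparison is whether the higher-order serial correlation found in the smaller model disappears once the extra dynamics are included.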

    I would recommend having a look at the section on Model Selection in my 2019 London Stata Conference presentation.


  • Muhammad Ahmad
    replied
    Thank you so much for your reply. I would like to ask another important question:

    How important is the significance of the lagged dependent variable for accepting the model?

    In another study, I am estimating 3 models with GMM (the dependent variable has 3 proxies) and am facing the following problem:
    1. 1st model (lag of dependent variable (Sig) + Sargan-Hansen (Insig) + Arellano-Bond serial correlation (Insig))
    2. 2nd model (lag of dependent variable (Insig) + Sargan-Hansen (Insig) + Arellano-Bond serial correlation (Insig))
    3. 3rd model (lag of dependent variable (Sig) + Sargan-Hansen (Sig) + Arellano-Bond serial correlation (Sig))

    However, estat endog indicates that endogeneity exists between the variables. I tried to overcome the issue by increasing or decreasing the lags of the dependent and independent variables, but the problem persists. Please advise: should I change my estimation method?


  • Sebastian Kripfganz
    replied
    Your specification and results generally appear all right, assuming that your independent variables are treated as predetermined. If they were endogenous, you should use instruments starting from lag 2 only. If they were strictly exogenous, you could even use lag 0 as an instrument.

    Another commonly applied specification test is the Arellano-Bond serial correlation test, estat serial. There, you would want the AR(1) test to reject the null hypothesis and the AR(2), AR(3), ... tests to not reject the null hypothesis.
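
To make the lag choices above concrete, here is a hedged sketch for difference GMM with a single regressor. The names y and x and the upper lag limits are hypothetical; only the starting lag follows from the classification of x.

```stata
* Hypothetical variables: y (dependent variable), x (one regressor).
* Predetermined x: instruments start at lag 1.
xtdpdgmm L(0/1).y x, model(diff) collapse gmm(y, lag(2 4)) gmm(x, lag(1 3)) twostep vce(robust)

* Alternatives for the second gmm() option, depending on the assumption:
*   endogenous x:         gmm(x, lag(2 4))   (start at lag 2)
*   strictly exogenous x: gmm(x, lag(0 2))   (lag 0 is also valid)

* Arellano-Bond test: AR(1) should reject, AR(2) and AR(3) should not
estat serial, ar(1/3)
```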


  • Muhammad Ahmad
    replied
    Dear Sebastian,
    I am estimating my research model with two-step difference GMM. I am a research student in finance and not very familiar with econometrics.
    I first tried to estimate my model with xtabond2 and then learned that xtdpdgmm provides more robust results than xtabond2.
    ACP is the dependent variable.
    APP, CFV, LEV, CH, TAT, FS, and SG are the independent and control variables.
    I am using the command below:
    Code:
    xtdpdgmm L(0/1).ACP APP CFV LEV CH TAT FS SG y_1-y_17, model(diff) collapse gmm(ACP, lag(2 4)) gmm(CFV LEV CH TAT FS SG, lag(1 3)) iv(y_1-y_17, diff) two vce(r)
    The results are below:

    [Attached image: Stata Results1.JPG]


    Below are the results for
    Code:
    estat overid
    [Attached image: Stata Results2.JPG]
    Please advise whether this is the right command and whether the results are suitable for interpretation. As far as I know, the results should show a significant lag of the dependent variable and insignificant Sargan-Hansen tests. Thank you.


  • Sebastian Kripfganz
    replied
    A new update of xtdpdgmm to version 2.2.7 is now available both on my own website and on SSC (thanks to Kit Baum):
    Code:
    adoupdate xtdpdgmm, update
    Besides some minor bug fixes and improvements under the hood, I reorganized and expanded the Remarks section of the help file, in part to address the feedback from Joseph L. Staats.

    For those who missed the announcement: Mark Schaffer's new underid and overid commands for underidentification and overidentification tests are now on SSC as well. Both of them work as postestimation commands after xtdpdgmm as demonstrated (for underid) in my 2019 London Stata Conference presentation.


  • Joseph L. Staats
    replied
    Sebastian,

    Once again, thank you for your reply.

    To close out this conversation, I wish to give you a shout-out for your fine work in creating and updating xtdpdgmm. I really enjoy using the program and especially appreciate the flexibility and wide range of options it provides. Thanks also for the many contributions you have made to Statalist. I have read every one of them and have gained a much better knowledge of GMM because of them.
