  • #31
    Originally posted by Alex Mai
    But in what situation or for what kind of variables, should I use -iv(x, eq(diff))-? You have argued that -iv(x, eq(level))- makes -iv(x, eq(diff))- asymptotically redundant.
    My comment mainly referred to the gmm() option that creates lagged levels as instruments for the first-differenced model and differences as instruments for the level model. Here, the latter do not turn the former redundant.

    Originally posted by Alex Mai
    And can I understand the Hansen test for the first-differenced model (the very first part in the Difference-in-Hansen test) as the test for the validity of lagged dependent and endogenous variables as instruments for the first-differenced dependent and endogenous variables in the first-differenced equation?
    In short, yes. But keep in mind that, strictly speaking, the overidentification tests are not just tests of instrument validity. If you reject the null hypothesis, this might be because your instruments are indeed invalid in an otherwise correctly specified model, or because your model suffers from another form of misspecification.
    https://twitter.com/Kripfganz



    • #32
      Originally posted by Sebastian Kripfganz
      My comment mainly referred to the gmm() option that creates lagged levels as instruments for the first-differenced model and differences as instruments for the level model. Here, the latter do not turn the former redundant.
      • On the other hand, as mentioned on several occasions, split the instruments for the differenced model and those for the level model into separate groups.
      So do you mean that for -gmm()- it is also recommended to split into -gmm( eq(level))- and -gmm( eq(diff))-? What I learned from earlier posts is that eq(level) and eq(diff) should be set separately for -iv()-, but I am not sure about -gmm()-.



      • #33
        Not necessarily. xtabond2 computes the respective Difference-in-Hansen tests even when you do not separate them. But separating the options helps to understand how the GMM estimator is actually constructed. (It becomes less of a black box.)
        https://twitter.com/Kripfganz
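
To make the point in #33 concrete, here is a minimal sketch of one system GMM specification written twice: once with a single gmm() option, and once with the GMM-type instruments split by equation. It assumes xtabond2 is installed (for example via ssc install xtabond2) and uses Stata's abdata example dataset with the variables n, w, and k purely for illustration; neither the data nor the variable choice comes from this thread. Treating w and k as valid instruments for the level equation is an assumption made only to keep the example short. The lag(0 0) choice for the level equation is meant to mirror the D.L. instrument that xtabond2 generates by default; comparing the "Instruments for ..." footers of the two runs is a quick way to check that the two commands use the same instruments.

```stata
* Illustrative sketch: abdata and the regressors n, w, k are assumptions,
* not the data discussed in this thread. Run as a do-file (uses /// continuation).
webuse abdata, clear
xtset id year

* (a) One gmm() option: xtabond2 builds lagged levels for the
*     first-differenced equation and a lagged difference for the level equation.
xtabond2 n L.n w k, gmm(L.n, lag(1 3)) iv(w k, eq(level)) twostep robust

* (b) The GMM-type instruments spelled out equation by equation.
xtabond2 n L.n w k, gmm(L.n, lag(1 3) eq(diff)) gmm(L.n, lag(0 0) eq(level)) ///
    iv(w k, eq(level)) twostep robust
```

In version (b), the Difference-in-Hansen output is reported for each listed instrument group, which is what makes the construction of the estimator easier to follow.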



        • #34
          Originally posted by Sebastian Kripfganz
          Not necessarily. xtabond2 computes the respective Difference-in-Hansen tests even when you do not separate them. But separating the options helps to understand how the GMM estimator is actually constructed. (It becomes less of a black box.)
          Thanks a lot! Btw, do you think that xtabond2 is suitable for a (dynamic) linear probability model with a binary dependent variable? As far as I know, there is perhaps no Stata command for a dynamic logit model. I have seen xtabond2 used to estimate models with a binary dependent variable in a dynamic setting.

          One problem with the linear probability model is the possibility of negative fitted values, but studies have shown that, except in extreme situations (e.g. probabilities like 99% or 1%), the odds ratios are an almost linear function of the probability, which supports the use of the linear probability model. But I am not sure if this also holds for system GMM.
          Last edited by Alex Mai; 17 Apr 2018, 13:10.



          • #35
            A model is one thing, an estimator is another. There are pros and cons for a linear probability model, as you have mentioned. System GMM is then just the estimator. That said, there might be a higher risk that lagged levels are weak instruments for the differences of a binary variable (and vice versa) but this depends on the particular characteristics of the data.
            https://twitter.com/Kripfganz



            • #36
              For example, if the Arellano-Bond AR(2) test does not reject the null hypothesis of no second-order serial correlation of the first-differenced errors, then you usually need not separately justify the lagged levels of the dependent variable as instruments for the first-differenced model. In contrast, the difference-in-Hansen test for the level instruments is informative because it helps to evaluate whether the Blundell-Bond mean stationarity assumption might be violated.
              Dear Sebastian,

              With respect to your quoted advice, can I take it to mean that a model is still valid and dynamically complete if the AR(2) test rejects the null but the Hansen test for the lagged dependent variable in levels as an instrument for the first-differenced model does not reject the null?

              I have encountered the following situation, in which the AR(2) test rejects the null but none of the Difference-in-Hansen tests does. Following your previous suggestions, I have tried adding the second lag of the dependent variable as a regressor, but it hardly improves the p-value of the AR(2) test.

              Code:
              ------------------------------------------------------------------------------
              Arellano-Bond test for AR(1) in first differences: z =  -3.12  Pr > z =  0.002
              Arellano-Bond test for AR(2) in first differences: z =  -1.95  Pr > z =  0.051
              ------------------------------------------------------------------------------
              Sargan test of overid. restrictions: chi2(22)   =  80.86  Prob > chi2 =  0.000
                (Not robust, but not weakened by many instruments.)
              Hansen test of overid. restrictions: chi2(22)   =  24.89  Prob > chi2 =  0.302
                (Robust, but weakened by many instruments.)
              
              Difference-in-Hansen tests of exogeneity of instrument subsets:
                GMM instruments for levels
                  Hansen test excluding group:     chi2(14)   =  17.86  Prob > chi2 =  0.213
                  Difference (null H = exogenous): chi2(8)    =   7.03  Prob > chi2 =  0.534
                iv(x1, eq(level))
                  Hansen test excluding group:     chi2(21)   =  23.92  Prob > chi2 =  0.297
                  Difference (null H = exogenous): chi2(1)    =   0.98  Prob > chi2 =  0.323
                iv(x2 x3, eq(level))
                  Hansen test excluding group:     chi2(20)   =  21.83  Prob > chi2 =  0.350
                  Difference (null H = exogenous): chi2(2)    =   3.06  Prob > chi2 =  0.217
                iv(x4, eq(level))
                  Hansen test excluding group:     chi2(21)   =  22.48  Prob > chi2 =  0.372
                  Difference (null H = exogenous): chi2(1)    =   2.41  Prob > chi2 =  0.121
                iv(dummy_1 year4 year5 year6 year7 year8 year9 year10, eq(level))
                  Hansen test excluding group:     chi2(13)   =  18.02  Prob > chi2 =  0.157
                  Difference (null H = exogenous): chi2(9)    =   6.87  Prob > chi2 =  0.650
              Thank you!
              Last edited by Alex Mai; 20 Apr 2018, 16:04.



              • #37
                In a very strict sense, the AR(2) test does not reject the null hypothesis at the 5% significance level. Of course, such a marginal non-rejection is hard to defend.

                If the AR(2) test rejects the null but the respective (Difference-in-)Hansen test does not, the two results contradict each other. However, such conflicting test results can happen. Remember that a correctly sized test still rejects the null hypothesis at the 5% significance level in 5% of cases even when the null hypothesis is true. Conversely, a test may fail to reject the null hypothesis even though it is false.

                You can either look for a more robust specification or make a judgement on whether you trust the joint evidence of the two tests in favor of the null hypothesis.
                https://twitter.com/Kripfganz
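
As a quick check on how marginal the AR(2) result in #36 is, and on how the Difference-in-Hansen statistic is formed, the reported numbers can be plugged into Stata's built-in normal() and chi2tail() functions. This is only a sketch that reuses the rounded statistics printed in #36, so the results agree with the output only up to rounding (roughly 0.051 for the AR(2) test and roughly 0.53 for the difference test).

```stata
* Two-sided p-value of the AR(2) statistic reported in #36 (z = -1.95)
display 2*normal(-abs(-1.95))

* Difference-in-Hansen statistic for the "GMM instruments for levels" group:
* full-model Hansen statistic minus the statistic excluding the group,
* with degrees of freedom 22 - 14 = 8
display 24.89 - 17.86
display chi2tail(22 - 14, 24.89 - 17.86)
```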



                • #38
                  Thank you so much once more! Since I prefer to use a unified specification of regressors for all regressions on a particular dependent variable, I think I will have to make a judgement on whether the conflicting evidence of AR(2) and Difference-in-Hansen supports the model validity.



                  • #39
                    I also have some questions about the "Hansen test excluding group" for "GMM instruments for levels". I fail to reject the null of the "Arellano-Bond test for AR(2) in first differences", which I take to mean that the idiosyncratic error terms are not serially correlated. But the null of the "Hansen test excluding group" for "GMM instruments for levels" is rejected. Is it correct that this Hansen test evaluates the joint validity of the GMM instruments used for the level equation? What are the GMM instruments used for the level equation if the forward orthogonal deviations option is specified? When the null is rejected, should I drop the GMM instruments for the level equation and use only the GMM instruments for the transformed equation, i.e., add eq(diff) in gmm()? But when I do that, I reject the null of the Hansen test of overidentifying restrictions. Please see the regression output below. Thanks for your help.

                    Original estimator:
                    . xtabond2 ln_avg_price_qtr_cust l.ln_avg_price_qtr_cust ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_cheese qtr_2 qtr_3 qtr_4, gmm(l.ln_avg_price_qtr_cust, lag(1 3)) iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_cheese qtr_2 qtr_3 qtr_4 ) twostep robust orthogonal small
                    Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
                    Warning: Two-step estimated covariance matrix of moments is singular.
                    Using a generalized inverse to calculate optimal weighting matrix for two-step estimation.
                    Difference-in-Sargan/Hansen statistics may be negative.

                    Dynamic panel-data estimation, two-step system GMM

                    Group variable: rank_cheese                     Number of obs      =      6594
                    Time variable : yq                              Number of groups   =       200
                    Number of instruments = 196                     Obs per group: min =         5
                    F(7, 199)     =    621.12                                      avg =     32.97
                    Prob > F      =     0.000                                      max =        49
                    ---------------------------+---------------------------------------------------------------
                                               |              Corrected
                         ln_avg_price_qtr_cust |      Coef.  Std. Err.        t   P>|t|    [95% Conf. Interval]
                    ---------------------------+---------------------------------------------------------------
                         ln_avg_price_qtr_cust |
                                           L1. |   .8525751   .0302117    28.22   0.000    .7929989    .9121514
                                               |
                    ln_qtr_act_milk_production |  -1.294157   .0899475   -14.39   0.000    -1.47153   -1.116785
                       ln_qtr_pers_disp_income |   1.312546   .0885666    14.82   0.000    1.137897    1.487195
                      ln_qtr_cost_index_cheese |   .0443752   .0137362     3.23   0.001    .0172881    .0714623
                                         qtr_2 |   .0930413   .0063891    14.56   0.000    .0804422    .1056405
                                         qtr_3 |   .0633145   .0032028    19.77   0.000    .0569987    .0696303
                                         qtr_4 |    .013382   .0041955     3.19   0.002    .0051086    .0216553
                                         _cons |   17.92212   1.447264    12.38   0.000    15.06818    20.77606
                    ---------------------------+---------------------------------------------------------------

                    Instruments for orthogonal deviations equation
                    Standard
                    FOD.(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly
                    ln_qtr_cost_index_cheese qtr_2 qtr_3 qtr_4)
                    GMM-type (missing=0, separate instruments for each period unless collapsed)
                    L(1/3).L.ln_avg_price_qtr_cust
                    Instruments for levels equation
                    Standard
                    ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly
                    ln_qtr_cost_index_cheese qtr_2 qtr_3 qtr_4
                    _cons
                    GMM-type (missing=0, separate instruments for each period unless collapsed)
                    D.L.ln_avg_price_qtr_cust

                    Arellano-Bond test for AR(1) in first differences: z = -3.84 Pr > z = 0.000
                    Arellano-Bond test for AR(2) in first differences: z = 1.56 Pr > z = 0.118

                    Sargan test of overid. restrictions: chi2(188) =2200.52 Prob > chi2 = 0.000
                    (Not robust, but not weakened by many instruments.)
                    Hansen test of overid. restrictions: chi2(188) = 199.26 Prob > chi2 = 0.273
                    (Robust, but weakened by many instruments.)

                    Difference-in-Hansen tests of exogeneity of instrument subsets:
                    GMM instruments for levels
                    Hansen test excluding group: chi2(140) = 194.66 Prob > chi2 = 0.002
                    Difference (null H = exogenous): chi2(48) = 4.61 Prob > chi2 = 1.000
                    iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_cheese qtr_2 qtr_3 qtr_4)
                    Hansen test excluding group: chi2(182) = 198.93 Prob > chi2 = 0.185
                    Difference (null H = exogenous): chi2(6) = 0.33 Prob > chi2 = 0.999

                    Last edited by Pian Chen; 19 Dec 2018, 09:43.



                    • #40
                      The "Hansen test excluding group" for the "GMM instruments for levels" evaluates the instruments for the transformed model only. Notice that the Hansen test is "weakened by many instruments". In your case, 196 instruments are clearly too many relative to your sample size. Especially given your highly unbalanced panel, I recommend using the collapse option.

                      Do you expect the option iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_cheese qtr_2 qtr_3 qtr_4) to generate separate instruments for the transformed and the level model? This is not what it is doing! You would have to specify them separately by using the equation() suboption. Moreover, using the iv() option together with the orthogonal option is potentially dangerous because of the way forward-orthogonal deviations are implemented in xtabond2. For more information, see:
                      XTDPDGMM: new Stata command for efficient GMM estimation of linear (dynamic) panel models with nonlinear moment conditions

                      https://twitter.com/Kripfganz
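
The two recommendations in #40 (collapse the GMM-type instruments, and specify iv() separately for each equation with the equation() suboption) can be sketched as follows. The example uses Stata's abdata dataset and the variables n, w, and k as stand-ins; these are assumptions for illustration and not the poster's data. It deliberately omits the orthogonal option, given the warning above about combining it with iv(). Whether w and k are valid instruments in both equations is likewise assumed here, not established.

```stata
* Illustrative sketch on assumed example data; run as a do-file.
webuse abdata, clear
xtset id year

* Uncollapsed: one GMM-type instrument per lag and time period
xtabond2 n L.n w k, gmm(L.n, lag(1 3)) ///
    iv(w k, eq(diff)) iv(w k, eq(level)) twostep robust small

* Collapsed: one GMM-type instrument per lag
xtabond2 n L.n w k, gmm(L.n, lag(1 3) collapse) ///
    iv(w k, eq(diff)) iv(w k, eq(level)) twostep robust small
```

Comparing the "Number of instruments" line in the two output headers shows how much the collapse suboption reduces the instrument count.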



                      • #41
                        Thanks, Sebastian! The xtabond2 documentation is very confusing. I tried the following three commands (they differ only in where equation(both) is specified) and got identical results. I do not quite understand the way xtabond2 generates instruments for the transformed model and the level model. Would you please shed some light on this? I also have no idea about the potential danger caused by using iv() with forward-orthogonal deviations. I guess it is probably quite technical, but I would like to understand it better so that I am not using the command blindly.

                        xtabond2 ln_avg_price_qtr_cust l.ln_avg_price_qtr_cust ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, gmm(l.ln_avg_price_qtr_cust , lag(1 3)) iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4 ) twostep robust orthogonal small

                        xtabond2 ln_avg_price_qtr_cust l.ln_avg_price_qtr_cust ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, gmm(l.ln_avg_price_qtr_cust , lag(1 3)) iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, equation(both)) twostep robust orthogonal small

                        xtabond2 ln_avg_price_qtr_cust l.ln_avg_price_qtr_cust ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, gmm(l.ln_avg_price_qtr_cust, lag(1 3) equation(both)) iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, equation(both)) twostep robust orthogonal small


                        I have also tried your command and collapsed the GMM instruments. If I want to use gmmiv() and standard iv() instruments in both the transformed and level models (system GMM), what should I do? I could not get the code to run if I use "m(fodev|level)" or "m(fodev level)".

                        xtdpdgmm ln_avg_price_qtr_cust l.ln_avg_price_qtr_cust ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, gmmiv(l.ln_avg_price_qtr_cust, lag(1 3) collapse m(fodev)) iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, m(fodev)) twostep vce(robust)

                        xtdpdgmm ln_avg_price_qtr_cust l.ln_avg_price_qtr_cust ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, gmmiv(l.ln_avg_price_qtr_cust, lag(1 3) collapse m(level)) iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, m(level)) twostep vce(robust)

                        Thank you so much!



                        • #42
                          When using xtabond2, the suboption equation(both) is equivalent to not typing anything. But notice that it is not equivalent to the combination of the following:
                          iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, equation(diff))
                          iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, equation(level))

                          If you are surprised that this yields different results, then what you want to get is probably the version with separately specified instruments for the two equations.

                          With xtdpdgmm, you cannot specify both arguments at once, m(fodev level). Similar to the above two lines, you would need to jointly specify
                          iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, m(fodev))
                          iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, m(level))


                          That said, once you have specified those standard instruments for the model in levels, the corresponding instruments for the transformed model become redundant. This is not the case for GMM-type instruments if they are differenced for the model in levels (which is not done by default for standard instruments). Notice further that an underlying assumption of these standard instruments for the level model is that they are all uncorrelated with the idiosyncratic error term and the unit-specific error component (the "fixed effects"). This is a strong assumption which might be difficult to justify in some cases.

                          The issue with the implementation of forward-orthogonal deviations in xtabond2 is indeed quite technical. I cannot really say more about it than what I was saying in the post that you can reach by clicking on the link in my post #40 above.
                          https://twitter.com/Kripfganz
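
A minimal sketch of the xtdpdgmm specification described in #42, with instruments declared separately for the forward-orthogonal-deviations model and the level model. It assumes xtdpdgmm is installed and again uses Stata's abdata dataset with n, w, and k as placeholder variables, which are not from this thread. The difference suboption on the level-model gmmiv() is used here to request differenced instruments for the level model, in line with the remark that this is not done by default; the standard instruments are listed for the level model only, since the corresponding instruments for the transformed model would then be redundant. The lag(0 0) range for the level model is an assumption chosen to mirror the xtabond2 default.

```stata
* Illustrative sketch on assumed example data; run as a do-file.
webuse abdata, clear
xtset id year

xtdpdgmm n L.n w k, ///
    gmmiv(L.n, lag(1 3) collapse model(fodev)) ///
    gmmiv(L.n, lag(0 0) difference collapse model(level)) ///
    iv(w k, model(level)) ///
    twostep vce(robust)

* Postestimation checks: serial correlation and overidentification tests
estat serial
estat overid
```

The assumption that w and k are uncorrelated with the unit-specific error component is built into iv(w k, model(level)) and would need to be justified for real data.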



                          • #43
                            Originally posted by Alex Mai

                            Thanks a lot! Btw, do you think that xtabond2 is suitable for a (dynamic) linear probability model with a binary dependent variable? As far as I know, there is perhaps no Stata command for a dynamic logit model. I have seen xtabond2 used to estimate models with a binary dependent variable in a dynamic setting.

                            One problem with the linear probability model is the possibility of negative fitted values, but studies have shown that, except in extreme situations (e.g. probabilities like 99% or 1%), the odds ratios are an almost linear function of the probability, which supports the use of the linear probability model. But I am not sure if this also holds for system GMM.
                            Hi Alex,
                            Could you please provide the references for the studies that used xtabond2 to estimate models with a binary dependent variable?
                            In addition, please provide the reference for the study behind the second sentence I underlined.
                            Thank you in advance!
                            Netty



                            • #44
                              Dear,

                              I also have trouble regarding Hansen tests.

                              I am now doing several "experiments" with various instrumental variables. When I run xtabond2 with the original variables, the results are:

                              Arellano-Bond test for AR(1) in first differences: z = -2.10 Pr > z = 0.036
                              Arellano-Bond test for AR(2) in first differences: z = -0.98 Pr > z = 0.327
                              Hansen test of overid. restrictions: chi2(4) = 4.48 Prob > chi2 = 0.344
                              (Robust, but weakened by many instruments.)
                              Difference-in-Hansen tests of exogeneity of instrument subsets:
                              iv(lnfdi tdummyeu, eq(level))
                              Hansen test excluding group: chi2(2) = 1.47 Prob > chi2 = 0.480
                              Difference (null H = exogenous): chi2(2) = 3.02 Prob > chi2 = 0.221,

                              but after introducing a new instrumental variable (in logarithms), the results are:

                              Arellano-Bond test for AR(1) in first differences: z = -1.85 Pr > z = 0.064
                              Arellano-Bond test for AR(2) in first differences: z = -0.91 Pr > z = 0.362
                              Hansen test of overid. restrictions: chi2(5) = 4.62 Prob > chi2 = 0.464
                              (Robust, but weakened by many instruments.)
                              Difference-in-Hansen tests of exogeneity of instrument subsets:
                              GMM instruments for levels
                              Hansen test excluding group: chi2(0) = 0.00 Prob > chi2 = .
                              Difference (null H = exogenous): chi2(5) = 4.62 Prob > chi2 = 0.464
                              iv(lnfdi tdummyeu ln_gov_right1, eq(level))
                              Hansen test excluding group: chi2(2) = 1.52 Prob > chi2 = 0.468
                              Difference (null H = exogenous): chi2(3) = 3.10 Prob > chi2 = 0.376.

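                              Have I made a mistake? My doubt concerns the "Hansen test excluding group" line under "GMM instruments for levels", which reports chi2(0) = 0.00 with a missing p-value.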

                              Thank you.



                              • #45
                                Originally posted by Slaven Savic
                                GMM instruments for levels
                                Hansen test excluding group: chi2(0) = 0.00 Prob > chi2 = .
                                This test has 0 degrees of freedom. This means that after excluding the instruments for the levels model, the estimator is just identified. There are no overidentifying restrictions anymore that could be tested. In other words, with your model specification, you cannot test the validity of all overidentifying restrictions resulting from the levels model jointly unless you add more instruments for the first-differenced model.

                                There is not necessarily anything wrong with that as long as you are happy to go with the untested assumption that these instruments are valid. Since the overall Hansen test is fine, that could be justified.
                                https://twitter.com/Kripfganz
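
The zero-degrees-of-freedom case in #45 can be checked with simple arithmetic on the numbers reported in #44. This sketch only reuses the rounded statistics from that output; the scalar names are hypothetical. Because the test excluding the level instruments has no degrees of freedom left, the difference test coincides with the overall Hansen test (a p-value of about 0.46).

```stata
* Numbers copied from the second output in #44; scalar names are illustrative.
scalar J_full  = 4.62   // overall Hansen statistic, chi2(5)
scalar J_excl  = 0.00   // Hansen statistic excluding the level instruments, chi2(0)
scalar df_full = 5
scalar df_excl = 0

display "Difference statistic: " J_full - J_excl
display "Difference df:        " df_full - df_excl
display "Difference p-value:   " chi2tail(df_full - df_excl, J_full - J_excl)
```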

