  • Which test to see in Difference-in-Hansen test, excluding or difference.

    Dear Statalists,

    I read in Roodman (2007) that one should report the difference-in-Hansen test for the validity and exogeneity of subsets of instruments (even though many published studies do not report it). However, I am not sure which of the two sub-tests under difference-in-Hansen (Hansen Excluding and Difference) I should report. Some papers report both of them, while some report only one.

    "Hansen Excluding Group" examines the validity of the model without the specified set of instruments (the set of instruments specified in each sub-heading, such as iv(x2 x3)), and the "Difference" test examines the validity of the specified set of instruments by computing the difference between the two Hansen J statistics with and without this set of instruments. Is this understanding correct?

    For instance,
    Code:
    Difference-in-Hansen tests of exogeneity of instrument subsets:
      GMM instruments for levels
        Hansen test excluding group:     chi2(4)    =   4.06  Prob > chi2 =  0.397
        Difference (null H = exogenous): chi2(2)    =   1.41  Prob > chi2 =  0.494
      gmm(y, collapse lag(2 4))
        Hansen test excluding group:     chi2(2)    =   4.33  Prob > chi2 =  0.115
        Difference (null H = exogenous): chi2(4)    =   1.14  Prob > chi2 =  0.887
      gmm(x1, collapse lag(2 5))
        Hansen test excluding group:     chi2(1)    =   0.02  Prob > chi2 =  0.884
        Difference (null H = exogenous): chi2(5)    =   5.45  Prob > chi2 =  0.363
      iv(x2 x3, eq(level))
        Hansen test excluding group:     chi2(4)    =   3.62  Prob > chi2 =  0.459
        Difference (null H = exogenous): chi2(2)    =   1.85  Prob > chi2 =  0.397
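
    The relationship between the two sub-tests can be checked by hand on the output above. The following is a minimal sketch, assuming that the "Difference" statistic is the full-model Hansen J minus the "excluding group" Hansen J, with degrees of freedom subtracted in the same way. The helper chi2_sf and the dictionary names are illustrative and not part of any Stata command; the numbers are copied from the output above.

```python
import math

def chi2_sf(stat, df):
    """Upper-tail chi-square probability for integer df >= 1 (closed forms)."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if stat <= 0:
        return 1.0
    half = stat / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term = total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(sqrt(x)) * sum_r x^(r-1/2) / (1*3*...*(2r-1))
    root = math.sqrt(stat)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= stat / (2 * r - 1)
        total += term
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return math.erfc(root / math.sqrt(2.0)) + 2.0 * density * total

# (excluding-group statistic, df), (difference statistic, df) from the post above
groups = {
    "GMM instruments for levels": ((4.06, 4), (1.41, 2)),
    "gmm(y, collapse lag(2 4))": ((4.33, 2), (1.14, 4)),
    "gmm(x1, collapse lag(2 5))": ((0.02, 1), (5.45, 5)),
    "iv(x2 x3, eq(level))": ((3.62, 4), (1.85, 2)),
}

# Adding the two rows of each group should recover the same full-model test.
implied_full = {
    name: (round(excl[0] + diff[0], 2), excl[1] + diff[1])
    for name, (excl, diff) in groups.items()
}

for name, (stat, df) in implied_full.items():
    print(f"{name}: implied full Hansen chi2({df}) = {stat:.2f}")

# Recompute one reported p-value from its statistic and degrees of freedom.
print(f"Difference, GMM instruments for levels: p = {chi2_sf(1.41, 2):.3f}")
```

    Every group implies the same full-model statistic, which is consistent with each pair of rows being a split of one overall Hansen test into a part that is tested without the group and a part attributed to the group.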

    Should I report both sub-tests or only the Difference test? And is it necessary to report all four sets of difference-in-Hansen tests (GMM instruments for levels, gmm(y), gmm(x1), and iv(x2 x3))?

    Thank you!
    Last edited by Alex Mai; 02 Apr 2018, 03:59.

  • Huaxin Wanglu
    replied
    Originally posted by Sebastian Kripfganz
    (1) No, the GMM instruments for the levels require an additional assumption. Absence of serial correlation is not sufficient for their validity. An additional difference-in-Hansen test can provide useful insights.

    (2) The terminology I used in the quoted post might have been misleading. With "level instruments" I meant to refer to the instruments for the level equation.
    Thank you so much for the clarification.


  • Sebastian Kripfganz
    replied
    (1) No, the GMM instruments for the levels require an additional assumption. Absence of serial correlation is not sufficient for their validity. An additional difference-in-Hansen test can provide useful insights.

    (2) The terminology I used in the quoted post might have been misleading. With "level instruments" I meant to refer to the instruments for the level equation.


  • Huaxin Wanglu
    replied
    Originally posted by Sebastian Kripfganz
    If the model without the additional instruments is correctly specified (i.e. the Hansen test excluding this group of instruments does not reject the null hypothesis), then the difference-in-Hansen test could be interpreted as a test for the validity of the additional instruments. In that regard, your understanding is correct.

    As to which test results to report, it really depends. You certainly want to report the Hansen test for the full model. On top of that, it makes sense to report difference-in-Hansen tests for particular instruments if their inclusion requires particular justification. For example, if the Arellano-Bond AR(2) test does not reject the null hypothesis of no second-order serial correlation of the first-differenced errors, then you usually need not separately justify the lagged levels of the dependent variable as instruments for the first-differenced model. In contrast, the difference-in-Hansen test for the level instruments is informative because it helps to evaluate whether the Blundell-Bond mean stationarity assumption might be violated.

    For example, you could report the Hansen test for the model with the instruments for the first-differenced model only, the Hansen test for the full model, and the respective difference-in-Hansen test. The Hansen test for the first-differenced model tells you something about whether your model is dynamically complete (because this implies whether those instruments are valid). The difference-in-Hansen test, as mentioned before, tells you something about the mean stationarity condition needed for the validity of the level instruments. Taking these two test results at face value, the Hansen test for the full model would in principle be redundant but it is still reasonable to provide a complete picture.
    Sorry to bring up this post again. When reviewing this content, I am still confused about two things. (1) Regarding "if the Arellano-Bond AR(2) test does not reject the null hypothesis of no second-order serial correlation of the first-differenced errors, then you usually need not separately justify the lagged levels of the dependent variable as instruments for the first-differenced model": do you mean that if the model passes the AR(2) test, it is unnecessary to report the "Hansen test excluding group" for "GMM instruments for levels"? (2) Does "the difference-in-Hansen test for the level instruments" refer to gmm(beta, eq(level) lag(1 1)) or to gmm(beta, eq(diff) lag(2 3))? If I understand correctly, the level instruments are used in the first-differenced model and the differenced instruments are used in the level model, so it should be the latter. But if we are estimating a two-step system GMM, shouldn't the test for the level equation be more important? I see that your conference slides mention joint mean stationarity under system GMM, but I would still like to check with you whether I understand correctly which test I should report.
    Last edited by Huaxin Wanglu; 30 Oct 2021, 17:19.


  • Sebastian Kripfganz
    replied
    Originally posted by Slaven Savic
    GMM instruments for levels
    Hansen test excluding group: chi2(0) = 0.00 Prob > chi2 = .
    This test has 0 degrees of freedom. This means that after excluding the instruments for the levels model, the estimator is just identified. There are no overidentifying restrictions anymore that could be tested. In other words, with your model specification, you cannot test the validity of all overidentifying restrictions resulting from the levels model jointly unless you add more instruments for the first-differenced model.

    There is not necessarily anything wrong with that as long as you are happy to go with the untested assumption that these instruments are valid. Since the overall Hansen test is fine, that could be justified.
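
    The degrees-of-freedom arithmetic behind this can be sketched as follows. This is an illustration under two assumptions: the Hansen test has as many degrees of freedom as there are instruments in excess of estimated coefficients, and the "excluding group" test loses one degree of freedom per instrument in the excluded group (the df of the "Difference" row). The function names are hypothetical; the numbers come from outputs posted in this thread.

```python
def overid_df(n_instruments, n_coefficients):
    """Number of overidentifying restrictions available to the Hansen test."""
    df = n_instruments - n_coefficients
    if df < 0:
        raise ValueError("fewer instruments than coefficients: not identified")
    return df

def excluding_group_df(full_df, group_df):
    """Degrees of freedom left once a group of instruments is excluded."""
    return full_df - group_df

# Quoted case: overall Hansen chi2(5), "Difference" chi2(5) for the
# GMM instruments for levels, so nothing is left to test without them.
remaining = excluding_group_df(full_df=5, group_df=5)
print("df without the level instruments:", remaining)
print("just identified:", remaining == 0)

# Larger output posted in this thread: 196 instruments and 8 estimated
# coefficients (including the constant), reported as chi2(188); the
# GMM instruments for levels account for 48 of those restrictions.
full = overid_df(n_instruments=196, n_coefficients=8)
print("full-model df:", full)
print("df excluding level instruments:", excluding_group_df(full, 48))
```

    With zero remaining degrees of freedom the statistic is mechanically 0.00 and no p-value can be computed, which is why the output shows a missing value.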


  • Slaven Savic
    replied
    Dear,

    I also have trouble regarding Hansen tests.

    I am now doing several "experiments" with various instrumental variables. When I run the xtabond2 command with the original data (variables), the results are:

    Arellano-Bond test for AR(1) in first differences: z = -2.10 Pr > z = 0.036
    Arellano-Bond test for AR(2) in first differences: z = -0.98 Pr > z = 0.327
    Hansen test of overid. restrictions: chi2(4) = 4.48 Prob > chi2 = 0.344
      (Robust, but weakened by many instruments.)
    Difference-in-Hansen tests of exogeneity of instrument subsets:
      iv(lnfdi tdummyeu, eq(level))
        Hansen test excluding group: chi2(2) = 1.47 Prob > chi2 = 0.480
        Difference (null H = exogenous): chi2(2) = 3.02 Prob > chi2 = 0.221

    However, after introducing a new instrumental variable (which is in logarithms), the results are:

    Arellano-Bond test for AR(1) in first differences: z = -1.85 Pr > z = 0.064
    Arellano-Bond test for AR(2) in first differences: z = -0.91 Pr > z = 0.362
    Hansen test of overid. restrictions: chi2(5) = 4.62 Prob > chi2 = 0.464
      (Robust, but weakened by many instruments.)
    Difference-in-Hansen tests of exogeneity of instrument subsets:
      GMM instruments for levels
        Hansen test excluding group: chi2(0) = 0.00 Prob > chi2 = .
        Difference (null H = exogenous): chi2(5) = 4.62 Prob > chi2 = 0.464
      iv(lnfdi tdummyeu ln_gov_right1, eq(level))
        Hansen test excluding group: chi2(2) = 1.52 Prob > chi2 = 0.468
        Difference (null H = exogenous): chi2(3) = 3.10 Prob > chi2 = 0.376

    Am I making a mistake? The part I am unsure about is formatted in bold.

    Thank you.


  • Netty Drori
    replied
    Originally posted by Alex Mai

    Thanks a lot! Btw, do you think that xtabond2 is suitable for a (dynamic) Linear Probability Model with a binary dependent variable? As far as I know, there is perhaps no Stata command for a dynamic logit model. I do see that some people use xtabond2 to estimate models with a binary dependent variable in a dynamic setting.

    One problem with the Linear Probability Model is the possibility of negative fitted values, but studies have shown that, except in extreme situations (e.g. probabilities like 99% or 1%), the odds ratios are an almost linear function of the probability, which supports the use of the Linear Probability Model. But I am not sure if this also holds for System GMM.
    Hi Alex,
    Could you please provide us with the references for the studies that used xtabond2 to estimate a binary dependent variable?
    In addition, please provide us with the reference for the study regarding the second sentence, which I underlined.
    Thank you in advance!
    Netty


  • Sebastian Kripfganz
    replied
    When using xtabond2, the suboption equation(both) is equivalent to not typing anything. But notice that it is not equivalent to the combination of the following:
    iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, equation(diff))
    iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, equation(level))

    If you are surprised that this yields different results, then what you want to get is probably the version with separately specified instruments for the two equations.

    With xtdpdgmm, you cannot specify both arguments at once, m(fodev level). Similar to the above two lines, you would need to jointly specify
    iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, m(fodev))
    iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, m(level))


    That said, once you have specified those standard instruments for the model in levels, the corresponding instruments for the transformed model become redundant. This is not the case for GMM-type instruments if they are differenced for the model in levels (which is not done by default for standard instruments). Notice further that an underlying assumption of these standard instruments for the level model is that they are all uncorrelated with the idiosyncratic error term and the unit-specific error component (the "fixed effects"). This is a strong assumption which might be difficult to justify in some cases.

    The issue with the implementation of forward-orthogonal deviations in xtabond2 is indeed quite technical. I cannot really say more about it than what I was saying in the post that you can reach by clicking on the link in my post #40 above.


  • Pian Chen
    replied
    Thanks, Sebastian! The xtabond2 documentation is very confusing. I tried the following three commands (differences in bold) and got identical results. I do not quite understand the way xtabond2 generates instruments for the transformed model and the level model. Would you please shed some light on this? I also have no idea about the potential danger caused by using iv() together with the forward-orthogonal deviations. I guess it is probably quite technical, but I would like to understand it better so that I am not using the command blindly.

    xtabond2 ln_avg_price_qtr_cust l.ln_avg_price_qtr_cust ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, gmm(l.ln_avg_price_qtr_cust , lag(1 3)) iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4 ) twostep robust orthogonal small

    xtabond2 ln_avg_price_qtr_cust l.ln_avg_price_qtr_cust ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, gmm(l.ln_avg_price_qtr_cust , lag(1 3)) iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, equation(both)) twostep robust orthogonal small

    xtabond2 ln_avg_price_qtr_cust l.ln_avg_price_qtr_cust ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, gmm(l.ln_avg_price_qtr_cust, lag(1 3) equation(both)) iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, equation(both)) twostep robust orthogonal small


    I have also tried your command and collapsed the GMM instruments. If I want to use gmmiv and standard iv in both the transformed and level models (system GMM), what should I do? I could not get the code to run if I use "m(fodev|level)" or "m(fodev level)".

    xtdpdgmm ln_avg_price_qtr_cust l.ln_avg_price_qtr_cust ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, gmmiv(l.ln_avg_price_qtr_cust, lag(1 3) collapse m(fodev)) iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, m(fodev)) twostep vce(robust)

    xtdpdgmm ln_avg_price_qtr_cust l.ln_avg_price_qtr_cust ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, gmmiv(l.ln_avg_price_qtr_cust, lag(1 3) collapse m(level)) iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, m(level)) twostep vce(robust)

    Thank you so much!


  • Sebastian Kripfganz
    replied
    The "Hansen test excluding group" for the "GMM instruments for levels" is evaluating the instruments for the transformed model only. Notice that the Hansen test is "weakened by many instruments". In your case, 196 instruments are clearly too many relative to your sample size. In particular, given your highly unbalanced panel, I recommend using the collapse option.

    Do you expect the option iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_cheese qtr_2 qtr_3 qtr_4) to generate separate instruments for the transformed and the level model? This is not what it is doing! You would have to specify them separately by using the equation() suboption. Moreover, using the iv() option together with the orthogonal option is potentially dangerous due to the way the forward-orthogonal deviations are implemented in xtabond2. For more information, see:
    XTDPDGMM: new Stata command for efficient GMM estimation of linear (dynamic) panel models with nonlinear moment conditions


  • Pian Chen
    replied
    I also have some questions about the "Hansen test excluding group" for "GMM instruments for levels." I fail to reject the null of the "Arellano-Bond test for AR(2) in first differences", which means the idiosyncratic error terms are not serially correlated. But the null of the "Hansen test excluding group" for "GMM instruments for levels" is rejected. Is it correct that this Hansen test evaluates the joint validity of the GMM instruments used for the level equation? What are the GMM instruments used for the level equation if the forward orthogonal deviation option is specified? When the null is rejected, should I drop the GMM instruments for the level equation and only use the GMM instruments for the transformed equation, i.e., add eq(diff) in gmm()? But when I do that, I reject the null of the Hansen test of overid. restrictions. Please see the regression output below. Thanks for your help.

    Original estimator:
    . xtabond2 ln_avg_price_qtr_cust l.ln_avg_price_qtr_cust ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_cheese qtr_2 qtr_3 qtr_4, gmm(l.ln_avg_price_qtr_cust, lag(1 3)) iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_cheese qtr_2 qtr_3 qtr_4 ) twostep robust orthogonal small
    Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
    Warning: Two-step estimated covariance matrix of moments is singular.
    Using a generalized inverse to calculate optimal weighting matrix for two-step estimation.
    Difference-in-Sargan/Hansen statistics may be negative.

    Dynamic panel-data estimation, two-step system GMM
    ------------------------------------------------------------------------------
    Group variable: rank_cheese                     Number of obs      =      6594
    Time variable : yq                              Number of groups   =       200
    Number of instruments = 196                     Obs per group: min =         5
    F(7, 199)     =    621.12                                      avg =     32.97
    Prob > F      =     0.000                                      max =        49
    ------------------------------------------------------------------------------
                               |              Corrected
         ln_avg_price_qtr_cust |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ---------------------------+----------------------------------------------------------------
         ln_avg_price_qtr_cust |
                           L1. |   .8525751   .0302117    28.22   0.000     .7929989    .9121514
                               |
    ln_qtr_act_milk_production |  -1.294157   .0899475   -14.39   0.000     -1.47153   -1.116785
       ln_qtr_pers_disp_income |   1.312546   .0885666    14.82   0.000     1.137897    1.487195
      ln_qtr_cost_index_cheese |   .0443752   .0137362     3.23   0.001     .0172881    .0714623
                         qtr_2 |   .0930413   .0063891    14.56   0.000     .0804422    .1056405
                         qtr_3 |   .0633145   .0032028    19.77   0.000     .0569987    .0696303
                         qtr_4 |    .013382   .0041955     3.19   0.002     .0051086    .0216553
                         _cons |   17.92212   1.447264    12.38   0.000     15.06818    20.77606
    --------------------------------------------------------------------------------------------

    Instruments for orthogonal deviations equation
      Standard
        FOD.(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly
        ln_qtr_cost_index_cheese qtr_2 qtr_3 qtr_4)
      GMM-type (missing=0, separate instruments for each period unless collapsed)
        L(1/3).L.ln_avg_price_qtr_cust
    Instruments for levels equation
      Standard
        ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly
        ln_qtr_cost_index_cheese qtr_2 qtr_3 qtr_4
        _cons
      GMM-type (missing=0, separate instruments for each period unless collapsed)
        D.L.ln_avg_price_qtr_cust

    Arellano-Bond test for AR(1) in first differences: z = -3.84 Pr > z = 0.000
    Arellano-Bond test for AR(2) in first differences: z = 1.56 Pr > z = 0.118

    Sargan test of overid. restrictions: chi2(188) =2200.52 Prob > chi2 = 0.000
    (Not robust, but not weakened by many instruments.)
    Hansen test of overid. restrictions: chi2(188) = 199.26 Prob > chi2 = 0.273
    (Robust, but weakened by many instruments.)

    Difference-in-Hansen tests of exogeneity of instrument subsets:
      GMM instruments for levels
        Hansen test excluding group:     chi2(140)  = 194.66  Prob > chi2 =  0.002
        Difference (null H = exogenous): chi2(48)   =   4.61  Prob > chi2 =  1.000
      iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_cheese qtr_2 qtr_3 qtr_4)
        Hansen test excluding group:     chi2(182)  = 198.93  Prob > chi2 =  0.185
        Difference (null H = exogenous): chi2(6)    =   0.33  Prob > chi2 =  0.999

    Last edited by Pian Chen; 19 Dec 2018, 09:43.


  • Alex Mai
    replied
    Thank you so much once more! Since I prefer to use a unified specification of regressors for all regressions on a particular dependent variable, I think I will have to make a judgement on whether the conflicting evidence of AR(2) and Difference-in-Hansen supports the model validity.


  • Sebastian Kripfganz
    replied
    In a very strict sense, the AR(2) test does not reject the null hypothesis at the 5% significance level. Of course, such a marginal non-rejection is hard to defend.

    If the AR(2) test rejects the null but the respective (Difference-in-)Hansen test does not reject the null, this would be a contradiction. However, such conflicting test results can generally happen. Remember that a correctly sized test at the 5% significance level would still reject the null hypothesis in 5% of the cases even though the null hypothesis is actually true. Conversely, a test may fail to reject the null hypothesis even though it is false.
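
    The point about test size can be illustrated with a small simulation. This is only a sketch, assuming a chi-square test with 2 degrees of freedom whose statistic is drawn directly from its null distribution; it does not simulate an AR(2) or Hansen test on panel data, and the function name, seed and number of draws are arbitrary choices.

```python
import math
import random

def rejection_rate_under_null(n_draws=20000, alpha=0.05, seed=12345):
    """Share of chi2(2) statistics drawn under a true null that are rejected."""
    rng = random.Random(seed)
    # For 2 degrees of freedom the upper-tail probability is exp(-x/2),
    # so the critical value at level alpha is -2*log(alpha).
    critical = -2.0 * math.log(alpha)
    rejections = 0
    for _ in range(n_draws):
        # A chi2(2) draw is the sum of two squared standard normal draws.
        stat = rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2
        if stat > critical:
            rejections += 1
    return rejections / n_draws

rate = rejection_rate_under_null()
print(f"share of true nulls rejected at the 5% level: {rate:.3f}")
```

    The share should come out close to 0.05, so an occasional rejection by one test alongside non-rejection by another is not by itself proof of misspecification.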

    You can either try to find a more robust specification or you would have to make a judgement whether you trust the joint evidence of the two tests in favor of the null hypothesis.


  • Alex Mai
    replied
    For example, if the Arellano-Bond AR(2) test does not reject the null hypothesis of no second-order serial correlation of the first-differenced errors, then you usually need not separately justify the lagged levels of the dependent variable as instruments for the first-differenced model. In contrast, the difference-in-Hansen test for the level instruments is informative because it helps to evaluate whether the Blundell-Bond mean stationarity assumption might be violated.
    Dear Sebastian,

    With respect to your advice quoted above, may I take it to mean that a model is still valid and dynamically complete if the AR(2) test rejects the null but the Hansen test for the lagged levels of the dependent variable as instruments for the first-differenced model does not reject the null?

    I encounter the following situation in which AR(2) rejects the null, but all Difference-in-Hansen tests do not reject the null. Following your previous suggestions, I have tried adding the second lag of the dependent variable as regressor, but it hardly improves the p-value of AR(2).

    Code:
    ------------------------------------------------------------------------------
    Arellano-Bond test for AR(1) in first differences: z =  -3.12  Pr > z =  0.002
    Arellano-Bond test for AR(2) in first differences: z =  -1.95  Pr > z =  0.051
    ------------------------------------------------------------------------------
    Sargan test of overid. restrictions: chi2(22)   =  80.86  Prob > chi2 =  0.000
      (Not robust, but not weakened by many instruments.)
    Hansen test of overid. restrictions: chi2(22)   =  24.89  Prob > chi2 =  0.302
      (Robust, but weakened by many instruments.)
    
    Difference-in-Hansen tests of exogeneity of instrument subsets:
      GMM instruments for levels
        Hansen test excluding group:     chi2(14)   =  17.86  Prob > chi2 =  0.213
        Difference (null H = exogenous): chi2(8)    =   7.03  Prob > chi2 =  0.534
      iv(x1, eq(level))
        Hansen test excluding group:     chi2(21)   =  23.92  Prob > chi2 =  0.297
        Difference (null H = exogenous): chi2(1)    =   0.98  Prob > chi2 =  0.323
      iv(x2 x3, eq(level))
        Hansen test excluding group:     chi2(20)   =  21.83  Prob > chi2 =  0.350
        Difference (null H = exogenous): chi2(2)    =   3.06  Prob > chi2 =  0.217
      iv(x4, eq(level))
        Hansen test excluding group:     chi2(21)   =  22.48  Prob > chi2 =  0.372
        Difference (null H = exogenous): chi2(1)    =   2.41  Prob > chi2 =  0.121
      iv(dummy_1 year4 year5 year6 year7 year8 year9 year10, eq(level))
        Hansen test excluding group:     chi2(13)   =  18.02  Prob > chi2 =  0.157
        Difference (null H = exogenous): chi2(9)    =   6.87  Prob > chi2 =  0.650
    Thank you!
    Last edited by Alex Mai; 20 Apr 2018, 16:04.


  • Sebastian Kripfganz
    replied
    A model is one thing, an estimator is another. There are pros and cons for a linear probability model, as you have mentioned. System GMM is then just the estimator. That said, there might be a higher risk that lagged levels are weak instruments for the differences of a binary variable (and vice versa) but this depends on the particular characteristics of the data.
