Which test to look at in the Difference-in-Hansen test: "excluding group" or "difference"?

Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#31

17 Apr 2018, 07:05

Originally posted by Alex Mai

But in what situation or for what kind of variables, should I use -iv(x, eq(diff))-? You have argued that -iv(x, eq(level))- makes -iv(x, eq(diff))- asymptotically redundant.

My comment mainly referred to the gmm() option that creates lagged levels as instruments for the first-differenced model and differences as instruments for the level model. Here, the latter do not turn the former redundant.

Originally posted by Alex Mai

And can I understand the Hansen test for the first-differenced model (the very first part in the Difference-in-Hansen test) as the test for the validity of lagged dependent and endogenous variables as instruments for the first-differenced dependent and endogenous variables in the first-differenced equation?

In short, yes. But keep in mind that, strictly speaking, the overidentification tests are not just tests for the validity of instruments. If you reject the null hypothesis, this might be because your instruments are indeed invalid given that the model is otherwise correctly specified, or it might be that your model suffers from another form of misspecification.

https://www.kripfganz.de/stata/
Alex Mai

Join Date: May 2016

Posts: 213
#32

17 Apr 2018, 10:33

Originally posted by Sebastian Kripfganz

My comment mainly referred to the gmm() option that creates lagged levels as instruments for the first-differenced model and differences as instruments for the level model. Here, the latter do not turn the former redundant.

On the other hand, as you have mentioned on several occasions, the instruments for the differenced model and those for the level model should be split into separate groups.

So do you mean that for -gmm()- it is also recommended to split into -gmm( eq(level))- and -gmm( eq(diff))-? What I learned from earlier posts is that eq(level) and eq(diff) should be set separately for -iv()-, but I am not sure about -gmm()-.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#33

17 Apr 2018, 11:29

Not necessarily. xtabond2 computes the respective Difference-in-Hansen tests even when you do not separate them. But separating the options helps to understand how the GMM estimator is actually constructed. (It becomes less of a black box.)
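
To make the construction visible, here is a minimal sketch. It assumes Stata's abdata example dataset and an illustrative choice of variables and lags (neither comes from this thread). For the lagged dependent variable L.n, the split version with lag(0 0) in eq(level) is expected to reproduce the differenced instrument D.L.n that the combined gmm() option creates on its own, but that equivalence is an assumption worth checking: compare the "Instruments for ..." footer that xtabond2 prints under both estimations.

```stata
* Illustrative sketch only: abdata and the chosen variables are assumptions,
* not the data discussed in this thread. Run as a do-file (uses ///).
capture which xtabond2
if _rc ssc install xtabond2

webuse abdata, clear

* (a) Combined: a single gmm() option serves both equations of the system
xtabond2 n L.n w k, gmm(L.n, lag(1 3)) iv(w k, eq(level)) twostep robust

* (b) Split: lagged levels for the first-differenced model and
*     the lag-0 difference (D.L.n) for the level model
xtabond2 n L.n w k, gmm(L.n, lag(1 3) eq(diff)) gmm(L.n, lag(0 0) eq(level)) ///
    iv(w k, eq(level)) twostep robust
```

In version (b) the Difference-in-Hansen section lists each gmm() group separately, which is what makes the estimator less of a black box.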

https://www.kripfganz.de/stata/
Alex Mai

Join Date: May 2016

Posts: 213
#34

17 Apr 2018, 13:00

Originally posted by Sebastian Kripfganz

Not necessarily. xtabond2 computes the respective Difference-in-Hansen tests even when you do not separate them. But separating the options helps to understand how the GMM estimator is actually constructed. (It becomes less of a black box.)

Thanks a lot! Btw, do you think that xtabond2 is suitable for a (dynamic) linear probability model with a binary dependent variable? As far as I know, there is no Stata command for a dynamic logit model. I have seen people use xtabond2 to estimate models with a binary dependent variable in a dynamic setting.

One problem with the linear probability model is the possibility of fitted values outside the unit interval (e.g. negative fitted values), but studies have shown that, except in extreme situations (e.g. probabilities like 99% or 1%), the odds ratios are an almost linear function of the probability, which supports the use of the linear probability model. But I am not sure if this also holds for system GMM.

Last edited by Alex Mai; 17 Apr 2018, 13:10.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#35

17 Apr 2018, 15:43

A model is one thing, an estimator is another. There are pros and cons for a linear probability model, as you have mentioned. System GMM is then just the estimator. That said, there might be a higher risk that lagged levels are weak instruments for the differences of a binary variable (and vice versa) but this depends on the particular characteristics of the data.

https://www.kripfganz.de/stata/

Alex Mai

Join Date: May 2016
Posts: 213

#36

20 Apr 2018, 15:58

For example, if the Arellano-Bond AR(2) test does not reject the null hypothesis of no second-order serial correlation of the first-differenced errors, then you usually need not separately justify the lagged levels of the dependent variable as instruments for the first-differenced model. In contrast, the difference-in-Hansen test for the level instruments is informative because it helps to evaluate whether the Blundell-Bond mean stationarity assumption might be violated.

Dear Sebastian,

With respect to your advice quoted above, can I take it to mean that a model is still valid and dynamically complete if AR(2) rejects the null but the Hansen test for the lagged dependent variable in levels as instruments for the first-differenced model does not reject the null?

I have encountered the following situation, in which AR(2) rejects the null but none of the Difference-in-Hansen tests rejects the null. Following your previous suggestions, I have tried adding the second lag of the dependent variable as a regressor, but it hardly improves the p-value of AR(2).

Code:

------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z =  -3.12  Pr > z =  0.002
Arellano-Bond test for AR(2) in first differences: z =  -1.95  Pr > z =  0.051
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(22)   =  80.86  Prob > chi2 =  0.000
  (Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(22)   =  24.89  Prob > chi2 =  0.302
  (Robust, but weakened by many instruments.)

Difference-in-Hansen tests of exogeneity of instrument subsets:
  GMM instruments for levels
    Hansen test excluding group:     chi2(14)   =  17.86  Prob > chi2 =  0.213
    Difference (null H = exogenous): chi2(8)    =   7.03  Prob > chi2 =  0.534
  iv(x1, eq(level))
    Hansen test excluding group:     chi2(21)   =  23.92  Prob > chi2 =  0.297
    Difference (null H = exogenous): chi2(1)    =   0.98  Prob > chi2 =  0.323
  iv(x2 x3, eq(level))
    Hansen test excluding group:     chi2(20)   =  21.83  Prob > chi2 =  0.350
    Difference (null H = exogenous): chi2(2)    =   3.06  Prob > chi2 =  0.217
  iv(x4, eq(level))
    Hansen test excluding group:     chi2(21)   =  22.48  Prob > chi2 =  0.372
    Difference (null H = exogenous): chi2(1)    =   2.41  Prob > chi2 =  0.121
  iv(dummy_1 year4 year5 year6 year7 year8 year9 year10, eq(level))
    Hansen test excluding group:     chi2(13)   =  18.02  Prob > chi2 =  0.157
    Difference (null H = exogenous): chi2(9)    =   6.87  Prob > chi2 =  0.650

Thank you!

Last edited by Alex Mai; 20 Apr 2018, 16:04.


Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#37

21 Apr 2018, 07:46

In a very strict sense, the AR(2) test does not reject the null hypothesis at the 5% significance level. Of course, such a marginal non-rejection is hard to defend.

If the AR(2) test rejects the null but the respective (Difference-in-)Hansen test does not reject the null, this would be a contradiction. However, such conflicting test results can generally happen. Remember that a correctly sized test would still reject the null hypothesis at the 5% significance level in 5% of the cases even though the null hypothesis is actually true. Conversely, a test may fail to reject the null hypothesis even though it is false.

You can either try to find a more robust specification or you would have to make a judgement whether you trust the joint evidence of the two tests in favor of the null hypothesis.
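
For readers wondering how the "excluding group" and "difference" lines relate to each other, the arithmetic behind the first group in the output of post #36 can be sketched with built-in Stata functions. The numbers are copied from that output; the scalar names are only illustrative. The "difference" statistic is the overall Hansen statistic minus the Hansen statistic of the model without the group, and the degrees of freedom are subtracted in the same way.

```stata
* Numbers copied from the output in post #36; scalar names are illustrative.
* Run as a do-file (uses // comments).
scalar J_full  = 24.89   // Hansen test with all instruments, chi2(22)
scalar J_excl  = 17.86   // Hansen test excluding the GMM instruments for levels, chi2(14)
scalar J_diff  = J_full - J_excl
scalar df_diff = 22 - 14

display "Difference statistic: " %5.2f J_diff " with " df_diff " df"
display "p-value (difference): " %5.3f chi2tail(df_diff, J_diff)
display "p-value (excluding):  " %5.3f chi2tail(14, J_excl)

* Two-sided p-value of the Arellano-Bond AR(2) statistic z = -1.95
display "AR(2) p-value:        " %5.3f 2*normal(-abs(-1.95))
```

Up to rounding of the reported statistics, this should match the chi2(8) = 7.03 difference test and the p-values shown in the output. The "excluding group" line tests the instruments that remain after dropping the group, while the "difference" line tests the dropped group itself, conditional on the remaining instruments being valid.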

https://www.kripfganz.de/stata/
Alex Mai

Join Date: May 2016

Posts: 213
#38

24 Apr 2018, 13:42

Thank you so much once more! Since I prefer to use a unified specification of regressors for all regressions on a particular dependent variable, I think I will have to make a judgement on whether the conflicting evidence of AR(2) and Difference-in-Hansen supports the model validity.
Pian Chen

Join Date: Dec 2018

Posts: 2
#39

19 Dec 2018, 09:37

I also have some questions about the "Hansen test excluding group" for "GMM instruments for levels." I fail to reject the null of the "Arellano-Bond test for AR(2) in first differences", which means the idiosyncratic error terms are not serially correlated. But the null of the "Hansen test excluding group" for "GMM instruments for levels" is rejected. Is it correct that this Hansen test evaluates the joint validity of the GMM instruments used for the level equation? What are the GMM instruments used for the level equation if the forward orthogonal deviation option is specified? When the null is rejected, should I drop the GMM instruments for the level equation and only use the GMM instruments for the transformed equation, i.e., add eq(diff) in gmm()? But when I do that, I reject the null of the Hansen test of overidentifying restrictions. Please see the regression output below. Thanks for your help.

Original estimator:
. xtabond2 ln_avg_price_qtr_cust l.ln_avg_price_qtr_cust ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_cheese qtr_2 qtr_3 qtr_4, gmm(l.ln_avg_price_qtr_cust, lag(1 3)) iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_cheese qtr_2 qtr_3 qtr_4 ) twostep robust orthogonal small
Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
Warning: Two-step estimated covariance matrix of moments is singular.
Using a generalized inverse to calculate optimal weighting matrix for two-step estimation.
Difference-in-Sargan/Hansen statistics may be negative.

Dynamic panel-data estimation, two-step system GMM
------------------------------------------------------------------------------
Group variable: rank_cheese                     Number of obs      =      6594
Time variable : yq                              Number of groups   =       200
Number of instruments = 196                     Obs per group: min =         5
F(7, 199)     =    621.12                                      avg =     32.97
Prob > F      =     0.000                                      max =        49
---------------------------+----------------------------------------------------------------
                           |             Corrected
ln_avg_price_qtr_cust      |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
---------------------------+----------------------------------------------------------------
ln_avg_price_qtr_cust      |
                       L1. |   .8525751   .0302117    28.22   0.000     .7929989    .9121514
                           |
ln_qtr_act_milk_production |  -1.294157   .0899475   -14.39   0.000     -1.47153   -1.116785
ln_qtr_pers_disp_income    |   1.312546   .0885666    14.82   0.000     1.137897    1.487195
ln_qtr_cost_index_cheese   |   .0443752   .0137362     3.23   0.001     .0172881    .0714623
qtr_2                      |   .0930413   .0063891    14.56   0.000     .0804422    .1056405
qtr_3                      |   .0633145   .0032028    19.77   0.000     .0569987    .0696303
qtr_4                      |    .013382   .0041955     3.19   0.002     .0051086    .0216553
_cons                      |   17.92212   1.447264    12.38   0.000     15.06818    20.77606
---------------------------+----------------------------------------------------------------

Instruments for orthogonal deviations equation
  Standard
    FOD.(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly
    ln_qtr_cost_index_cheese qtr_2 qtr_3 qtr_4)
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(1/3).L.ln_avg_price_qtr_cust
Instruments for levels equation
  Standard
    ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly
    ln_qtr_cost_index_cheese qtr_2 qtr_3 qtr_4
    _cons
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    D.L.ln_avg_price_qtr_cust
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z =  -3.84  Pr > z =  0.000
Arellano-Bond test for AR(2) in first differences: z =   1.56  Pr > z =  0.118
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(188)  =2200.52  Prob > chi2 =  0.000
  (Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(188)  = 199.26  Prob > chi2 =  0.273
  (Robust, but weakened by many instruments.)

Difference-in-Hansen tests of exogeneity of instrument subsets:
  GMM instruments for levels
    Hansen test excluding group:     chi2(140)  = 194.66  Prob > chi2 =  0.002
    Difference (null H = exogenous): chi2(48)   =   4.61  Prob > chi2 =  1.000
  iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_cheese qtr_2 qtr_3 qtr_4)
    Hansen test excluding group:     chi2(182)  = 198.93  Prob > chi2 =  0.185
    Difference (null H = exogenous): chi2(6)    =   0.33  Prob > chi2 =  0.999

Last edited by Pian Chen; 19 Dec 2018, 09:43.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#40

20 Dec 2018, 04:28

The "Hansen test excluding group" for the "GMM instruments for levels" is evaluating the instruments for the transformed model only. Notice that the Hansen test is "weakened by many instruments". In your case, 196 instruments are clearly too many relative to your sample size. In particular given your highly unbalanced panel, I recommend using the collapse option.

Do you expect the option iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_cheese qtr_2 qtr_3 qtr_4) to generate separate instruments for the transformed and the level model? This is not what it is doing! You would have to specify them separately by using the equation() suboption. Moreover, using the iv() option together with the orthogonal option is potentially dangerous because of the way the forward-orthogonal deviations are implemented in xtabond2. For more information, see:
XTDPDGMM: new Stata command for efficient GMM estimation of linear (dynamic) panel models with nonlinear moment conditions
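
The effect of the collapse option on the instrument count can be seen in a small sketch. It assumes Stata's abdata example dataset and an illustrative specification (not the cheese price data from post #39), and it assumes that xtabond2 stores the number of instruments in e(j); the same count also appears in the output header as "Number of instruments".

```stata
* Illustrative sketch only: abdata and the variable choice are assumptions.
capture which xtabond2
if _rc ssc install xtabond2

webuse abdata, clear

* Uncollapsed: a separate instrument column for every period and lag
xtabond2 n L.n w k, gmm(L.n, lag(1 3)) iv(w k, eq(level)) twostep robust
scalar j_full = e(j)

* Collapsed: one instrument column per lag
xtabond2 n L.n w k, gmm(L.n, lag(1 3) collapse) iv(w k, eq(level)) twostep robust
scalar j_coll = e(j)

display "Instruments without collapse: " j_full
display "Instruments with collapse:    " j_coll
```

The sketch deliberately omits the orthogonal option so that it does not combine iv() with forward-orthogonal deviations.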

https://www.kripfganz.de/stata/
Pian Chen

Join Date: Dec 2018

Posts: 2
#41

21 Dec 2018, 10:00

Thanks, Sebastian! The xtabond2 documentation is very confusing. I tried the following three commands (they differ only in the equation() suboptions) and got identical results. I do not quite understand the way xtabond2 generates instruments for the transformed model and the level model. Could you please shed some light on this? I also have no idea about the potential danger caused by using iv() together with the forward-orthogonal deviations. I guess it is probably quite technical, but I would like to understand it better so that I am not using the command blindly.

xtabond2 ln_avg_price_qtr_cust l.ln_avg_price_qtr_cust ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, gmm(l.ln_avg_price_qtr_cust , lag(1 3)) iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4 ) twostep robust orthogonal small

xtabond2 ln_avg_price_qtr_cust l.ln_avg_price_qtr_cust ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, gmm(l.ln_avg_price_qtr_cust , lag(1 3)) iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, equation(both)) twostep robust orthogonal small

xtabond2 ln_avg_price_qtr_cust l.ln_avg_price_qtr_cust ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, gmm(l.ln_avg_price_qtr_cust, lag(1 3) equation(both)) iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, equation(both)) twostep robust orthogonal small

I have also tried your command and collapsed the GMM-type instruments. If I want to use gmmiv and standard iv in both the transformed and level models (system GMM), what should I do? I could not get the code to run if I use "m(fodev|level)" or "m(fodev level)".

xtdpdgmm ln_avg_price_qtr_cust l.ln_avg_price_qtr_cust ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, gmmiv(l.ln_avg_price_qtr_cust, lag(1 3) collapse m(fodev)) iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, m(fodev)) twostep vce(robust)

xtdpdgmm ln_avg_price_qtr_cust l.ln_avg_price_qtr_cust ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, gmmiv(l.ln_avg_price_qtr_cust, lag(1 3) collapse m(level)) iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, m(level)) twostep vce(robust)

Thank you so much!
Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#42

21 Dec 2018, 12:04

When using xtabond2, the suboption equation(both) is equivalent to not typing anything. But notice that it is not equivalent to the combination of the following:
iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, equation(diff))
iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, equation(level))
If you are surprised that this yields different results, then what you want to get is probably the version with separately specified instruments for the two equations.
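
A minimal sketch of that comparison, assuming Stata's abdata example dataset and an illustrative set of regressors (not the data from this thread). The names passed to stats() assume that xtabond2 stores the instrument count in e(j) and the Hansen p-value in e(hansenp).

```stata
* Illustrative sketch only: abdata and the variable choice are assumptions.
* Run as a do-file (uses ///).
capture which xtabond2
if _rc ssc install xtabond2

webuse abdata, clear

* (a) Default, the same as equation(both)
xtabond2 n L.n w k, gmm(L.n, lag(1 3) collapse) iv(w k) twostep robust
estimates store iv_both

* (b) Separately specified instruments for the two equations
xtabond2 n L.n w k, gmm(L.n, lag(1 3) collapse) ///
    iv(w k, eq(diff)) iv(w k, eq(level)) twostep robust
estimates store iv_split

* Side-by-side coefficients, instrument counts and Hansen p-values
estimates table iv_both iv_split, b(%9.4f) se(%9.4f) stats(N j hansenp)
```

If the two columns differ, the instrument list printed under each estimation shows where the additional instrument columns in version (b) come from.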

With xtdpdgmm, you cannot specify both arguments at once, m(fodev level). Similar to the above two lines, you would need to jointly specify
iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, m(fodev))
iv(ln_qtr_act_milk_production ln_qtr_pers_disp_income_monthly ln_qtr_cost_index_butter qtr_2 qtr_3 qtr_4, m(level))

That said, once you have specified those standard instruments for the model in levels, the corresponding instruments for the transformed model become redundant. This is not the case for GMM-type instruments if they are differenced for the model in levels (which is not done by default for standard instruments). Notice further that an underlying assumption of these standard instruments for the level model is that they are all uncorrelated with the idiosyncratic error term and the unit-specific error component (the "fixed effects"). This is a strong assumption which might be difficult to justify in some cases.

The issue with the implementation of forward-orthogonal deviations in xtabond2 is indeed quite technical. I cannot really say more about it than what I was saying in the post that you can reach by clicking on the link in my post #40 above.
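
For the xtdpdgmm question in post #41, a system GMM specification along the lines described above could look like the following sketch. It assumes Stata's abdata example dataset and illustrative variables (not the data from this thread). The lag(0 0) difference suboption for the level model is chosen here to mirror the D.L. instrument that xtabond2 reports for its levels equation.

```stata
* Illustrative sketch only: abdata and the variable choice are assumptions.
* Run as a do-file (uses ///).
capture which xtdpdgmm
if _rc net install xtdpdgmm, from(http://www.kripfganz.de/stata/)

webuse abdata, clear

xtdpdgmm n L.n w k, ///
    gmmiv(L.n, lag(1 3) collapse model(fodev)) ///
    gmmiv(L.n, lag(0 0) difference collapse model(level)) ///
    iv(w k, model(fodev)) ///
    iv(w k, model(level)) ///
    twostep vce(robust)

* Arellano-Bond serial correlation tests and the overall Hansen test
estat serial
estat overid
```

As noted in the post, the iv(w k, model(fodev)) line becomes redundant once iv(w k, model(level)) is included; it is kept here only to show how the two are specified side by side.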

https://www.kripfganz.de/stata/
Netty Drori

Join Date: Aug 2015

Posts: 14
#43

27 Mar 2019, 15:51

Originally posted by Alex Mai

Thanks a lot! Btw, do you think that xtabond2 is suitable for a (dynamic) linear probability model with a binary dependent variable? As far as I know, there is no Stata command for a dynamic logit model. I have seen people use xtabond2 to estimate models with a binary dependent variable in a dynamic setting.

One problem with the linear probability model is the possibility of fitted values outside the unit interval (e.g. negative fitted values), but studies have shown that, except in extreme situations (e.g. probabilities like 99% or 1%), the odds ratios are an almost linear function of the probability, which supports the use of the linear probability model. But I am not sure if this also holds for system GMM.

Hi Alex,
Could you please provide us with the references for the studies that used xtabond2 to estimate a model with a binary dependent variable?
In addition, please provide us with the reference for the studies mentioned in the second sentence I underlined.
Thank you in advance!
Netty
Slaven Savic

Join Date: Jul 2018

Posts: 9
#44

04 May 2019, 07:32

Dear,

I also have trouble regarding Hansen tests.

I am now doing several "experiments" with various instrumental variables. When I run the xtabond2 command with the original data (variables), the results are:

Arellano-Bond test for AR(1) in first differences: z = -2.10 Pr > z = 0.036
Arellano-Bond test for AR(2) in first differences: z = -0.98 Pr > z = 0.327
Hansen test of overid. restrictions: chi2(4) = 4.48 Prob > chi2 = 0.344
(Robust, but weakened by many instruments.)
Difference-in-Hansen tests of exogeneity of instrument subsets:
iv(lnfdi tdummyeu, eq(level))
Hansen test excluding group: chi2(2) = 1.47 Prob > chi2 = 0.480
Difference (null H = exogenous): chi2(2) = 3.02 Prob > chi2 = 0.221,

but, after introducing a new instrumental variable (which is in logarithms), the results are:

Arellano-Bond test for AR(1) in first differences: z = -1.85 Pr > z = 0.064
Arellano-Bond test for AR(2) in first differences: z = -0.91 Pr > z = 0.362
Hansen test of overid. restrictions: chi2(5) = 4.62 Prob > chi2 = 0.464
(Robust, but weakened by many instruments.)
Difference-in-Hansen tests of exogeneity of instrument subsets:
GMM instruments for levels
Hansen test excluding group: chi2(0) = 0.00 Prob > chi2 = .
Difference (null H = exogenous): chi2(5) = 4.62 Prob > chi2 = 0.464
iv(lnfdi tdummyeu ln_gov_right1, eq(level))
Hansen test excluding group: chi2(2) = 1.52 Prob > chi2 = 0.468
Difference (null H = exogenous): chi2(3) = 3.10 Prob > chi2 = 0.376.

Am I making a mistake? My doubt concerns the line "Hansen test excluding group: chi2(0) = 0.00 Prob > chi2 = ." under "GMM instruments for levels".

Thank you.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#45

05 May 2019, 03:17

Originally posted by Slaven Savic

GMM instruments for levels
Hansen test excluding group: chi2(0) = 0.00 Prob > chi2 = .

This test has 0 degrees of freedom. This means that after excluding the instruments for the levels model, the estimator is just identified. There are no overidentifying restrictions anymore that could be tested. In other words, with your model specification, you cannot test the validity of all overidentifying restrictions resulting from the levels model jointly unless you add more instruments for the first-differenced model.

There is not necessarily anything wrong with that as long as you are happy to go with the untested assumption that these instruments are valid. Since the overall Hansen test is fine, that could be justified.
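
The degrees-of-freedom bookkeeping behind this can be sketched with built-in Stata functions. The numbers are copied from the second output in post #44; the scalar names are only illustrative.

```stata
* Numbers copied from the second output in post #44; scalar names are illustrative.
* Run as a do-file (uses // comments).
scalar df_full  = 5   // overall Hansen test: chi2(5)
scalar df_group = 5   // difference test for "GMM instruments for levels": chi2(5)
scalar df_excl  = df_full - df_group

display "Degrees of freedom after excluding the group: " df_excl
if df_excl == 0 {
    display "Just identified without this group: nothing is left to test."
}

* The difference test then coincides with the overall Hansen test
display "p-value of the difference test: " %5.3f chi2tail(df_group, 4.62)
```

With zero degrees of freedom left, the "difference" line for this group carries the same statistic and p-value as the overall Hansen test, which is what the output shows (chi2(5) = 4.62 in both places).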

https://www.kripfganz.de/stata/