IV-RE and C statistic

Tunga Kantarci

Join Date: Oct 2015

Posts: 90
#1

19 Jul 2018, 12:26

xtivreg2 dependent (endogenous_variable = instrument_1 instrument_2 instrument_3) other_exogenous_variable, fe cluster(panel_id) orthog(instrument_3)

estimates an IV-FE regression and gives the C statistic for instrument_3.

Is there a way to obtain the C statistic after estimating an IV-RE regression?
Tunga Kantarci

Join Date: Oct 2015

Posts: 90
#2

20 Jul 2018, 08:09

I figure that the C statistic is straightforward to compute. One obtains two J statistics from two regressions, one using the unrestricted (full) set of instruments and the other using the restricted set. Then take the difference and check the corresponding chi-squared p-value. Since xtoverid, noi gives the J statistic after xtivreg, re, the C statistic can easily be calculated. xtivreg2, fe produces the C statistic automatically when the orthog() option is specified.

This said, there is one remaining question.

In Output 1 below I use 6 instruments and calculate the J statistic using xtivreg2, fe. It is 1.721.
In Output 2 below I use 3 instruments and calculate the J statistic using xtivreg2, fe. It is 0.725.

Then I thought the C statistic would be 1.721-0.725. To verify this,

In Output 3 below I use 6 instruments and calculate the C statistic using xtivreg2, fe orthog(ELAP2 ELAP3 ELAP4).

The C statistic is calculated as 1.721-0.044. Where is this 0.044 coming from? I expected 0.725 to be used instead of 0.044. Where is the mistake?

Output 1
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments): 1.721
Chi-sq(4) P-val = 0.7869
Instrumented: HHHP HHHF
Included instruments: AII AII2
Excluded instruments: ELA2 ELA3 ELA4 ELAP2 ELAP3 ELAP4
------------------------------------------------------------------------------

Output 2
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments): 0.725
Chi-sq(1) P-val = 0.3944
Instrumented: HHHP HHHF
Included instruments: AII AII2
Excluded instruments: ELA2 ELA3 ELA4
------------------------------------------------------------------------------

Output 3
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments): 1.721
Chi-sq(4) P-val = 0.7869
-orthog- option:
Hansen J statistic (eqn. excluding suspect orthog. conditions): 0.044
Chi-sq(1) P-val = 0.8341
C statistic (exogeneity/orthogonality of suspect instruments): 1.677
Chi-sq(3) P-val = 0.6420
Instruments tested: ELAP2 ELAP3 ELAP4
Instrumented: HHHP HHHF
Included instruments: AII AII2
Excluded instruments: ELA2 ELA3 ELA4 ELAP2 ELAP3 ELAP4
------------------------------------------------------------------------------
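
For reference, the three outputs above could come from calls along the following lines. This is only a sketch: y and panel_id are hypothetical placeholders for the dependent variable and the panel identifier (neither appears in the output), the cluster() option is assumed from the command in post #1, and the stored-result names e(j), e(cstat) and e(cstatp) are as I recall them from the ivreg2 help file, so check ereturn list before relying on them.

```stata
* Sketch only: y and panel_id are hypothetical names. The data are assumed
* to be xtset already and xtivreg2 to be installed (ssc install xtivreg2).

* Output 1: full instrument set, J statistic with 4 degrees of freedom
xtivreg2 y AII AII2 (HHHP HHHF = ELA2 ELA3 ELA4 ELAP2 ELAP3 ELAP4), fe cluster(panel_id)
scalar J_full = e(j)

* Output 2: restricted instrument set, model re-estimated, J with 1 degree of freedom
xtivreg2 y AII AII2 (HHHP HHHF = ELA2 ELA3 ELA4), fe cluster(panel_id)
scalar J_restr = e(j)

* Difference of the two separately estimated J statistics (can be negative)
display "J difference = " scalar(J_full) - scalar(J_restr)

* Output 3: C statistic as computed internally through orthog()
xtivreg2 y AII AII2 (HHHP HHHF = ELA2 ELA3 ELA4 ELAP2 ELAP3 ELAP4), fe cluster(panel_id) orthog(ELAP2 ELAP3 ELAP4)
display "C = " e(cstat) ", p-value = " e(cstatp)
```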
Sebastian Kripfganz

Join Date: May 2014

Posts: 2585
#3

20 Jul 2018, 08:44

I suspect that xtivreg2 computes the "Hansen J statistic (eqn. excluding suspect orthog. conditions)" by removing the respective moments ex post, after estimating the full model, instead of re-estimating the reduced model. Asymptotically, under the null hypothesis of correct model specification, both approaches are equivalent. In finite samples, the estimates differ, as in your case.

Thus, whether you subtract 0.725 or 0.044, you obtain an asymptotically valid test statistic, and the two versions are asymptotically equivalent.

https://www.kripfganz.de/stata/
Tunga Kantarci

Join Date: Oct 2015

Posts: 90
#4

20 Jul 2018, 09:19

Thanks Sebastian. I am not so clear about what this implies for how one should calculate the statistic in a finite sample case. 1-chi2(3,1.721-0.725) will give 0.8022 while 1-chi2(3,1.721-0.044) will give 0.6420.
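
The two p-values can be reproduced directly in Stata. A small sketch, using only the numbers reported in the outputs of post #2; chi2tail(df, x) is the upper-tail chi-squared probability, that is, the same quantity as 1-chi2(df, x), and the 3 degrees of freedom are the 4 of the full J statistic minus the 1 of the restricted one.

```stata
* C statistic from two separately estimated models (Output 1 minus Output 2)
display %6.4f chi2tail(3, 1.721 - 0.725)

* C statistic as reported by xtivreg2 with orthog() (Output 3)
display %6.4f chi2tail(3, 1.721 - 0.044)
```

The two lines should display the 0.8022 and 0.6420 quoted above.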
Sebastian Kripfganz

Join Date: May 2014

Posts: 2585
#5

21 Jul 2018, 04:51

It is not a priori clear which of the two versions is "better" in finite samples. The way xtivreg2 computes the test statistic has the advantage that the C statistic is guaranteed to be nonnegative. If you instead estimate the two models separately, it can happen (and it occasionally does happen) that the C statistic turns out to be negative in finite samples, which makes it useless.

In some cases, the xtivreg2 computation also comes with a disadvantage. If some of the removed moment conditions / instruments are necessary for the identification of some coefficients in the model, i.e. those coefficients would be unidentified due to orthogonality of the corresponding regressors to the remaining moment conditions, then xtivreg2 would not detect it and would compute the degrees of freedom for the restricted J statistic and the C statistic incorrectly. (The degrees of freedom need to be further reduced by the number of unidentified coefficients.)

Whether one or the other has better finite-sample size / power properties would have to be investigated with Monte Carlo simulations.

As a side note: The same discussion applies to the computation of the Difference-in-Hansen statistic by xtabond2.

Last edited by Sebastian Kripfganz; 21 Jul 2018, 04:59. Reason: Paragraph about disadvantage of xtivreg2 computation and side note about xtabond2 added.

https://www.kripfganz.de/stata/
Tunga Kantarci

Join Date: Oct 2015

Posts: 90
#6

21 Jul 2018, 09:12

Thanks for the follow-up. The documentation of ivreg2 and Hayashi's book clarify a couple of points. My understanding of the enclosed excerpt from Hayashi is that the statistic is guaranteed to be non-negative if the same weighting matrix, the one based on the full set of instruments, is used to calculate both the J statistic for the full set of instruments and the J statistic for the restricted set. The documentation of ivreg2 seems to confirm this: "To guarantee that the C statistic is non-negative in finite samples, the estimated covariance matrix of the full set orthogonality conditions is used to calculate both Sargan-Hansen statistics (in the case of simple IV/2SLS, this amounts to using the MSE from the unrestricted equation to calculate both Sargan statistics)."
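
In symbols, a sketch of this construction as I read it. The notation is my own and not necessarily Hayashi's: g_n(θ) collects the K sample moments from the full instrument set, g_{1n}(θ) the K_1 moments that remain after dropping the suspect instruments, Ŝ is the estimated covariance matrix of the full set of moments, and Ŝ_{11} is its block for the retained moments. θ̂ minimizes the first quadratic form and θ̃ the second.

```latex
\begin{aligned}
J   &= n\, g_n(\hat\theta)' \,\hat S^{-1}\, g_n(\hat\theta), \\
J_1 &= n\, g_{1n}(\tilde\theta)' \,\hat S_{11}^{-1}\, g_{1n}(\tilde\theta), \\
C   &= J - J_1 \;\ge\; 0, \qquad C \xrightarrow{d} \chi^2(K - K_1) \ \text{under the null.}
\end{aligned}
```

The inequality relies on Ŝ_{11} being a block of the same Ŝ. If the restricted model were instead given its own separately estimated covariance matrix, nothing would prevent J_1 from exceeding J in a finite sample.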

Hayashi's sentence "This is accomplished by taking the following steps: ..." right at the end of the same paragraph where he explains how the test can be made non-negative is confusing, however. I guess this sentence has nothing to do with the discussion on making the statistic non-negative, but that it just explains how the possibly negative statistic is calculated in two steps.

Two questions still remain.

1. I struggle with how I should choose between the possibly negative and the non-negative version of the Difference-in-Sargan statistic. 0.8022 and 0.6420 in my previous post are just two different p-values. Suppose that the p-values were 0.15 and 0.04 (instead of 0.8022 and 0.6420). In the former case I would fail to reject the null hypothesis and in the latter case I would reject it. What is one supposed to conclude in this case?

2. Coming back to my original question. Is there a way to obtain the non-negative version of the Difference-in-Sargan statistic in the IV-RE model using xtivreg, re?

Last edited by Tunga Kantarci; 21 Jul 2018, 09:15.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2585
#7

21 Jul 2018, 10:28

Thanks for referencing Hayashi. That's useful.
If one rejects but the other does not, this would be interpreted as inconclusive evidence. You should not treat such a p-value as a hard fact in any case; just be cautious in your interpretation. It is dangerous to use a fixed cut-off value of, say, 5% to come up with an ultimate answer. If the p-value is 0.049, you can hardly be more confident about rejecting the null hypothesis than if the p-value is 0.051.

As far as I can tell from the estimation results stored by xtivreg, it does not seem possible to compute the C statistic from those results.

https://www.kripfganz.de/stata/
Tunga Kantarci

Join Date: Oct 2015

Posts: 90
#8

21 Jul 2018, 11:10

1. The issue arises if the difference between the two p-values is large in an empirical application. If one p-value is 0.30 and the other is 0.04, then this is odd. If this is acknowledged somewhere, I am fine.

2. Yes. It seems the possibly negative version of the test can be computed using the J statistics produced by xtoverid after each IV-RE regression. But if the statistic is negative, I guess I will need to program the non-negative version myself from scratch. I would appreciate it if the authors of xtivreg2 could comment on whether there is a way around this.
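
A sketch of the possibly negative version described in point 2, using the placeholder variable names from post #1. It assumes that the data are xtset, that xtoverid is installed, and that xtoverid leaves its statistic and degrees of freedom in r(j) and r(jdf). I have not verified those names here, so run return list after xtoverid to confirm them.

```stata
* Sketch only: variable names are the placeholders from post #1.
* Assumes xtoverid returns r(j) and r(jdf); confirm with -return list-.

* Full instrument set
xtivreg dependent other_exogenous_variable (endogenous_variable = instrument_1 instrument_2 instrument_3), re
xtoverid, noisily
scalar J_full  = r(j)
scalar df_full = r(jdf)

* Restricted set: the suspect instrument_3 is dropped and the model re-estimated
xtivreg dependent other_exogenous_variable (endogenous_variable = instrument_1 instrument_2), re
xtoverid, noisily
scalar J_restr  = r(j)
scalar df_restr = r(jdf)

* Difference of the two J statistics; not guaranteed to be non-negative
scalar Cstat = scalar(J_full) - scalar(J_restr)
scalar Cdf   = scalar(df_full) - scalar(df_restr)
if scalar(Cstat) >= 0 {
    display "C = " scalar(Cstat) ", df = " scalar(Cdf) ", p-value = " chi2tail(scalar(Cdf), scalar(Cstat))
}
else {
    display "C = " scalar(Cstat) " is negative; this version of the test is not usable here"
}
```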
Sebastian Kripfganz

Join Date: May 2014

Posts: 2585
#9

21 Jul 2018, 11:21

I agree that this would be unfortunate in an empirical application. Such a large difference between the two versions might indicate difficulties in estimating the weighting matrix, possibly due to a large correlation among the instruments. In that regard, I would try to check whether the model specification could be improved. If you have to decide between the two statistics, I would probably go with the one that yields the smaller p-value and thus a more conservative test decision.

https://www.kripfganz.de/stata/
Tunga Kantarci

Join Date: Oct 2015

Posts: 90
#10

21 Jul 2018, 11:25

1. Agree. The researcher is obliged to report both in this case anyway.
Tunga Kantarci

Join Date: Oct 2015

Posts: 90
#11

22 Jul 2018, 07:09

The following steps might allow calculating the non-negative version of the Difference-in-Sargan statistic for the IV-RE model. Please let the thread know if this is incorrect.

1. Carry out the xtivreg, re regression using the full instrument set. Use xtoverid, noi to obtain the J statistic based on the full instrument set.

2. Save all the GLS-transformed variables (used for RE estimation) after xtoverid. This is possible.

3. Obtain the estimated covariance matrix of the full set of orthogonality conditions, or its inverse, which is the weighting matrix, after xtoverid. Is this possible?

4. Use the "gmm" command to estimate the IV regression on the GLS-transformed variables from step 2, but with only the restricted set of instruments. The "gmm" command allows the user to specify a weighting matrix; use the weighting matrix from step 3, which is based on the full instrument set. This step guarantees that the Difference-in-Sargan statistic is non-negative because the same weighting matrix is used in step 1 and in step 4, according to Hayashi (see the above post).

5. Use estat overid to obtain the J statistic based on the restricted instrument set.

6. Take the difference between the two J statistics from step 1 and step 5. This is the non-negative version of the Difference-in-Sargan statistic.

I have two questions:

1. Is it possible to extract the weighting matrix in step 3 after xtoverid?

2. Is it possible to adjust the described steps to make the Difference-in-Sargan statistic robust?
Sebastian Kripfganz

Join Date: May 2014

Posts: 2585
#12

22 Jul 2018, 07:56

Step 2: Are you manually creating the GLS-transformed variables based on the variance estimates?
Step 3: You would need to find out how xtoverid technically computes the J statistic. According to its help file, it uses the ivreg2 command. The latter actually seems to store a weighting matrix under e(W). If you manage to replicate an xtivreg, re estimation with ivreg2, it would be great if you could post an example here.

All you need for a robust C statistic is a robust J statistic for the full model, isn't it? xtoverid has a robust option and ivreg2 has a cluster() option.

https://www.kripfganz.de/stata/
Tunga Kantarci

Join Date: Oct 2015

Posts: 90
#13

22 Jul 2018, 08:36

Step 2. No. xtoverid itself creates the transformed variables. That is why in the output of xtoverid variable names are changed by xtoverid.

Step 3. I am guessing that carrying out ivreg2 using the transformed variables should give what xtivreg, re gives. Then I could obtain the W matrix. Thank you indeed for pointing this out. If this works, I will post the example here.

It might well be possible to obtain the robust version as you suggest.

If one of the authors of ivreg2 would step in here to confirm the steps, it would be great.