Overid test after etregress?

MariaF

Join Date: Dec 2014

Posts: 2
#1


05 Dec 2014, 04:37

Dear all,

I am posting a discussion I had with Mark in order to get others' ideas on this. I would like to know, from your experience, whether there is an overidentification test after etregress (e.g. an LR test) and, if so, what exactly the null hypothesis would be.
____________________________________________

From: Ferreira Sequeda Maria
Sent: 04 December 2014 15:51
To: Schaffer, Mark
Subject: Question about overid test after etregress

Dear Mark,

I am working with etregress (previously treatreg) and would like to know if there is a way to test the validity of my instrument. I read this post from some time ago http://www.stata.com/statalist/archi.../msg00924.html and this one http://www.stata.com/statalist/archi.../msg00924.html but I am still a bit confused.

My model is, say,

Y = a + b*T + c*X1 + d*X2
T = f + g*X1 + i*X3

X3 is my instrument.

If I understand correctly, to run an LR overid test I should do

(1) etregress Y X1 X2, treat(T X1 X3)
est sto overid

(2) etregress Y X1 X2 X3, treat(T X1 X2 X3)
est sto justid

lrtest justid overid, df(2)

(2) means
Y = a + b*T + c*X1 + d*X2 +e*X3
T = f + g*X1 + h*X2 + i*X3

What exactly is the null? I do not fully understand what I am testing. According to your entries on Statalist, H0 is that e=0 … and probably, in my case, also h=0?
If that is true, then I conclude that my instrument is valid if the p-value obtained is > 0.05. Is that correct?

How can I understand that LR assumes that overid is NESTED in justid?

If I would do the following, is it incorrect?

(3) etregress Y X1 X2, treat(T X1 X3)
est sto overid1

(4) etregress Y X1 X2, treat(T X1) ----> which I assume is the model that is just identified through the normality assumption (when no instrument is used)
est sto justid1

lrtest justid1 overid1, df(1)
LR test would assume justid1 is NESTED in overid1

Would it not be the case here that H0 is i=0, so that my instrument is valid if the p-value obtained is < 0.05?

I would appreciate it if you could help me clear up this issue.

Best,

Maria

___________________________
From: Schaffer, Mark
Sent: Thursday 4 December 2014 18:30
To: Ferreira Sequeda Maria (ALGEC)
Subject: RE: Question about overid test after etregress

Maria,

Short answer: I think it should be

(1) etregress Y X1 X2, treat(T X1 X2)
est sto justid

(2) etregress Y X1 X2, treat(T X1 X2 X3)
est sto overid

In #1, everything goes in both the main and treatment equation. In #2, the excluded instrument X3 gets added to the treatment equation.

But if you want to follow up, can you post this to Statalist and we can discuss there? That way others can benefit from the discussion, and/or you will get better or faster comments than from me.

Best wishes,
MS
Mark Schaffer

Join Date: Mar 2014

Posts: 324
#2

05 Dec 2014, 10:43

Whoops ... that's wrong - sorry. In fact, even the syntax for etregress is wrong.

Here is something that I think is right, but I'd welcome comments from other Statalisters.

Code:

etregress Y X1 X2 X3, treat(T = X1 X2 X3)
est sto et1

Code:

etregress Y X1 X2, treat(T = X1 X2 X3)
est sto et2

And then you can test the exclusion restriction for X3 either by doing a Wald test using et1:

Code:

test [Y]X3

Or you can do an LR test instead:

Code:

lrtest et1 et2
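
For anyone who wants to try this on real data, here is a minimal sketch using the union3 dataset from the etregress help file. Treating south as the excluded instrument is an assumption borrowed from the help-file example, not something taken from Maria's model, and the names et_unres and et_res are just illustrative labels. In both tests the null is that the coefficient on south in the wage (outcome) equation is zero; convergence of the unrestricted model is not guaranteed on other data.

```stata
* Sketch only: south plays the role of X3 (assumed excluded instrument)
webuse union3, clear

* Unrestricted model: south enters both the outcome and the treatment equation
etregress wage age grade smsa black tenure south, treat(union = south black tenure)
estimates store et_unres

* Wald test of the exclusion restriction, H0: [wage]south = 0
test [wage]south

* Restricted model: south is excluded from the outcome equation
etregress wage age grade smsa black tenure, treat(union = south black tenure)
estimates store et_res

* LR test of the same restriction (one restriction, so one degree of freedom)
lrtest et_unres et_res
```

A small p-value from either test is evidence against excluding south from the outcome equation.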

But I confess I haven't given this as much thought as it deserves, so caveat emptor. The only other thing I can think of is pointing you to what looks like a relevant paper by Jeff Wooldridge:

Wooldridge (2012),

HTML Code:

<a href="http://econ.msu.edu/faculty/wooldridge/docs/qmle_endog_r3.pdf">"Quasi-Maximum Likelihood Estimation and Testing for Nonlinear Models with Endogenous Explanatory Variables"</a>
Comment
Mark Schaffer

Join Date: Mar 2014

Posts: 324
#3

05 Dec 2014, 10:44

Hmmm... that hyperlink at the end wasn't very successful, was it?
Comment
MariaF

Join Date: Dec 2014

Posts: 2
#4

07 Dec 2014, 06:44

Thanks, Mark, for your suggestions. It seems that these tests are for endogeneity and not for overidentification. Don't you think so?
Comment
Mark Schaffer

Join Date: Mar 2014

Posts: 324
#5

07 Dec 2014, 14:20

Yes ... I thought the framework in the paper might also lead you to an overidentification test, but I confess I haven't read it closely enough to work that out for myself.

There is another possibility for an overidentification test, namely a Hausman test. This is a bit tricky, because etregress is a nonlinear system estimator. And when I tried fiddling around with the hausman command, it seemed to use only the outcome equation. I think [sic!] this means that if you want to use this approach, you should use the two-step estimator and force the degrees of freedom to be 1 (=number of "endogenous regressors"). Estimate an exactly identified version (no exclusion restrictions, identification via functional form) vs. an overidentified version, and then call hausman:

Code:

etregress Y X1 X2, treat(T = X1 X2) twostep
est sto et1
etregress Y X1 X2, treat(T = X1 X2 X3) twostep
est sto et2
hausman et1 et2, df(1)
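
For reference, a sketch of the statistic behind the hausman call, in generic notation. The symbols b_1, V_1 (coefficients and covariance from the just-identified fit et1) and b_2, V_2 (from the overidentified fit et2) are my own labels, not names from the Stata output:

```latex
H = (b_1 - b_2)' \, (V_1 - V_2)^{-1} \, (b_1 - b_2)
\;\overset{a}{\sim}\; \chi^2(q) \quad \text{under } H_0
```

Under the null both estimators are consistent and the overidentified one is the more efficient, so a large H points to a problem with the extra restriction. By default hausman takes q from the rank of V_1 - V_2; the df(1) option overrides that and forces q = 1. In finite samples V_1 - V_2 need not be positive definite, so the reported statistic should be read with some care.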
Comment
Marco Ceccagnoli

Join Date: May 2016

Posts: 2
#6

23 May 2016, 16:04

Maria: Just a clarification, and it comes late, but other young researchers read this forum and take it as a definitive source for fixing their problems. Your question and notation may be confusing, and I may be missing something, but when the treatment (T) is endogenous you need to have (or at least assume) one valid instrument, i.e. achieve exact identification. Having assumed that, you can then test whether the overidentifying restriction holds. So, if T is endogenous, you need something else in addition to X3, say X4, to test overidentification. As such, your just-identified equation (1) etregress Y X1 X2, treat(T X1 X2) is not really identified if T is endogenous, since it uses the variables X1 and X2 as instruments for T and neither of them is excluded from the main regression (if I understand the etregress code correctly...).
Comment
Mark Schaffer

Join Date: Mar 2014

Posts: 324
#7

23 May 2016, 16:59

Marco, I'm not sure this is the case, or maybe I misunderstand your point. Because this is a nonlinear model, the functional form provides an identifying restriction. To take a modified version of an example from the etregress help file:

Code:

webuse union3
etregress wage age black tenure, treat(union = black tenure) twostep

There are no exclusion restrictions; both black and tenure appear in the main equation as well as in the selection equation. But the model is still identified and etregress reports output instead of exiting with an error.
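
A sketch of why this works, in generic notation (x for the main-equation regressors, z for the treatment-equation regressors, delta for the treatment effect; the symbols are mine, not etregress output names). Under the model's joint normality assumption, the conditional means of the outcome are:

```latex
E[y \mid T = 1, x, z] = x\beta + \delta + \rho\sigma \, \frac{\phi(z\gamma)}{\Phi(z\gamma)}
\qquad
E[y \mid T = 0, x, z] = x\beta - \rho\sigma \, \frac{\phi(z\gamma)}{1 - \Phi(z\gamma)}
```

The hazard terms are nonlinear functions of z\gamma, so they are not perfectly collinear with x even when z contains nothing beyond x. That nonlinearity is the only source of identification when there is no exclusion restriction, and it can be weak if the hazard is close to linear over the observed range of z\gamma.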

The same idea works with a Heckman-type selection bias estimation, as discussed e.g. here:

http://www.stata.com/statalist/archi.../msg01390.html

But as that brief discussion points out, identification through functional form alone is usually a bit dubious, and your advice - make use of an exclusion restriction (if you can find a plausible one!) is sound. Still, identification just via functional form might be OK as part of some specification testing, which was what I was trying to get at.
Comment
Marco Ceccagnoli

Join Date: May 2016

Posts: 2
#8

24 May 2016, 06:56

Makes sense, thanks for clarifying!
Comment
Paul Fenton Villar

Join Date: Jul 2019

Posts: 1
#9

01 Jul 2019, 03:48

Dear Mark,

Would you mind clarifying whether you think the correct syntax is I) or II) (or both) from the examples below? From your post above (5 Dec 2014, 10:43) you seem to suggest I) is incorrect.

Best wishes

Paul FV

I)

etregress Y X1 X2, treat(T = X1 X2) twostep // Just-Identified Model
est sto et1

etregress Y X1 X2, treat(T = X1 X2 X3) twostep // Over-Identified Model
est sto et2

hausman et1 et2, df(1)

II)

etregress Y X1 X2 X3, treat(T = X1 X2 X3) twostep // Just-Identified Model
est sto et1

etregress Y X1 X2, treat(T = X1 X2 X3) twostep // Over-Identified Model
est sto et2

hausman et1 et2, df(1)
Comment
Mark Schaffer

Join Date: Mar 2014

Posts: 324
#10

01 Jul 2019, 08:03

Paul,

The etregress syntax allows for two different sets of independent variables:

etregress depvar [indepvars], treat(depvar_t = indepvars_t)

It's common for the first set of indep vars to be used in the second set as well, so another way to write it would be

etregress depvar [indepvars], treat(depvar_t = indepvars additional_indepvars_t)

and that's how I think about it at any rate. Written this way, additional_indepvars_t are like "excluded instruments" in an IV estimation.

What's different about etregress vs. linear IV is that you can leave additional_indepvars_t empty and, instead of an underidentified model (which would be the case with linear IV), you get exact identification from the (probit, nonlinear) functional form. In other words,

etregress depvar [indepvars], treat(depvar_t = indepvars)

is legitimate but dubious because it relies on identification through functional form.

That said, it gives a route to an overidentification test via a test of this just-identified version vs. an overidentified version (where additional_indepvars_t isn't empty): estimate the just-identified version, estimate the overidentified version, and compare (statistically). This corresponds to syntax (I) in your example above.

--Mark
Comment
Salimata TRAORE

Join Date: May 2020

Posts: 1
#11

17 May 2020, 16:49

Dear Mark,
I have the same questions as Maria. Your answers were short, and I'd like to know whether it is possible to have answers to her other questions. I have the same problem: one instrumental variable. I use eprobit, and the LR test is a possible postestimation command. I'd like to use it to check my instrument's validity. So, what is the null hypothesis? When can I say that my instrument is valid?

eprobit Y X1 X2, entreat(T = X1 X2)
est store trjustid
eprobit Y X1 X2, entreat(T = X1 X2 X3)
est store troverid
lrtest troverid trjustid

The LR statistic is not significant, with Prob > chi2 = 0.30. What does this say about the validity of my instrument, please?

THANKS