  • Overid test after etregress?

    Dear all,

    I am posting a discussion we have had with Mark in order to get some others' ideas on this. I would like to know from your experience if there is a test of overidentification after etregress (e.g. LR test) and what exactly the null hypothesis would be if so.
    ____________________________________________

    From: Ferreira Sequeda Maria
    Sent: 04 December 2014 15:51
    To: Schaffer, Mark
    Subject: Question about overid test after etregress

    Dear Mark,

    I am working with etregress (previously treatreg) and would like to know if there is a way to test the validity of my instrument. I read this from some time ago http://www.stata.com/statalist/archi.../msg00924.html and this http://www.stata.com/statalist/archi.../msg00924.html but I am still a bit confused.

    My model is, say,

    Y = a + b*T + c*X1 + d*X2
    T = f + g*X1 + i*X3

    X3 is my instrument.

    If I understand correctly, to run a LR overid test I should do

    (1) etregress Y X1 X2, treat(T X1 X3)
    est sto overid

    (2) etregress Y X1 X2 X3, treat(T X1 X2 X3)
    est sto justid

    lrtest justid overid, df(2)

    (2) means
    Y = a + b*T + c*X1 + d*X2 +e*X3
    T = f + g*X1 + h*X2 + i*X3

    What exactly is the null? I do not understand well what I am testing. According to your entries on Statalist, H0 is that e=0 … and probably also, in my case, h=0?
    If that is true, then I say that my instrument is valid if the p-value obtained is >0.05. Is that correct?

    How can I understand that LR assumes that overid is NESTED in justid?

    If I would do the following, is it incorrect?

    (3) etregress Y X1 X2, treat(T X1 X3)
    est sto overid1

    (4) etregress Y X1 X2, treat(T X1) ----> which I assume is the model just identified through the normality assumption (when no instrument is used)
    est sto justid1

    lrtest justid1 overid1, df(1)
    LR test would assume justid1 is NESTED in overid1

    In this case, would H0 not be that i=0, so that my instrument is valid if the p-value obtained is <0.05?

    I would appreciate it if you could help me clear up this issue.

    Best,

    Maria

    ___________________________
    From: Schaffer, Mark
    Sent: Thursday 4 December 2014 18:30
    To: Ferreira Sequeda Maria (ALGEC)
    Subject: RE: Question about overid test after etregress

    Maria,

    Short answer: I think it should be

    (1) etregress Y X1 X2, treat(T X1 X2)
    est sto justid

    (2) etregress Y X1 X2, treat(T X1 X2 X3)
    est sto overid

    In #1, everything goes in both the main and treatment equation. In #2, the excluded instrument X3 gets added to the treatment equation.

    But if you want to follow up, can you post this to Statalist and we can discuss there? That way others can benefit from the discussion, and/or you will get better or faster comments than from me.

    Best wishes,
    MS

  • #2
    Whoops ... that's wrong - sorry. In fact, even the syntax for etregress is wrong.

    Here is something that I think is right, but I'd welcome comments from other Statalisters.

    Code:
    etregress Y X1 X2 X3, treat(T = X1 X2 X3)
    est sto et1
    Code:
    etregress Y X1 X2, treat(T = X1 X2 X3)
    est sto et2
    And then you can test the exclusion restriction for X3 either by doing a Wald test using et1:

    Code:
    test [Y]X3
    Or you can do an LR test instead:

    Code:
    lrtest et1 et2
    But I confess I haven't given this as much thought as it deserves, so caveat emptor. The only other thing I can think of is pointing you to what looks like a relevant paper by Jeff Wooldridge:

    Wooldridge (2012),
    HTML Code:
    <a href="http://econ.msu.edu/faculty/wooldridge/docs/qmle_endog_r3.pdf">"Quasi-Maximum Likelihood Estimation and Testing for Nonlinear Models with Endogenous Explanatory Variables"</a>
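As a concrete illustration of the two tests above, here is a minimal sketch on a Stata example dataset. The dataset (union3) and the choice of south as the excluded variable are assumptions made for illustration only; they are not part of the original question. In the notation of the first post, both tests address H0: e = 0, i.e. that the excluded variable has no direct effect in the outcome equation, conditional on the functional-form assumptions of the model.

```stata
* Sketch only: union3 and the choice of south as the excluded variable
* are illustrative assumptions, not taken from the thread.
webuse union3, clear

* Unrestricted model: south enters both the outcome and the treatment equation
etregress wage age grade smsa black tenure south, treat(union = south black tenure)
estimates store et1

* Wald test of H0: the coefficient on south in the wage (outcome) equation is zero
test [wage]south

* Restricted model: south is excluded from the outcome equation
etregress wage age grade smsa black tenure, treat(union = south black tenure)
estimates store et2

* LR test of the same single restriction (1 degree of freedom)
lrtest et1 et2
```

The Wald test is run immediately after the unrestricted fit because test works on the active estimation results; lrtest only needs the two stored results.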


    • #3
      Hmmm... that hyperlink at the end wasn't very successful, was it?


      • #4
        Thanks, Mark, for your suggestions. It seems that these tests are for endogeneity and not for overidentification. Don't you think so?


        • #5
          Yes ... I thought the framework in the paper might also lead you to an overidentification test, but I confess I haven't read it closely enough to work that out for myself.

          There is another possibility for an overidentification test, namely a Hausman test. This is a bit tricky, because etregress is a nonlinear system estimator. And when I tried fiddling around with the hausman command, it seemed to use only the outcome equation. I think [sic!] this means that if you want to use this approach, you should use the two-step estimator and force the degrees of freedom to be 1 (=number of "endogenous regressors"). Estimate an exactly identified version (no exclusion restrictions, identification via functional form) vs. an overidentified version, and then call hausman:

          Code:
          etregress Y X1 X2, treat(T = X1 X2) twostep
          est sto et1
          
          etregress Y X1 X2, treat(T = X1 X2 X3) twostep
          est sto et2
          
          hausman et1 et2, df(1)
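For readers unfamiliar with what hausman reports, the contrast behind this comparison can be sketched as follows. The symbols are notation introduced here, not taken from the thread: $\hat\beta_1, \hat V_1$ denote the coefficient estimates and their covariance matrix from et1 (the just-identified fit), and $\hat\beta_2, \hat V_2$ those from et2 (the overidentified fit), restricted to the coefficients the two fits have in common.

```latex
H = (\hat\beta_1 - \hat\beta_2)'\,\bigl(\hat V_1 - \hat V_2\bigr)^{-1}\,(\hat\beta_1 - \hat\beta_2)
\;\overset{a}{\sim}\; \chi^2(1)
\quad \text{under } H_0
```

Here $H_0$ is that the exclusion of X3 from the outcome equation is valid, so that both fits are consistent and et2 is the more efficient one; the 1 degree of freedom is the value forced by df(1). In finite samples $\hat V_1 - \hat V_2$ need not be positive definite, in which case the reported statistic should be read with caution.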


          • #6
            Maria: Just a clarification, and this comes late, but other young researchers look at the forum and take it as a definitive source for fixing their problems. Your question and notation may be confusing, and I may be missing something, but when the treatment (T) is endogenous, you need to have (or at least assume) one valid instrument, i.e. achieve exact identification. Having assumed that, you can then test whether the over-identification restriction holds. So, if T is endogenous, you need something else in addition to X3, say X4, to test over-identification. As such, your just-identified equation (1) etregress Y X1 X2, treat(T X1 X2) is not really identified if T is endogenous, since it uses the variables X1 and X2 as instruments for T, and neither of those is excluded from the main regression (if I understand the etregress code correctly...).


            • #7
              Marco, I'm not sure this is the case, or maybe I misunderstand your point. Because this is a nonlinear model, the functional form provides an identifying restriction. To take a modified version of an example from the etregress help file:

              Code:
              webuse union3
              etregress wage age black tenure, treat(union = black tenure) twostep
              There are no exclusion restrictions; both black and tenure appear in the main equation as well as in the selection equation. But the model is still identified and etregress reports output instead of exiting with an error.

              The same idea works with a Heckman-type selection bias estimation, as discussed e.g. here:

              http://www.stata.com/statalist/archi.../msg01390.html

              But as that brief discussion points out, identification through functional form alone is usually a bit dubious, and your advice - make use of an exclusion restriction (if you can find a plausible one!) is sound. Still, identification just via functional form might be OK as part of some specification testing, which was what I was trying to get at.


              • #8
                Makes sense, thanks for clarifying!


                • #9
                  Dear Mark,

                  Would you mind clarifying whether you think the correct syntax is I) or II) (or both) from the examples below? From your post above (5 Dec 2014, 10:43) you seem to suggest I) is incorrect.

                  Best wishes

                  Paul FV


                  I)
                  etregress Y X1 X2, treat(T = X1 X2) twostep // Just-Identified Model
                  est sto et1

                  etregress Y X1 X2, treat(T = X1 X2 X3) twostep // Over-Identified Model
                  est sto et2

                  hausman et1 et2, df(1)


                  II)
                  etregress Y X1 X2 X3, treat(T = X1 X2 X3) twostep // Just-Identified Model
                  est sto et1

                  etregress Y X1 X2, treat(T = X1 X2 X3) twostep // Over-Identified Model
                  est sto et2

                  hausman et1 et2, df(1)






                  • #10
                    Paul,

                    The etregress syntax allows for two different sets of independent variables:

                    etregress depvar [indepvars], treat(depvar_t = indepvars_t)

                    It's common for the first set of indep vars to be used in the second set as well, so another way to write it would be

                    etregress depvar [indepvars], treat(depvar_t = indepvars additional_indepvars_t)

                    and that's how I think about it at any rate. Written this way, additional_indepvars_t are like "excluded instruments" in an IV estimation.

                    What's different about etregress versus linear IV is that you can leave additional_indepvars_t empty and, instead of an underidentified model (which would be the case with linear IV), you get exact identification from the (probit, nonlinear) functional form. In other words,

                    etregress depvar [indepvars], treat(depvar_t = indepvars)

                    is legitimate but dubious because it relies on identification through functional form.

                    That said, it gives a route to an overidentification test via a test of this just-identified version vs an overidentified version (where additional_indepvars_t isn't empty): estimate the just-identified version, estimate the overidentified version, and compare (statistically). This corresponds to syntax (I) in your example above.

                    --Mark


                    • #11
                      Dear Mark,
                      I have the same questions as Maria. Your answers were short, and I'd like to know whether it is possible to get answers to some further questions. I have the same problem: one instrumental variable. I use eprobit, and the LR test is available as a postestimation command. I'd like to use it to verify my instrument's validity. So, what is the null hypothesis? When can I say that my instrument is valid?

                      eprobit Y X1 X2, entreat(T = X1 X2)
                      est store trjustid
                      eprobit Y X1 X2, entreat(T = X1 X2 X3)
                      est store troverid
                      lrtest troverid trjustid

                      The LR statistic is not significant, with Pr > chi2 = 0.30. What does this say about the validity of my instrument, please?

                      THANKS
