  • Testing differences between FE and FEIV models

    Hi, I was hoping to get some help with Stata code for a test that compares FE and FEIV models.
    In particular, how would one test the difference between the FE and FE-IV estimators using the Davidson and MacKinnon (1993) test of exogeneity?
    The null hypothesis of this test is that the fixed effects estimator of the specification is consistent. Just to clarify, I am using xtivreg2.
    Thank you so much!
    Last edited by Janet Lewis; 02 Jun 2020, 16:35.

  • #2
    xtivreg2 has such tests built in.

    • #3
      Thank you Phil. I also came across the DMEXOGXT command.
      I have estimated a model with FE and instrumented two independent variables. Oddly, after instrumenting, the coefficient signs on the two endogenous variables act up in the FEIV results. I tested for multicollinearity and it is the obvious culprit; it is not an issue in the FE estimates. Neither instrument is weak, though I want to confirm that the FE estimates are consistent and so preferable to the FEIV ones. Which test in the xtivreg2 output should I be looking at? Hope this makes sense; I have not posted here very often.

      • #4
        Janet: Do you have a balanced panel? You have to be a bit careful if it's unbalanced, in the sense that you should always use just the complete cases at every step.

        In the balanced case, you can get the following from my 2010 MIT Press book on page 354, but it doesn't exactly jump off the page. Let w1 and w2 be the two suspected endogenous explanatory variables, x the control variables, and z the IVs. First, estimate the first stages by regular FE and obtain the FE residuals. Then add these residuals to a final FE estimation and do a robust, joint test.

        Code:
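        * First stages by FE; save the idiosyncratic (within) residuals
        xtreg w1 x z i.year, fe
        predict w1ddh, e
        xtreg w2 x z i.year, fe
        predict w2ddh, e
        * Final FE estimation with both residuals added, then a robust joint test
        xtreg y w1 w2 w1ddh w2ddh x i.year, fe vce(cluster id)
        test w1ddh w2ddh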
        If you cannot reject, then you are not rejecting FE. If you do reject, you prefer FEIV (subject to the usual problems of significance tests). The final FE estimation should give you the FEIV estimates on w1, w2, x, and the year dummies. If it does not, you did something wrong. We now call this the "variable addition test" or "control function test." It's more robust than the usual Hausman test, and the proper degrees of freedom are easy to compute.

        I teach this method now whenever I give a short course. You can find it here as well: https://www.cemmap.ac.uk/event/id/1056 or in any of the cemmap courses I've given recently.

        JW

        • #5
          Thanks Jeff. I am using an unbalanced panel, though.

          • #6
            Sorry for the multiple replies; I seem to be having issues with posting on here (using Chrome). Can I ask what exactly you mean by complete cases? No missing values? Or should I be using e(sample)? Thanks.

            • #7
              Hi Janet:

              Your comment about e(sample) made me think of an easy way to generate the complete cases indicator. I used to do it in a clumsier way.

              So here is what you do in the unbalanced case. The trick is to make sure you use the same data for each of the three steps, and that is ensured by using e(sample) after xtivreg.

              Code:
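              * FEIV fit defines the common estimation sample
              xtivreg y (w1 w2 = z) x i.year, fe vce(cluster id)
              gen s = e(sample)
              * First stages by FE on that sample; save the residuals
              xtreg w1 x z i.year if s, fe
              predict w1ddh if s, e
              xtreg w2 x z i.year if s, fe
              predict w2ddh if s, e
              * Final FE estimation with the residuals added, then the robust joint test
              xtreg y w1 w2 w1ddh w2ddh x i.year if s, fe vce(cluster id)
              test w1ddh w2ddh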
              A recent student of mine, Riju Joshi, and I covered this test in the unbalanced case in Section 6 of the paper at the end of this link:

              Joshi and Wooldridge (2019)
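
              A quick way to check the claim in #4 that the final FE estimation reproduces the FEIV point estimates, offered as a sketch rather than as part of the original post. It assumes the steps above have just been run on the same sample, that the final regression used the fe option, and that it is still the active estimation result. The stored-result names cf and feiv are arbitrary.

              Code:
              * store the control-function fit (assumed to be the most recent estimation)
              estimates store cf
              * refit FEIV and store it under a second name
              xtivreg y (w1 w2 = z) x i.year, fe vce(cluster id)
              estimates store feiv
              * the coefficients on w1 and w2 should agree across the two columns
              estimates table feiv cf, keep(w1 w2) b(%9.4f)

              Only the point estimates are compared here; if they differ, the most likely cause is that the steps were not all run on the same observations.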

              • #8
                Thank you Jeff, this is really very helpful. I also appreciate you sharing these resources. I had another question, which I was going to start a new thread for, but I was hoping you could help as it is related to this. I have also been using your textbook as my holy grail for learning econometrics, so I am a little starstruck by your response!

                I have two models, for men and for women, with a measure of wellbeing as the dependent variable. The main independent variables are own retirement and partner's retirement. I also have controls for both partners' age and age squared, health, and year dummies. I run a linear fixed effects model. Essentially, my model is close to the one in this paper: https://doi.org/10.1016/j.labeco.2020.101810
                I instrument each partner's retirement with pension eligibility age, which varies by cohort in the country context I am looking at (e.g. some men will be eligible at 65, others at 65.5; for women there has been a reform, hence pension ages range from 60 to 65.5 for different individuals). I do not have strict cut-offs as such, as in Bonsang and van Soest's paper. I only have working or retired individuals in my sample.
                There is a concern about multicollinearity, as partners retire around the same time. I checked for this, and collinearity between own and partner's retirement is around 0.2 in the FE models, which seems OK. When I run the FEIV models, though, it goes up to 0.8, which is obviously a problem. The instruments are not weak. I wonder if this happens because I am "making" the variables collinear by using xtivreg2.

                Code:
                xtivreg2 y (own_retirement partner_retirement = own_pension_eligibility partners_pension_eligibility) x i.year, fe cluster(id)

                Stata runs two first-stage regressions, but own retirement is instrumented with own pension eligibility AND partner's pension eligibility, and the same goes for partner's retirement. I wonder whether, if I manually run the first-stage regressions so that own retirement is instrumented only by own eligibility and partner's retirement only by partner's eligibility, and then run the second stage with the predicted values, the increased collinearity between the two would be reduced. I also wonder what the implications of doing this are.

                I have been wrecking my head over this for months now! Thanks again!

                • #9
                  Hi Janet:

                  I'm glad my suggestions were helpful. Concerning your other question, you should be aware that you are imposing exclusion restrictions on both reduced forms for your endogenous explanatory variables. The key about 2SLS is that it imposes no restrictions on the reduced forms, which is why it is generally preferred for robustness reasons. Of course, that means it can be inefficient.

                  Think of your setup as a three-equation system, where, in addition to omitting both pension eligibility variables from the "structural" equation, you are omitting partner's eligibility from own participation and vice versa. You may be willing to live with this. These restrictions are testable: just estimate the first stages with both eligibility variables included and see whether the robust t statistics on the excluded ones are statistically significant.
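
                  As a sketch of that check, using the variable names from #8 (x stands in for the controls and id for the panel identifier; both are placeholders): estimate each first stage with both eligibility variables included, then test the one that the restricted procedure would leave out.

                  Code:
                  * own retirement: can partner's eligibility be excluded?
                  xtreg own_retirement own_pension_eligibility partners_pension_eligibility x i.year, fe vce(cluster id)
                  test partners_pension_eligibility
                  * partner's retirement: can own eligibility be excluded?
                  xtreg partner_retirement own_pension_eligibility partners_pension_eligibility x i.year, fe vce(cluster id)
                  test own_pension_eligibility

                  Failing to reject in both regressions is consistent with the exclusion restrictions, though it does not prove them.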

                  We always face tradeoffs. Imposing restrictions on first stages, or reduced forms, is usually frowned upon. But I think you have a story that at least allows you to check whether the exclusion restrictions could hold.

                  If you implement the procedure the way you're proposing, you do need to do something to obtain the proper standard errors. The panel bootstrap will work if you bootstrap all three stages (the two first stages and the final regression).

                  Jeff

                  P.S. Problem 5.11 in my MIT Press book studies a similar case where exclusion restrictions are made in the first stage, and works through the inconsistency if they fail.
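
                  For the panel bootstrap, one possible setup is sketched below; it is an illustration under stated assumptions, not code from the thread. The program name fe2step, the identifier copy newid, the number of replications, and the seed are all hypothetical choices. x stands in for the controls, and the panel identifier is assumed to be id. Each replication resamples whole panels and reruns both restricted first stages and the final regression, so the first-stage estimation error feeds into the standard errors. The point estimates are only meaningful if the exclusion restrictions hold, and the bootstrap options should be checked against the documentation for your Stata version.

                  Code:
                  capture program drop fe2step
                  program define fe2step, rclass
                      tempvar touse ownhat parthat
                      * use the same complete cases in all three steps
                      mark `touse'
                      markout `touse' y own_retirement partner_retirement own_pension_eligibility partners_pension_eligibility x year
                      * restricted first stages: each retirement variable on its own eligibility only
                      xtreg own_retirement own_pension_eligibility x i.year if `touse', fe
                      predict double `ownhat' if `touse', xb
                      xtreg partner_retirement partners_pension_eligibility x i.year if `touse', fe
                      predict double `parthat' if `touse', xb
                      * final FE regression on the two sets of fitted values
                      xtreg y `ownhat' `parthat' x i.year if `touse', fe
                      return scalar b_own = _b[`ownhat']
                      return scalar b_partner = _b[`parthat']
                  end

                  * resample whole panels; newid gives each resampled panel a unique identifier
                  generate long newid = id
                  xtset newid
                  bootstrap b_own=r(b_own) b_partner=r(b_partner), reps(500) seed(12345) cluster(id) idcluster(newid): fe2step

                  The fitted values use predict's xb option; the omitted fixed effect is constant within a panel, so the final FE regression absorbs it.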

                  • #10
                    Thank you Jeff, this is very helpful! This makes sense to me now. I will get back to the books and study bootstrap methods and the relevant Stata code.

                    • #11
                      Hi again Jeff,

                      I had a look at Problem 5.11 and did some further reading. I understand the intuition behind this, though very oddly my FE-IV results seem to flip: the own retirement coefficient matches the FE partner retirement coefficient, and the partner retirement coefficient matches the FE own retirement coefficient. At first I thought I had labelled the predicted values wrong, but this isn't the case. I am using Stata 14 MP.
                      My code looks like this:

                      xtreg retired pension_eligible partner_retired controls i.year if gender==female, fe vce(bootstrap) cluster(xwaveid)
                      predict retiredhat

                      xtreg partner_retired partner_pension_eligible retired controls i.year if gender==female, fe vce(bootstrap) cluster(xwaveid)
                      predict partner_retiredhat

                      xtreg satisfaction retiredhat partner_retiredhat controls i.year if gender==female, fe vce(bootstrap) cluster(xwaveid)

                      The first-stage regressions treat the retired variables as continuous; from my understanding of IVs this is OK. I cannot figure out what I am doing wrong.
                      Would really appreciate your thoughts. Thank you so so much!

                      Janet
