  • What is next after a referee rejects an instrumental variable strategy?

    I have a paper that just got rejected. It appears that my identification strategy is not sound enough. Below I describe the critique and seek some advice.

    My sample consists of individuals aged 55 and over. The dependent variable is a health outcome. The two independent variables of interest are dummies for part-time work (less than 35 hours a week) and full-time work, so the base category is retired. Since health can affect work decisions (simultaneity), I take an IV approach. As instruments for working part-time and full-time, I use dummies indicating whether the individual has reached age 62 or 65, which are the eligibility ages for early and normal social security benefits. I also consider age 70, for a reason I do not need to explain here. I also use the same eligibility ages for the partner, the argument being that the partner's retirement status could affect the individual's work decisions. Hence, in total I have six instruments. There is a literature analyzing the effects of retirement on health outcomes using eligibility ages as instruments for the retirement decision. In this literature the base outcome is working any number of hours. My idea is instead to differentiate between part-time and full-time work and analyze their effects on health, since working different numbers of hours could have different effects on health. I also include fixed effects as the data are a panel, but this is irrelevant to the discussion here.

    1.png presents the first stage results for the two endogenous variables. The first stage regressions are both linear probability models. Since I have two endogenous variables, the instruments should provide independent sources of exogenous variation for both endogenous variables so that their effects can be identified. Hence, I consider the conditional F statistic (of Angrist and Pischke, later improved by others; I do not present the results here), which suggests that the instruments are not weak.

    2.png presents the second stage results. The results show that the effect of part-time work is much larger than the effect of full-time work. But the referee points out a problem. Since both part-time and full-time work are dummies, the larger the first stage coefficients, and hence the predictive power of the instruments, the smaller the IV coefficient will be (as in a Wald estimator). Therefore, it is almost mechanical to observe a larger estimated effect of working part-time on the health outcome, because almost all instruments predict the probability of working full-time better than they predict the probability of working part-time. In fact, a larger effect for part-time work is observed for a couple of other health outcomes, supporting the referee's concern.

    I would like to ask two questions:

    1. Given the critique, is the following a lesson to be learned for the IV method in general? Suppose we have one endogenous variable and two instruments. Suppose we consider one instrument at a time: so no GIV but just IV estimation. Suppose both instruments are valid and equally significant, but the first instrument has a larger effect on the endogenous variable than the second in the first stage. If the referee is right, the first instrument will always result in a smaller IV estimate, and the second in a larger IV estimate, in the second stage. What do we conclude? That if the effect of the instrument is large in the first stage, the IV estimate will be small in the second stage? I do not recall reading about such a problem in any econometrics textbook.

    2. How could I proceed? To circumvent the critique, should I find an instrument for working part-time whose first stage effect is about the same size as that of the instrument for working full-time? It is probably not possible to find such an instrument. Should I discard the model altogether? Or is there by chance an alternative econometric model I could turn to?
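
The thought experiment in question 1 can be checked with a small simulation. This is only a sketch under assumptions that are not in the post: a linear model with a constant effect `beta`, continuous variables, two valid instruments `z1` and `z2` with hypothetical first stage coefficients 1.0 and 0.3, and an unobserved confounder `c`. It does not reproduce the two-dummy setup of the paper, and with heterogeneous effects different instruments may identify effects for different groups.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

def slope(dep, reg):
    """Slope from a simple regression of dep on reg (with intercept)."""
    return cov(dep, reg) / cov(reg, reg)

rng = random.Random(0)
n = 100_000
beta = 1.0  # assumed true (constant) effect of x on y

z1 = [rng.gauss(0, 1) for _ in range(n)]  # instrument with a large first stage effect
z2 = [rng.gauss(0, 1) for _ in range(n)]  # instrument with a small first stage effect
c = [rng.gauss(0, 1) for _ in range(n)]   # unobserved confounder, makes x endogenous

# Hypothetical first stage: z1 moves x more than z2 does.
x = [1.0 * a + 0.3 * b + ci + rng.gauss(0, 1) for a, b, ci in zip(z1, z2, c)]
# Outcome: depends on x and, through c, on the source of endogeneity.
y = [beta * xi - 1.5 * ci + rng.gauss(0, 1) for xi, ci in zip(x, c)]

first_stage_z1 = slope(x, z1)
first_stage_z2 = slope(x, z2)
reduced_form_z1 = slope(y, z1)
reduced_form_z2 = slope(y, z2)

# One instrument at a time: IV = reduced form / first stage = cov(y, z) / cov(x, z).
b_iv_z1 = reduced_form_z1 / first_stage_z1
b_iv_z2 = reduced_form_z2 / first_stage_z2
b_ols = slope(y, x)

print("first stage  z1, z2:", first_stage_z1, first_stage_z2)
print("reduced form z1, z2:", reduced_form_z1, reduced_form_z2)
print("IV using z1, z2    :", b_iv_z1, b_iv_z2)
print("OLS                :", b_ols)
```

In this setup both IV estimates should land near the assumed `beta`, because the reduced form coefficient shrinks along with the first stage coefficient, while the OLS slope is pulled away from `beta` by the confounder. Under these assumptions a smaller first stage costs precision rather than mechanically inflating the estimate.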

  • #2
    No one responded quickly to your question. You'll increase your chances of a helpful answer if you follow the FAQ on asking questions - provide Stata code in code delimiters, Stata output, and sample data. Don't post screenshots - we have trouble reading them. Also, try to simplify your code to what is necessary to demonstrate the problem - why would the age of the people matter?

    You'll see your question is far longer than most. It is also really an econometric problem, not a Stata problem (although folks often do get answers to such problems on this list).

    I suspect that the referee's criticism is not right. While the quality of instruments matters, I don't think it follows through as simply as stated.



    • #3
      Prof. Bromiley, thank you indeed for your suggestions to increase the chances of receiving a reply to my question. I considered providing code and data but decided not to, because I suspect that the referee's critique is really a theoretical concern that cannot be tested empirically. Besides, this is indeed not a Stata question, but I frequently see researchers seeking econometric advice in the forum. But again, if my post is not suitable, I have no objection to its deletion should the moderator agree. Otherwise, I do seek an answer and advice. I attached PNG files as suggested in the FAQ, and have explained my question at length to make sure that the context is clear.



      • #4
        Here is a personal opinion on what is (is not) on-topic here.

        In principle, anything Stata-related is appropriate, as the FAQ Advice makes explicit.

        It has seemed to me, and I guess to many others, artificial and unnecessary to try to draw a line between statistical questions that are Stata-related and those that aren't, mainly because what so often happens is that a thread starts with a Stata-based question and then morphs into a question of what would be a good method of analysis.

        I don't worry much about a thread that starts with a straight statistical question because they don't seem common.

        I see no reason to single out econometrics here. At a wild guess, economists, econometricians and people using econometrics are the largest single group of users in total, but many other disciplines are represented too.

        As a matter of fact, threads are not deleted unless they are spam.



        • #5
          "I attached PNG files as suggested in the FAQ,"
          On looking at the FAQ, I see you point out an area for possible improvement in it. You are referring to the following advice.

          12.4 Posting image attachments: please do use .png

          Stata graphs or other images should be posted as .png file attachments (start with the Clipboard icon).
          However, this advice is really meant for presentation of those results that are inherently images rather than text. The advice Phil was thinking of is in the prior section, which unfortunately talks only about code, not about results.

          12.3 How to use CODE delimiters

          Stata code (i.e. the exact commands issued) is very much easier to read if presented as such.
          And this goes on to discuss CODE delimiters at some length. But nowhere does it mention that CODE delimiters are equally preferred for posting Stata results. Unless I've overlooked something, which is always possible.

          The advice most often given casually is "copy your commands and output from Stata's Results window and paste them into your Statalist post using CODE delimiters as described in the FAQ."



          • #6
            I should have used code delimiters. I was not fully aware that code delimiters are able to present the results undistorted (which is why I preferred picture files as screenshots of results). My ignorance and mistake. Regardless, I am very much hoping to receive some feedback on my questions.



            • #7
              I would like to elaborate a little on my questions in the hope of receiving a reply. Let us consider the simple IV estimator. It is given by cov(y,z)/cov(x,z), where y represents the dependent variable, x represents the single endogenous variable, z represents the single instrument, and there are no other exogenous covariates. According to the referee, if cov(x,z) in the sample is small, the IV estimate will be large. Is this discussed in any econometrics textbook? Is this a bias mentioned in the literature? What is the intuition?
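
A short derivation may help frame the question. It is a sketch under assumptions that are not in the post: a linear model with a constant effect \beta, a linear first stage with coefficient \pi, error terms u and v, and cov(v,z) = 0. Hats denote sample quantities.

```latex
\begin{align*}
y &= \beta x + u, \qquad x = \pi z + v, \\
\hat{\beta}_{IV}
  &= \frac{\widehat{\operatorname{cov}}(y,z)}{\widehat{\operatorname{cov}}(x,z)}
   = \beta + \frac{\widehat{\operatorname{cov}}(u,z)}{\widehat{\operatorname{cov}}(x,z)}, \\
\operatorname{cov}(y,z)
  &= \beta \operatorname{cov}(x,z) + \operatorname{cov}(u,z)
   = \beta \pi \operatorname{var}(z) + \operatorname{cov}(u,z).
\end{align*}
```

If the instrument is valid, cov(u,z) = 0 in the population, so the numerator cov(y,z) shrinks in proportion to cov(x,z) and the ratio stays at \beta. Under these assumptions a small cov(x,z) does not by itself make the estimate large. What it does is magnify the second term: any sample correlation between u and z, or any violation of validity, is divided by a small number.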



              • #8
                You may want to try posting your question on Cross Validated at https://stats.stackexchange.com/



                • #9
                  I should apologize - my comment on the question being more general (I used the term econometric but statistical works as well) rather than Stata-related was because I think Stata-related questions have a higher likelihood of an answer than more general statistical questions on this listserv. I was trying to help Tunga elicit an answer from a more skilled commentator than me. Of course, many of the posted questions and answers are as much about good methods as about implementing those methods in Stata.

                  That said, no one has tried to answer Tunga's core question. Foster's suggestion to try stackexchange makes sense. Tunga should also try to find an expert in his or her institution.



                  • #10
                    Prof. Bromiley, I do think that your reply has at least clarified that my question was methodological and not Stata-related. I will take Belinda's suggestion (and thanks for this) to post my question to Stackexchange. In fact, I am aware of that forum, but I still preferred to post the question here because my impression is that this forum is much more econometrics-oriented and the user base is much larger. I asked it here not only to increase the chance of receiving a reply, but also because the discussion of my question (if it is a sensible question) might help others at some point in the future.

                    I am still seeking some advice. In particular, I tend to agree with the referee's critique but do not really know how I could address it.



                    • #11
                      What you're essentially doing in 2SLS is making a predicted value for the endogenous variable. Because the predicted value is built strictly from variables that are uncorrelated with the error term (i.e., are exogenous), this predicted value is itself exogenous.

                      The referee is right - the lower the explained variance in this instrumental equation (or in correlation terms, the lower the correlation between the endogenous x and the exogenous instrument z), the noisier the measure is. There is a separate literature that suggests measurement error biases coefficients. The standard result is that measurement error biases coefficients toward zero, but this is generally derived only in one-variable models - I think there is work by Garber and Klepper that addresses this in multivariable models. You can find this discussed in Greene's Econometric Analysis under the topic of Weak Instruments. There are tests for weak instruments - ivreg2 and xtivreg2 include some.
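
To make the weak-instrument point concrete, here is a minimal Monte Carlo sketch. Everything in it is an assumption for illustration, not taken from the thread: a hypothetical linear model with a constant effect of 1.0, one instrument, a confounder `c`, and first stage coefficients of 1.0 ("strong") and 0.05 ("weak"). It computes the first stage F statistic by hand (with one instrument this is the squared t statistic) and the spread of the simple IV estimates across repeated samples. It does not use or mimic the tests in ivreg2 or xtivreg2.

```python
import random
import statistics

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

def draw_sample(pi, n, rng, beta=1.0):
    """One sample from a hypothetical model with first stage coefficient pi."""
    z = [rng.gauss(0, 1) for _ in range(n)]
    c = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder
    x = [pi * zi + ci + rng.gauss(0, 1) for zi, ci in zip(z, c)]
    y = [beta * xi - 1.5 * ci + rng.gauss(0, 1) for xi, ci in zip(x, c)]
    return y, x, z

def first_stage_f(x, z):
    """F statistic for H0: pi = 0 in x = a + pi*z + v (one instrument)."""
    n = len(x)
    r2 = cov(x, z) ** 2 / (cov(x, x) * cov(z, z))
    return r2 / (1.0 - r2) * (n - 2)

def summarize(pi, reps=200, n=500, seed=0):
    """Median first stage F and interquartile range of IV estimates over reps samples."""
    rng = random.Random(seed)
    f_stats, iv_estimates = [], []
    for _ in range(reps):
        y, x, z = draw_sample(pi, n, rng)
        f_stats.append(first_stage_f(x, z))
        iv_estimates.append(cov(y, z) / cov(x, z))  # simple IV estimator
    q1, _, q3 = statistics.quantiles(iv_estimates, n=4)
    return statistics.median(f_stats), q3 - q1

median_f_strong, iqr_strong = summarize(pi=1.0)
median_f_weak, iqr_weak = summarize(pi=0.05)

print("strong instrument: median F =", median_f_strong, " IQR of IV estimates =", iqr_strong)
print("weak instrument:   median F =", median_f_weak, " IQR of IV estimates =", iqr_weak)
```

The interquartile range is used instead of the standard deviation because the IV estimator with a very weak instrument can produce extreme values. In this sketch the weak case should show a first stage F far below the commonly cited rule-of-thumb value of 10 and a much wider spread of estimates, which is the sense in which a low first stage correlation makes the IV estimate unreliable.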

