  • Instrumental variables with cohort interactions and clustering produces negative standard errors

    Dear all,

    I am trying to estimate the following model: Y = a_0 + a_1 X + a_2 C + a_3 (X*C) + e, where X is a continuous endogenous variable and C is an age cohort dummy variable. I have an instrument for X, denoted by Z. Elsewhere, it has been noted that Z*C is a valid instrument for X*C. I want to cluster the standard errors at the region level.

    I am using the "ivreg2" command in Stata to estimate the model, but it warns that the standard errors are not correct.

    I did the first stage regressions manually by running the following regressions:
    gen XC = X*C
    gen ZC = Z*C
    reg X Z ZC C, cluster(region)     // (I)
    reg XC Z ZC C, cluster(region)    // (II)

    The problem seems to be that regression (II) produces a variance-covariance matrix with a negative diagonal element. In particular, the diagonal term corresponding to variable Z is negative and close to zero (the value is -2.976e-18). The standard errors for Z*C and C are valid. Without the cluster option, both the coefficient estimate and the standard error for Z are positive and close to zero. My own take is that clustering is pushing the variance for Z slightly below zero, which is leading to the problem with the ivreg2 command.

    Any suggestions on how I can address this issue? Thank you in advance for your help!

    Rashesh

    Additional information:
    1. ivreg2 command that I run: ivreg2 Y C (X XC = Z ZC), cluster(region), where XC = X*C and ZC = Z*C
    2. The instrument Z is measured at the region level.
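
    A possible way to reproduce this setup on simulated data is sketched below. Everything in it is an assumption made for illustration (50 regions, 40 individuals per region, the coefficients, the generated names XC and ZC); it is not the actual data. It is meant to show one candidate explanation: in regression (II) the dependent variable X*C is identically zero whenever C == 0, so the C == 0 observations are fitted exactly and the cluster-robust variance of the coefficient on Z is zero in theory. In floating point, a zero variance can come out as a tiny positive or negative number.

    ```stata
    * Hypothetical simulated data; run from a do-file.
    * All names and numbers below are assumptions, not the poster's data.
    clear
    set seed 12345
    set obs 50                          // 50 regions
    gen region = _n
    gen Z = rnormal()                   // instrument varies only across regions
    expand 40                           // 40 individuals per region
    gen C = runiform() < 0.5            // cohort dummy (0/1)
    gen X = 1 + 0.5*Z + 0.3*C + rnormal()
    gen XC = X*C
    gen ZC = Z*C

    * Regression (II): XC is zero for every observation with C == 0
    regress XC Z ZC C, vce(cluster region)
    predict double uhat, residuals
    summarize uhat if C == 0            // residuals are zero up to rounding error

    * Variance of the coefficient on Z (first regressor)
    matrix V = e(V)
    display %12.4e V[1,1]
    ```

    If the C == 0 residuals are all zero in the real data as well, a variance of order 1e-18 with either sign would be rounding noise around zero rather than evidence that the model is misspecified.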

  • #2
    Rashesh: not quite enough information here. What do you mean when you say ivreg2 "warns that the standard errors are not correct"? Can you give us the exact error message (and, ideally, the actual output)? Also, what versions of Stata and ivreg2 are you using?

    • #3
      Dear Mark,

      Thank you for your inquiry. The warning I get is that "estimated covariance matrix of moment conditions not of full rank. standard errors and model tests should be interpreted with caution". I have attached a PDF with the output.

      The relevant variables are:
      Dependent variable: "aboveSLC"
      Endogenous variable: "endog"
      Endogenous variable interacted with cohort dummy: "endogx1"
      Instrument: "instr1"
      Instrument interacted with cohort dummy: "instr1x1"
      Cohort dummy: "cohortdum2"
      Region variable: "ilakacode"

      After running ivreg2, I tried running the OLS regressions manually, as I mentioned in my original post. The file also contains the results from those regressions and the variance-covariance matrix. I also discovered a peculiarity in Stata's "reg" command: I (accidentally) changed the ordering of the independent variables in regression (II) above, and I got a positive standard error! I redid the steps a couple of times to make sure nothing else was different...

      Stata version is 13.1, ivreg2 version 02.2.08.

      Thank you again, Mark! I hope you can provide some insights.
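
      The ordering comparison described above could be made explicit along these lines. This is only a sketch: it assumes the data are in memory and uses the variable names listed in this post, and the matrix names V1 and V2 are made up for illustration.

      ```stata
      * Regression (II) with the original ordering of regressors
      regress endogx1 instr1 instr1x1 cohortdum2, vce(cluster ilakacode)
      matrix V1 = e(V)
      display %12.4e V1[1,1]            // variance for instr1 (first regressor)

      * Same regressors in a different order
      regress endogx1 cohortdum2 instr1x1 instr1, vce(cluster ilakacode)
      matrix V2 = e(V)
      display %12.4e V2[3,3]            // variance for instr1 (now third regressor)
      ```

      Reading the variance directly from e(V) avoids the missing value that a reported standard error takes when the variance is negative. If the two numbers differ only in sign and are both of order 1e-18, the ordering effect would be consistent with rounding in the matrix computations.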

      • #4
        Rashesh: you're using a very old version of ivreg2. The latest is version 4.1.09. Maybe it's the version that is associated with the 2007 SJ article we wrote? This is the first to come up when you do a search for ivreg2, but the most up-to-date version is later in the list. You can install the latest version with ssc install ivreg2, replace.

        Anyway ... the message from ivreg2 is only a warning and it doesn't say that "the standard errors are not correct". Rather, it warns you that they might be suspect and you should probably investigate. There is a bit of discussion of this in the ivreg2 help file or our 2007 SJ paper (can't recall if it's in both). A related discussion is in a Stata help file: see help j_robustsingular.
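
        The version check and update could look roughly like this. The ranktest line is an assumption on my part (recent ivreg2 versions are understood to rely on it); skip it if it is already installed.

        ```stata
        * Show the location and version line of the installed ado-file
        which ivreg2

        * Install the latest version from SSC, as suggested above
        ssc install ivreg2, replace

        * Assumption: recent ivreg2 versions also need ranktest
        ssc install ranktest, replace

        * Confirm the new version, then read the related help file
        which ivreg2
        help j_robustsingular
        ```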

        • #5
          Thank you, Mark. The new version seems to be working better, but I still get a lot of "missing" standard errors in the first-stage regressions. The issue is that the first-stage regressions are essentially the same regression, just for different cohorts. Would it be better to run just one first-stage regression, get the predicted value of the endogenous variable, and use it in the second-stage regression along with its interaction? I would also need to correct the standard errors...
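
          To illustrate what "the same regression for different cohorts" means here, a sketch along these lines may help. It assumes that cohortdum2 is coded 0/1 and that instr1x1 is exactly instr1*cohortdum2; the variable names are those from #3.

          ```stata
          * Interacted first stage for endog (regression (I) in the first post)
          regress endog instr1 instr1x1 cohortdum2, vce(cluster ilakacode)
          lincom instr1 + instr1x1        // slope on instr1 for the cohortdum2 == 1 cohort

          * The same slopes from separate cohort regressions
          regress endog instr1 if cohortdum2 == 0, vce(cluster ilakacode)
          regress endog instr1 if cohortdum2 == 1, vce(cluster ilakacode)
          ```

          Under those assumptions the coefficient on instr1 in the interacted regression equals the slope in the cohortdum2 == 0 regression, and the lincom estimate equals the slope in the cohortdum2 == 1 regression.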

          • #6
            As I understand what you've done, there is no obvious problem with the actual IV estimation itself - there are no error messages or warnings, the model is exactly identified, there is nothing suspicious about the under- or weak-identification tests, the SEs reported with the main IV estimation look OK. So I don't see any clear reason to start mucking around with doing 2SLS by hand.

            What's bothering you, I think, is that one of the two individual first-stage regressions is missing cluster-robust SEs at least some of the time. You're using interactions for different cohorts, and without knowing much about the detail of your data and setup, I'm not too surprised that you're getting this. It doesn't mean anything is seriously wrong, and you may just need to work out why one of the two first-stage regressions does what you expect and the other doesn't. Try doing things like running the first-stage regressions by hand with and without cluster-robust SEs - maybe the issue is the cluster-robust SEs and not the regression itself.
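
            A minimal sketch of that check, using the variable names from #3 and assuming the data are in memory, might be:

            ```stata
            * First-stage regression for the interacted endogenous variable
            regress endogx1 instr1 instr1x1 cohortdum2                          // conventional SEs
            regress endogx1 instr1 instr1x1 cohortdum2, vce(robust)             // heteroskedasticity-robust SEs
            regress endogx1 instr1 instr1x1 cohortdum2, vce(cluster ilakacode)  // cluster-robust SEs

            * Full IV estimation, with the first-stage results displayed
            ivreg2 aboveSLC cohortdum2 (endog endogx1 = instr1 instr1x1), cluster(ilakacode) first
            ```

            The point estimates are identical across the three regress calls; only the SEs change, so any missing values that appear only in the last one point to the cluster-robust variance estimate rather than to the regression itself.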
