Instrumental Variables and Endogeneity

Joseph Simpson

Join Date: Aug 2016

Posts: 4
#1

Instrumental Variables and Endogeneity

30 Aug 2016, 14:14

If researchers have difficulty finding instrumental variables, and the instrument's strength plays a crucial role in mitigating the bias that results from endogeneity, could you just create a simulated instrument, since the simulated variable (or instrument) isn't of primary concern for interpretation? If this is acceptable, how might I create such a variable in Stata with previously collected data?
Anat Tchetchik

Join Date: Jun 2014

Posts: 217
#2

30 Aug 2016, 15:23

Personally, I have never heard of the possibility you are raising. In some instances, if there is no appropriate instrument, you could consider using a lag of the endogenous independent variable.
Comment
Maya Lani

Join Date: Apr 2015

Posts: 51
#3

31 Aug 2016, 03:39

What exactly do you mean by creating a simulated instrument?

As Anat mentioned, you can use a lag of the endogenous variable as an instrument. However, I personally don't find lags to be very good instruments, because they are usually still correlated with the error term.
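
To illustrate the concern about lags, here is a minimal simulation sketch in Python (standard library only). The data-generating process, the parameter values, and the function names `cov` and `lag_iv_estimate` are all assumptions made up for this illustration; they do not come from anyone's actual data. The true slope is set to 1, and the lagged regressor x_{t-1} is used as the instrument.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def lag_iv_estimate(rho_e, n=20000, beta=1.0, seed=2016, burn_in=200):
    """IV slope for y_t = beta * x_t + e_t, using z_t = x_{t-1} as instrument.

    Hypothetical process (all values are illustrative assumptions):
      e_t = rho_e * e_{t-1} + w_t       structural error, AR(1)
      x_t = 0.5 * x_{t-1} + e_t + v_t   x is endogenous because it contains e_t
    """
    rng = random.Random(seed)
    e_prev, x_prev = 0.0, 0.0
    x, y = [], []
    for _ in range(n + burn_in):
        e = rho_e * e_prev + rng.gauss(0, 1)
        xt = 0.5 * x_prev + e + rng.gauss(0, 1)
        x.append(xt)
        y.append(beta * xt + e)
        e_prev, x_prev = e, xt
    x, y = x[burn_in:], y[burn_in:]   # drop the burn-in period
    z = x[:-1]                        # instrument: x lagged by one period
    x_t, y_t = x[1:], y[1:]           # align with the lag
    return cov(z, y_t) / cov(z, x_t)  # simple IV (ratio-of-covariances) slope


if __name__ == "__main__":
    print("IV with serially uncorrelated errors:", lag_iv_estimate(0.0))
    print("IV with serially correlated errors:  ", lag_iv_estimate(0.7))
```

Under these assumptions, the lag behaves like a valid instrument only when the error has no serial correlation (`rho_e = 0`), and the estimate then lands close to the true slope of 1. With `rho_e = 0.7`, x_{t-1} is correlated with e_t through e_{t-1}, and the IV estimate stays well above 1.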
Comment
Tim Grünebaum

Join Date: Aug 2014

Posts: 49
#4

31 Aug 2016, 04:29

I don't think what you seek is possible.
In my opinion, a valid instrument comes from theory. You cannot just pick some numbers that are highly correlated with your variable of interest, because you have to ensure that your instrument affects the LHS variable only through the endogenous RHS variable and is uncorrelated with the error term. If you simply "create" an instrument, you cannot claim that.

So "creating" an instrument has to start from an existing variable for which you have a theory of how it interacts with your system of equations.
You could use something like scaling or multiplication with other variables or scalars, but I suppose the final product has to be an interpretable variable.
Comment
Ngawang Dendup

Join Date: Nov 2015

Posts: 13
#5

31 Aug 2016, 04:32

As far as I know, I have not heard about this method, and I do not think it is a good strategy. If your results may be biased because the exogeneity assumption fails, the general practice is to warn readers to interpret the results cautiously, since the likely endogeneity issue is not addressed in the paper. Also, if you are looking for alternative suggestions, I would strongly encourage you to provide details about your model, why you think there is an endogeneity problem, and summary statistics for your data.
Comment
Joseph Simpson

Join Date: Aug 2016

Posts: 4
#6

02 Sep 2016, 07:22

I'd like to thank everyone for the feedback.

Maya, that's my concern with using a lag. If the instrument isn't very good, then there could still be bias in the results. By simulate, I mean you create the variable and its data. For example, I could create data that correlate with x, but not with the dependent variable, the other independent variables, or the error term. But based on what people are saying, that doesn't seem like a good idea.

Ngawang, this is a general question, not one applied to a particular data set. So, I don't have a model or data to provide for it.
Comment
Jimmy Yang

Join Date: May 2015

Posts: 54
#7

03 Sep 2016, 23:08

If the endogenous variable is related to the error term only through non-randomly selected samples, you can use a Heckman model.
If you are not in academia, you can arbitrarily split the observations into two groups and simulate to find the most likely split of the sample, which gives you a binary instrumental variable to mitigate the endogeneity problem. While this approach may work in business analytics, it is not suited to academic research.
I use this routine, but I am in real estate, not at a university.
Comment
Jimmy Yang

Join Date: May 2015

Posts: 54
#8

03 Sep 2016, 23:41

What I mean is that if there is an unknown dummy D causing the endogeneity problem, you can simulate D until it reaches the best fit in the Heckman model, and take that as the best postulated split to mitigate the endogeneity problem.

I paste the related info from http://stats.stackexchange.com/quest...sample-selecti

Here:

One should make a distinction between the specific Heckman sample selection model (where only one sample is observed) and Heckman-type corrections for self-selection, which can also work for the case where the two samples are observed. The latter is referred to as the control function approach, and amounts to including in your second stage a term controlling for the endogeneity.

Let us take a standard case with an endogenous dummy variable $D$ and an instrument $Z$:

$$Y = \beta + \beta_1 D + \epsilon$$
$$D = \gamma + \gamma_1 Z + u$$

Both approaches run a first stage ($D$ on $Z$). IV uses standard OLS (even if $D$ is a dummy); Heckman uses a probit. Besides this, the main difference is in the way they use this first stage in the main equation:

IV: break the endogeneity by decomposing $D$ into parts uncorrelated with $\epsilon$, given by the prediction of $D$: $Y = \beta + \beta_1 \hat{D} + \epsilon$

Heckman: model the endogeneity: keep the endogenous $D$, but add a function of the predicted values of the first stage. For this case, it is a pretty complicated function: $Y = \beta + \beta_1 D + \beta_2 [\lambda(\hat{D}) - \lambda(-\hat{D})] + \epsilon$, where $\lambda(\cdot)$ is the inverse Mills ratio.

The advantage of the Heckman procedure is that it provides a direct test for endogeneity: the coefficient β2. On the other hand, the Heckman procedure relies on the assumption of joint normality of the errors, while IV does not make any such assumption.

So you have the standard story that, with normal errors, the control function will be more efficient than IV (especially if one uses the MLE instead of the two-step procedure shown here), but that if the assumption does not hold, IV would be better. As researchers have become more suspicious about the assumption of normality, IV is used more often.
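
As a rough illustration of the IV half of this comparison, the following Python sketch (standard library only) simulates an endogenous dummy and compares naive OLS with a manual two-stage estimate. Everything specific here is an assumption for illustration: the threshold rule generating D, the error correlation, the true coefficient of 2, and the function names `ols` and `simulate_ols_vs_2sls`. The control function (Heckman) side is not implemented, since it would need a probit first stage.

```python
import random


def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def simulate_ols_vs_2sls(n=20000, beta=1.0, beta1=2.0, seed=42):
    """Compare naive OLS of Y on D with manual 2SLS using instrument Z.

    Hypothetical setup: D = 1 if Z + u > 0, and the outcome error eps is
    correlated with u, which is what makes D endogenous.
    """
    rng = random.Random(seed)
    z, d, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)                    # instrument, independent of errors
        u = rng.gauss(0, 1)                     # first-stage error
        eps = 0.8 * u + 0.6 * rng.gauss(0, 1)   # outcome error, correlated with u
        di = 1.0 if zi + u > 0 else 0.0         # endogenous dummy
        z.append(zi)
        d.append(di)
        y.append(beta + beta1 * di + eps)

    _, naive_slope = ols(d, y)                  # biased: D is correlated with eps

    g0, g1 = ols(z, d)                          # first stage: linear OLS of D on Z
    d_hat = [g0 + g1 * zi for zi in z]          # predicted D, uncorrelated with eps
    _, iv_slope = ols(d_hat, y)                 # second stage: Y on predicted D
    return naive_slope, iv_slope


if __name__ == "__main__":
    naive, iv = simulate_ols_vs_2sls()
    print("naive OLS slope:", naive)
    print("2SLS slope:     ", iv)
```

In this assumed setup the naive OLS slope overstates the true coefficient of 2, while the two-stage estimate is close to it. The point estimate from running the second stage by hand matches 2SLS, but the standard errors from that second-stage regression would not be correct, so a dedicated IV routine is preferable in practice.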
Comment
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2148
#9

04 Sep 2016, 11:08

Simulating an IV won't work, at least not what you're suggesting. If the IV is randomly generated, it will be exogenous but not correlated with X. If you do something like Z = X + V, where V is independent of everything, then Z is endogenous because X is.
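
Both cases can be checked with a small simulation. The sketch below uses Python (standard library only); the confounder structure, the sample size, the true slope of 1, and the function names are assumptions chosen for illustration. X is made endogenous through an unobserved confounder that also enters the error.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def corr(a, b):
    """Sample correlation of two equal-length sequences."""
    return cov(a, b) / (cov(a, a) * cov(b, b)) ** 0.5


def simulated_instruments(n=20000, beta=1.0, seed=12345):
    """Try two 'simulated instruments' for y = beta * x + e with endogenous x."""
    rng = random.Random(seed)
    x, e, y = [], [], []
    for _ in range(n):
        c = rng.gauss(0, 1)           # unobserved confounder
        xi = c + rng.gauss(0, 1)      # x depends on the confounder
        ei = c + rng.gauss(0, 1)      # so does the structural error
        x.append(xi)
        e.append(ei)
        y.append(beta * xi + ei)

    z_random = [rng.gauss(0, 1) for _ in range(n)]   # pure noise
    z_from_x = [xi + rng.gauss(0, 1) for xi in x]    # Z = X + V, V independent

    return {
        "ols": cov(x, y) / cov(x, x),                      # biased benchmark
        "corr_zrandom_x": corr(z_random, x),               # relevance of random Z
        "corr_zfromx_e": corr(z_from_x, e),                # exogeneity of Z = X + V
        "iv_zfromx": cov(z_from_x, y) / cov(z_from_x, x),  # IV slope with Z = X + V
    }


if __name__ == "__main__":
    for name, value in simulated_instruments().items():
        print(name, value)
```

Under these assumptions, the random Z is essentially uncorrelated with X, so it has no first stage to speak of (the IV ratio is not computed for it because its denominator is close to zero). Z = X + V is strongly correlated with X, but it is also correlated with the error, and the IV estimate built from it ends up about as biased as OLS.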
Comment
