Hi,
I am running a regression to estimate the relationship between X and Y, and I am including some controls Z. I know that leaving out controls that are determined before the treatment can lead to omitted variable bias. The reverse problem, including controls that are themselves affected by the treatment, can lead to bad controls, and I am wondering whether I should be worried about this.
Let’s say that X impacts Y positively. A higher value of X leads to a higher value of Y.
I then add control Z. Z can be considered a bad control because X impacts Y through Z. If Z is positively correlated with both X and Y, and the coefficient on X turns out insignificant, we could say that the true effect of X is perhaps underestimated, since Z is picking up part of the effect.
What would happen, however, if the relationship between X and Z is negative (or the relationship between Z and Y is negative) while the relationship between X and Y stays positive? Could we then say that the effect is overestimated, just like with omitted variable bias?
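To make the setup concrete, here is a minimal simulation sketch of what I have in mind. It assumes a purely linear model in which X affects Z with slope `a`, Z affects Y with slope `b`, and X also affects Y directly with slope `c`. The names `a`, `b`, `c`, and the helper `simulate` are my own illustration, not part of my actual data or specification.

```python
import numpy as np

def simulate(a, b, c, n=200_000, seed=0):
    """Return the OLS coefficient on X without and with the control Z.

    Assumed (hypothetical) data-generating process:
        Z = a * X + noise
        Y = c * X + b * Z + noise
    so the total effect of X on Y is c + a * b.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    z = a * x + rng.normal(size=n)
    y = c * x + b * z + rng.normal(size=n)

    # Regression of Y on X only (intercept + X).
    design_short = np.column_stack([np.ones(n), x])
    coef_short = np.linalg.lstsq(design_short, y, rcond=None)[0]

    # Regression of Y on X and the possibly bad control Z.
    design_long = np.column_stack([np.ones(n), x, z])
    coef_long = np.linalg.lstsq(design_long, y, rcond=None)[0]

    return coef_short[1], coef_long[1]

# Case 1: X -> Z positive, Z -> Y positive.
without_z, with_z = simulate(a=0.8, b=1.0, c=0.5)
print("a > 0:", round(without_z, 2), round(with_z, 2))

# Case 2: X -> Z negative, Z -> Y positive, total effect still positive.
without_z, with_z = simulate(a=-0.4, b=1.0, c=0.5)
print("a < 0:", round(without_z, 2), round(with_z, 2))
```

In this stylized setting, the coefficient on X without Z is close to the total effect `c + a * b`, while the coefficient with Z is close to `c` alone. So when `a * b` is positive, controlling for Z gives a number below the total effect, and when `a * b` is negative it gives a number above it. I am not sure whether this reasoning carries over to my actual regression, which is what I am asking about.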