Dear all,
It's my first time posting here, so please let me know if any information is missing or unclear. For my master's thesis, I am examining an antipoverty transfer using a fuzzy regression discontinuity design (RDD). In this program, a household is eligible if its poverty score is above a certain threshold. The score does not perfectly determine treatment, which is why I use the fuzzy design. As Lee & Lemieux (2010) note, a fuzzy RDD is equivalent to an instrumental variable (IV) approach with eligibility as the instrument. In the first stage, treatment is predicted by eligibility. In the second stage, the outcome is regressed on the predicted treatment. The treatment effect can also be obtained by dividing the reduced form, i.e. the outcome predicted by eligibility, by the first stage.
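In symbols, this ratio can be written as follows. The notation is my own shorthand and not taken from the program documentation: Y is the outcome, D the treatment indicator, X the poverty score and c the eligibility cutoff.

```latex
\tau_{\text{FRD}}
  = \frac{\lim_{x \downarrow c} E[Y \mid X = x] \;-\; \lim_{x \uparrow c} E[Y \mid X = x]}
         {\lim_{x \downarrow c} E[D \mid X = x] \;-\; \lim_{x \uparrow c} E[D \mid X = x]}
```

The numerator is the jump in the outcome at the cutoff (reduced form) and the denominator is the jump in the treatment probability (first stage).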
I want to plot the outcome variable against the running variable to illustrate the treatment effect. This is the same as a plot of the second stage in an IV approach. Ideally, I would like to create a plot that depicts the robust coefficient from the command rdrobust (from https://sites.google.com/site/rdpackages/rdrobust; I use the latest version of 2019 in Stata 15). More specifically, this command reports a conventional coefficient, a bias-corrected one, and a robust one. In my case the conventional coefficient differs in sign and size from the other two. How do I depict the robust coefficient in a plot? Any help that gets me a step further is highly appreciated.
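To make the setup concrete, a call of roughly the following form produces the three rows. This is only a sketch: the variable names are the ones defined further down, and the kernel choice and the option all are assumptions about a typical specification, not a record of the exact options I ran. The e() names are as I understand them from the help file.

```stata
* Fuzzy RD sketch: ln_va is the outcome, score_c the centered running
* variable (cutoff at 0), tm the actual treatment indicator.
* The option all asks for the conventional, bias-corrected and robust rows.
rdrobust ln_va score_c, c(0) fuzzy(tm) kernel(triangular) all

* Stored point estimates (assumed names, please check against help rdrobust):
display "conventional:   " e(tau_cl)
display "bias-corrected: " e(tau_bc)
```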
What I have tried so far:
While rdrobust has an option for the fuzzy design, which allows the actual treatment variable to be included, the accompanying command rdplot does not. rdrobust does not save any results from the first-stage regression, so I cannot use them manually for rdplot.
I have tried to do the first stage "by hand" and then predict the probability of treatment:
Code:
reg tm eligible score_c c.score_c#i.eligible [pw=w]
predict double tm_hat
where
tm is the treatment variable, being 1 if the household is treated and 0 otherwise,
eligible is 1 if the household is eligible for the treatment and 0 otherwise,
score_c is the poverty score, centered around 0, so that all households with a score above 0 are eligible,
the interaction term allows the slopes to vary around the cutoff at 0,
w is a weight to apply a triangular kernel weighting.
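For completeness, a triangular kernel weight of this kind could be built as sketched below. The value 0.117 is taken from the sample restriction in my regressions further down. Treating it as the kernel bandwidth is an assumption for illustration, as are the local h and the variable name w_tri (chosen so that the existing w is not overwritten).

```stata
* Triangular kernel: weight 1 at the cutoff, falling linearly to 0
* at |score_c| = h, and 0 beyond that.
* h = 0.117 is an assumed bandwidth matching the regression window.
local h = 0.117
generate double w_tri = max(0, 1 - abs(score_c)/`h') if !missing(score_c)
```

Observations outside the window get a weight of zero, so they drop out of any regression that uses [pw=w_tri].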
With this regression I get the same first-stage coefficients as in rdrobust, with slightly different standard errors. Using probit instead of regress gives the same values for tm_hat. Yet the variable tm_hat is continuous, and I do not know how to proceed to use it with rdplot. My supervisor has indicated that this variable needs to be dichotomous to be usable with rdplot, but on a first attempt she has not found a way to do that.
I have also used rdplot restricted to compliers only, given that the fuzzy RDD always examines the effect on compliers. However, I then use at most half of the observations, while the rdrobust command uses all observations, and the discontinuity is not of the same size as any of the coefficients in the rdrobust regression.
Regressions with other commands, such as ivregress or regress, yield the conventional coefficient from rdrobust, or at least a very similar one. My code for those is below. It might be possible to apply a bias correction and to compute the robust standard errors by hand, but I do not know how.
Code:
ivregress 2sls ln_va (tm=i.eligible) score_c c.score_c#i.eligible [pw=w] if score_c<0.117 & score_c>-0.117, vce(robust)
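Code:
* first stage by hand, then fitted treatment
reg tm eligible score_c c.score_c#i.eligible [pw=w]
predict double tm_hat1
* second stage on the fitted treatment
regress ln_va tm_hat1 score_c c.score_c#i.eligible [pw=w] if score_c<0.117 & score_c>-0.117
where ln_va is my outcome variable, the log of value added.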
The second method yields the same coefficient as ivregress, but the standard errors are quite different. I have read in a forum (I can't remember where exactly) that the SEs are wrong in this case because Stata uses tm_hat1 to compute them, whereas tm needs to be used for the SEs in the fuzzy RDD.
By the way, the rd command from ssc is not useful for me in this case as it only calculates the first stage and the reduced form, not the second stage.
I am grateful for any advice. Maybe it is rather unpopular to say, but if anybody has an idea on how to do the graph in R, that advice would also be appreciated.
Thank you very much in advance,
Katrin
References:
Lee, David S.; Lemieux, Thomas (2010): Regression Discontinuity Designs in Economics. In Journal of Economic Literature 48 (2), pp. 281–355.