
  • Problems with first-stage estimates in Fuzzy RD with rdrobust

    Hi everyone,

    I'm using rdrobust to obtain fuzzy regression discontinuity estimates of the causal effect of a cash transfer program, using a single year of a household survey. Since rdrobust with the fuzzy option reports estimates for both stages, I have two main problems. The general syntax is:
    Code:
     rdrobust depvar runvar [if] [in] [, c(#) fuzzy(fuzzyvar)]
    My data come from a household survey. My outcome variable is CS1914; my running (forcing) variable is score14, an index used to assign treatment under the rules of the cash transfer program with a cutoff at 28.20351 (people below the cutoff are considered eligible for the program, those above are not); and my treatment variable is hog_benef.

    Problem 1:

    I've been running the following code to get the RD estimates but, as shown below, the first-stage coefficients are not significant.
    Code:
    . rdrobust CS1914 score14, fuzzy(hog_benef) c(28.20351) all 
    
    Fuzzy RD estimates using local polynomial regression.
    
     Cutoff c = 28.20351 | Left of c  Right of c          Number of obs =      28676
    -------------------+----------------------            BW type       =      mserd
         Number of obs |      5711       22965            Kernel        = Triangular
    Eff. Number of obs |      3046        3982            VCE method    =         NN
        Order est. (p) |         1           1
        Order bias (q) |         2           2
           BW est. (h) |     6.854       6.854
           BW bias (b) |    12.537      12.537
             rho (h/b) |     0.547       0.547
    
    First-stage estimates. Outcome: hog_benef. Running variable: score14.
    --------------------------------------------------------------------------------
                Method |   Coef.    Std. Err.    z     P>|z|    [95% Conf. Interval]
    -------------------+------------------------------------------------------------
          Conventional |  .01233     .02517   0.4900   0.624   -.037001      .061666
        Bias-corrected |  .02111     .02517   0.8387   0.402   -.028223      .070444
                Robust |  .02111      .0287   0.7355   0.462   -.035144      .077364
    --------------------------------------------------------------------------------
    
    Treatment effect estimates. Outcome: CS1914. Running variable: score14. Treatment Status: hog_benef.
    --------------------------------------------------------------------------------
                Method |   Coef.    Std. Err.    z     P>|z|    [95% Conf. Interval]
    -------------------+------------------------------------------------------------
          Conventional | -8.3531     18.752   -0.4454  0.656   -45.1071       28.401
        Bias-corrected | -3.0851     18.752   -0.1645  0.869   -39.8392      33.6689
                Robust | -3.0851     21.393   -0.1442  0.885   -45.0142      38.8439
    --------------------------------------------------------------------------------
    Could anyone help me interpret or correct this? I've also been trying to figure out why the running variable isn't a good instrument for treatment status. I ran the following OLS regressions and correlations to check whether score14 really is a poor predictor of hog_benef:
    Code:
    . reg hog_benef score14
    
          Source |       SS           df       MS      Number of obs   =    28,676
    -------------+----------------------------------   F(1, 28674)     =  11209.88
           Model |  1838.31955         1  1838.31955   Prob > F        =    0.0000
        Residual |  4702.27952    28,674  .163991055   R-squared       =    0.2811
    -------------+----------------------------------   Adj R-squared   =    0.2810
           Total |  6540.59907    28,675  .228094126   Root MSE        =    .40496
    
    ------------------------------------------------------------------------------
       hog_benef |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         score14 |  -.0159688   .0001508  -105.88   0.000    -.0162645   -.0156732
           _cons |   1.035558   .0068851   150.41   0.000     1.022063    1.049054
    ------------------------------------------------------------------------------
    
    . corr hog_benef score14
    (obs=28,676)
    
                 | hog_be~f  score14
    -------------+------------------
       hog_benef |   1.0000
         score14 |  -0.5302   1.0000
    So far score14 seems to be a good predictor of hog_benef, but I also tried restricting the sample to the bandwidth used in the RD estimate:
    Code:
    . reg hog_benef score14 if (score14>=21.34951 & score14<=35.05751)
    
          Source |       SS           df       MS      Number of obs   =     7,028
    -------------+----------------------------------   F(1, 7026)      =    175.37
           Model |  39.7971642         1  39.7971642   Prob > F        =    0.0000
        Residual |  1594.40247     7,026  .226928902   R-squared       =    0.0244
    -------------+----------------------------------   Adj R-squared   =    0.0242
           Total |  1634.19963     7,027  .232560073   Root MSE        =    .47637
    
    ------------------------------------------------------------------------------
       hog_benef |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         score14 |  -.0193368   .0014602   -13.24   0.000    -.0221991   -.0164744
           _cons |   1.188693   .0424057    28.03   0.000     1.105565    1.271821
    ------------------------------------------------------------------------------
    
    . corr hog_benef score14 if (score14>=21.34951 & score14<=35.05751)
    (obs=7,028)
    
                 | hog_be~f  score14
    -------------+------------------
       hog_benef |   1.0000
         score14 |  -0.1561   1.0000
    Again I get the same evidence: the running variable appears to be a good predictor of treatment status. So why doesn't this carry over to the first stage of the RD estimation?

    Problem 2:

    Does anyone know how I can recover the first-stage estimates and model statistics, such as the F-statistic, so that I can report them with commands such as esttab or outreg2?

    According to the rdrobust help file, there is no option that asks Stata to report model statistics, and I can't find an e() matrix or scalar from which to recover the first-stage coefficients and statistics.

    Thank you very much for any of your answers or ideas,

    Freddy.


  • #2
    Hi Freddy Hernandez. Have you solved your problem? I am facing the same problem.



    • #3
      Hi, I don't know if this is still relevant to you. rdrobust uses triangular kernel weights by default. So, to get comparable regressions, you either need to run rdrobust with the option kernel(uniform) or add weights to your reg command. According to https://stats.stackexchange.com/ques...uzzy-rdd-issue, you can use
      Code:
      gen w=max(0,1-abs(cutoff_Y))
      as a triangular weight for your reg estimation. Also, I suggest you use cmogram to do a visual inspection of your first stage to help find out what's going on in your data.
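
      A minimal sketch of such a comparable regression, offered as an illustration and not as a reproduction of rdrobust's internals. Note that the regressions in post #1 estimate the slope of hog_benef in score14, whereas the first stage reported by rdrobust is the jump in hog_benef at the cutoff, so a significant slope can coexist with no discontinuity. The sketch assumes the data from post #1 are in memory; the cutoff and bandwidth are copied from the rdrobust output there, and x_c, above and w_tri are hypothetical variable names.

```stata
* Sketch only: assumes the survey data from post #1 are in memory.
* Cutoff and bandwidth are copied from the rdrobust output in post #1.
local c = 28.20351
local h = 6.854

* Hypothetical helper variables (names are not from the original posts)
* running variable centered at the cutoff
gen double x_c = score14 - `c'
* indicator for being at or to the right of the cutoff
gen byte above = (x_c >= 0) if !missing(x_c)
* triangular kernel weight: 1 at the cutoff, 0 at distance h or more
gen double w_tri = max(0, 1 - abs(x_c)/`h') if !missing(x_c)

* Local linear regression with a separate slope on each side of the cutoff.
* The coefficient on 1.above is the estimated jump in hog_benef at the
* cutoff (right side minus left side).
regress hog_benef i.above##c.x_c [aweight = w_tri] if abs(x_c) < `h', vce(robust)
```

      The coefficient on 1.above should be close to the conventional first-stage coefficient from rdrobust. The standard errors need not match, since the rdrobust output in post #1 reports a nearest-neighbor (NN) variance estimator.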

      I am also looking for a way to report the first stage in a table. Have you found any solution to that problem?



      • #4
        Hello Freddy,

        I have a similar problem. Did you find a solution?

        Best regards,



        • #5
          Hello Freddy,
          It might be useful to set the triangular weight as
          Code:
          gen w = max(0, `bandwidth' - abs(running_var))
          where `bandwidth' can be retrieved from -rdbwselect- and running_var is your running variable, centered at the cutoff so that abs() measures the distance from it.

          The results I got are closer to the first-stage estimate (both coefficient and s.e.) reported by -rdrobust-, though I'm not sure whether this is a coincidence or how it works under the hood.
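
          One possible explanation, sketched under assumptions and not a statement about rdrobust's internals: with the running variable centered at the cutoff, this weight is the normalized triangular kernel max(0, 1 - distance/bandwidth) multiplied by the constant bandwidth. Stata rescales aweights, so multiplying them by a constant leaves the weighted regression unchanged. The check below assumes the data from post #1 are in memory and uses the cutoff and bandwidth from the rdrobust output there; dist, w_unit and w_raw are hypothetical names.

```stata
* Sketch only: assumes the survey data from post #1 are in memory.
* dist, w_unit and w_raw are hypothetical names used for illustration.
local c = 28.20351
local h = 6.854

* distance of the running variable from the cutoff
gen double dist = abs(score14 - `c')
* normalized triangular kernel weight
gen double w_unit = max(0, 1 - dist/`h') if !missing(dist)
* weight in the form suggested in this post
gen double w_raw = max(0, `h' - dist) if !missing(dist)

* The two weights differ only by the constant factor h
assert reldif(w_raw, `h' * w_unit) < 1e-10 if !missing(dist)
```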

          Might also be helpful to look at https://www.stata.com/statalist/arch.../msg01198.html . In that post, -rd- is another user-contributed command for RDD. The first-stage coefficient (to be clear, the coefficient of the dummy indicating the running variable is above the cutoff value) is called "denom" if you use -rd-.
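
          For reporting the first stage in a table, one option is to estimate it directly as a sharp RD of the treatment on the running variable, fixing the bandwidths at the values from the fuzzy run. This is only a sketch: it assumes the data from post #1 are in memory, that the estout package is installed, and that the installed version of rdrobust posts e(b) and e(V) and stores scalars named e(tau_cl) and e(se_tau_cl). Those names are assumptions to confirm with ereturn list.

```stata
* Sketch only: assumes the data from post #1 are in memory and that the
* estout package (eststo/esttab) is installed (ssc install estout).

* Sharp RD of treatment on the running variable, with the bandwidths fixed
* at the values reported by the fuzzy run in post #1
rdrobust hog_benef score14, c(28.20351) h(6.854) b(12.537)

* Check which results this version of rdrobust actually stores
ereturn list

* With a single excluded instrument, the first-stage F is the squared z.
* e(tau_cl) and e(se_tau_cl) are assumed names: confirm them in the list above.
display "First-stage F (conventional): " (e(tau_cl)/e(se_tau_cl))^2

* Tabulate the stored results; this assumes rdrobust posted e(b) and e(V)
eststo first_stage
esttab first_stage, se
```

          If the assumed scalars are not stored, the display line shows a missing value, and the same F can be computed by hand from the coefficient and standard error in the first-stage table.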

          And thanks Katrin for the insightful hint!
          Last edited by Celia Zhu; 18 Aug 2020, 13:14.

