  • Theoretical question: Different results from -diff- command and -xtdidregress- command in Stata 17.0

    Hello,

    This is more of a theoretical question. I am looking at the impact of a reform on recidivism rates. My dataset has just over 181,000 observations; I have copied part of it below.

    When I run the reg, xtreg, and diff (user-written) commands in Stata 17.0 (all code is below), I get the same estimate for the did variable (0.0314). When I run the xtdidregress command, the estimate for did has the opposite sign and a much larger magnitude (-0.1927). At first glance, these results contradict each other. My understanding was that both -diff- and -xtdidregress- estimate the ATET for panel data. Am I misunderstanding what these commands do, or do the differences arise from the way I'm specifying them?

    Variable NYC is an indicator variable that is 1 for the treatment group and 0 for comparison group.
    Variable period is an indicator variable that is 1 for post-reform years and 0 for pre-reform years.
    Variable did is the interaction variable between NYC and period.
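
    For anyone reproducing this, the interaction described above can be rebuilt and checked with a short sketch. It assumes NYC and period are already coded 0/1 as stated; did_check is a hypothetical name used only for this check.

    ```stata
    * Rebuild the interaction from the two 0/1 indicators (did_check is a made-up name)
    generate byte did_check = NYC * period
    * Mean of did_check in each NYC-by-period cell: it should be 1 only where NYC==1 and period==1
    tabulate NYC period, summarize(did_check)
    * Stops with an error if the rebuilt interaction ever differs from the existing did variable
    assert did_check == did
    ```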

    P.S.: I am currently running these commands without any covariates, robustness or fixed effects. I just wanted to understand the difference between the results of the diff and xtdidregress commands.

    Commands I'm running:
    Code:
    reg recid NYC period did
    diff recid, t(NYC) p(period)
    xtset id
    xtreg recid NYC period did
    xtdidregress (recid) (did), group(id) time(period)
    Data:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float recid long id float year int courtid float(NYC period did) byte age_at_referral float(female black months_disposition crime sentence)
    0 17944 2007 93 0 0 0 16 0 0 5 1 1
    0 21153 2005 93 0 0 0 16 0 0 3 1 0
    0 22210 2005 93 0 0 0 15 1 1 3 0 0
    0 28858 2005 93 0 0 0 16 0 0 2 0 1
    0 32454 2005 93 0 0 0 16 1 0 1 0 0
    0 5183618 2006 106 1 0 0 15 0 0 6 1 1
    0 5184303 2006 106 1 0 0 15 0 0 . . .
    0 5184596 2005 106 1 0 0 15 1 0 7 1 0
    0 5188710 2006 106 1 0 0 15 0 0 3 0 0
    0 5189440 2005 106 1 0 0 16 0 0 3 0 0
    0 58956 2015 93 0 1 0 15 1 1  1 1 1
    0 66815 2013 93 0 1 0 16 0 0  8 0 1
    0 74530 2014 93 0 1 0 14 0 0 10 0 1
    0 83336 2013 93 0 1 0 14 0 0  4 1 1
    0 83664 2017 93 0 1 0 16 0 0  8 0 1
    0 5554707 2013 106 1 1 1 16 0 0 12 1 1
    0 5613580 2013 106 1 1 1 16 1 1  3 1 0
    0 5623310 2013 106 1 1 1 15 0 0 18 1 1
    0 5639877 2015 106 1 1 1 15 0 0  7 1 1
    0 5647375 2014 106 1 1 1 15 0 1  2 1 0
    end
    label values courtid labels5
    label def labels5 93 "Dutchess", modify
    label values age_at_referral labels4
    label def labels4 11 "Eleven", modify
    label def labels4 12 "Twelve", modify
    label def labels4 13 "Thirteen", modify
    label def labels4 14 "Fourteen", modify
    label def labels4 15 "Fifteen", modify
    label def labels4 16 "Sixteen", modify
    label def labels4 17 "Seventeen", modify
    label var recid "Recidivism Non-status offenses" 
    label var id "Resp entity ID" 
    label var courtid "Court entity ID" 
    label var period "PostR" 
    label var did "PNYC X Post R" 
    label var age_at_referral "Age at referral" 
    label var female "Female" 
    label var black "Black"
    Thank you,
    Tessie

  • #2
    This is how the results look:
    Code:
    . reg recid NYC period did
    
          Source |       SS           df       MS      Number of obs   =   181,001
    -------------+----------------------------------   F(3, 180997)    =    104.00
           Model |  47.4694917         3  15.8231639   Prob > F        =    0.0000
        Residual |  27537.0954   180,997  .152141171   R-squared       =    0.0017
    -------------+----------------------------------   Adj R-squared   =    0.0017
           Total |  27584.5649   181,000  .152400911   Root MSE        =    .39005
    
    ------------------------------------------------------------------------------
           recid | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             NYC |   .0199348   .0022296     8.94   0.000     .0155649    .0243048
          period |  -.0007579   .0025199    -0.30   0.764    -.0056969    .0041812
             did |   .0314014   .0041762     7.52   0.000     .0232161    .0395867
           _cons |   .1768617   .0014139   125.08   0.000     .1740904     .179633
    ------------------------------------------------------------------------------
    
    . diff recid, t(NYC) p(period)
    
    DIFFERENCE-IN-DIFFERENCES ESTIMATION RESULTS
    Number of observations in the DIFF-IN-DIFF: 181001
                Before         After    
       Control: 76099          34968       111067
       Treated: 51195          18739       69934
                127294         53707
    --------------------------------------------------------
     Outcome var.   | recid   | S. Err. |   |t|   |  P>|t|
    ----------------+---------+---------+---------+---------
    Before          |         |         |         | 
       Control      | 0.177   |         |         | 
       Treated      | 0.197   |         |         | 
       Diff (T-C)   | 0.020   | 0.002   | 8.94    | 0.000***
    After           |         |         |         | 
       Control      | 0.176   |         |         | 
       Treated      | 0.227   |         |         | 
       Diff (T-C)   | 0.051   | 0.004   | 14.54   | 0.000***
                    |         |         |         | 
    Diff-in-Diff    | 0.031   | 0.004   | 7.52    | 0.000***
    --------------------------------------------------------
    R-square:    0.00
    * Means and Standard Errors are estimated by linear regression
    **Inference: *** p<0.01; ** p<0.05; * p<0.1
    
    . xtreg recid NYC period did
    
    Random-effects GLS regression                   Number of obs     =    181,001
    Group variable: id                              Number of groups  =    131,042
    
    R-squared:                                      Obs per group:
         Within  = 0.0135                                         min =          1
         Between = 0.0017                                         avg =        1.4
         Overall = 0.0017                                         max =         18
    
                                                    Wald chi2(3)      =     312.01
    corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
           recid | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             NYC |   .0199348   .0022296     8.94   0.000     .0155649    .0243047
          period |  -.0007579   .0025199    -0.30   0.764    -.0056969    .0041811
             did |   .0314014   .0041762     7.52   0.000     .0232161    .0395866
           _cons |   .1768617   .0014139   125.08   0.000     .1740904     .179633
    -------------+----------------------------------------------------------------
         sigma_u |          0
         sigma_e |  .50574135
             rho |          0   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    
    . xtdidregress (recid) (did), group(id) time(period)
    
    Number of groups and treatment time
    
    Time variable: period
    Control:       did = 0
    Treatment:     did = 1
    -----------------------------------
                 |   Control  Treatment
    -------------+---------------------
    Group        |
              id |    117507      13535
    -------------+---------------------
    Time         |
         Minimum |         0          1
         Maximum |         1          1
    -----------------------------------
    
    Difference-in-differences regression                   Number of obs = 181,001
    Data type: Longitudinal
    
                                   (Std. err. adjusted for 131,042 clusters in id)
    ------------------------------------------------------------------------------
                 |               Robust
           recid | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    ATET         |
             did |
       (1 vs 0)  |  -.1926986    .018757   -10.27   0.000     -.229462   -.1559353
    ------------------------------------------------------------------------------
    Note: ATET estimate adjusted for panel effects and time effects.
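
    As a sanity check on the 0.0314 figure, the 2x2 estimate can be reproduced by hand from the four cell means implied by the -regress- coefficients reported above. A minimal sketch, using only those reported numbers (the scalar names are made up for illustration):

    ```stata
    * Control group, pre-reform: the constant
    scalar c_pre  = .1768617
    * Control group, post-reform: constant + period
    scalar c_post = .1768617 - .0007579
    * NYC, pre-reform: constant + NYC
    scalar t_pre  = .1768617 + .0199348
    * NYC, post-reform: constant + NYC + period + did
    scalar t_post = .1768617 + .0199348 - .0007579 + .0314014
    * 2x2 DID: change in NYC minus change in the comparison group
    display (t_post - t_pre) - (c_post - c_pre)
    ```

    The displayed value should match the did coefficient (about 0.0314), and the four scalars should line up with the rounded cell means in the -diff- table (0.177, 0.176, 0.197, 0.227).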



    • #3
      Hi Tessie,

      By default, -xtdidregress- fits a generalized DID, that is, a two-way fixed-effects DID. What you want is a 2x2 DID. Take a look at example 8 in the manual; it shows how to obtain equivalent results.

      https://www.stata.com/manuals/tedidr...f#tedidregress
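
      A hedged sketch of what a 2x2 specification might look like with the variables from the first post. Using group(NYC) is an assumption inferred from this thread, not a quotation from the manual example:

      ```stata
      * Assumption: NYC (not id) is the group, so the model carries one group
      * effect and one time effect, as in the textbook 2x2 design
      didregress (recid) (did), group(NYC) time(period)
      * For comparison: the coefficient on did from the plain interaction regression
      regress recid NYC period did
      ```

      The ATET from -didregress- should then agree with the did coefficient from -regress-. The standard errors need not agree, since -didregress- clusters on the group() variable by default and there are only two NYC groups here.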



      • #4
        Hello Enrique,

        I did just what you suggested, and the results are now identical. Thank you so much for your quick response. One last question, if you don't mind: how should I distinguish between these results when interpreting them? That is, given that both are ATETs, would I be accurate in saying that the xtdidregress results account for the group and time fixed effects, while the didregress results (the 2x2 model) omit these fixed effects and simply make a pre/post comparison?

        Thanks again,
        Tessie
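
        For reference, the two specifications being compared can be written out side by side. This is only a sketch: it assumes the 2x2 model was fit with NYC as the group variable, and the symbols ($\alpha_i$, $\gamma_t$, $\delta$, $\beta_k$) are introduced here for illustration.

        ```latex
        % 2x2 DID: one group effect (NYC) and one time effect (period)
        \text{recid}_{it} = \beta_0 + \beta_1\,\text{NYC}_i + \beta_2\,\text{period}_t
                            + \delta\,\text{did}_{it} + \varepsilon_{it}

        % xtdidregress with group(id) time(period): one effect per id, plus the time effect
        \text{recid}_{it} = \alpha_i + \gamma_t + \delta\,\text{did}_{it} + \varepsilon_{it}
        ```

        Under that assumption, both models include group and time effects. The difference is the level of the group effect: a single NYC indicator in the first, versus a separate effect $\alpha_i$ for every id in the second, in which case $\delta$ is identified only from ids observed in both periods.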
