  • Diff-in-diff with panel data, fixed-effect estimation and "faking" the baseline difference

    I'm estimating a difference-in-differences model using panel data. I have two waves of data (pre-treatment and post-treatment) and two groups (treated and untreated). Treatment occurs between the two waves, only for the "treated" group.
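
    In equation form, the two-wave design described above can be sketched as follows. The notation is an assumption introduced here for illustration and does not come from any output in this post: \alpha_i is a person-level effect, Wave2_t is an indicator for the post-treatment wave, and \delta is the difference-in-differences effect.

```latex
y_{it} = \alpha_i + \gamma\,\mathrm{Wave2}_t
       + \delta\,(\mathrm{Treat}_i \times \mathrm{Wave2}_t) + \varepsilon_{it}
```

    Because Treat_i does not vary within a person, its main effect cannot be separated from \alpha_i, which is why a within estimator drops it and leaves the wave-1 group levels unidentified.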

    This is relatively straightforward to estimate with a fixed effect model:

    Code:
    . xtset xwaveid wave
    
    . xtreg outcome treat##wave, fe
    note: 1.treat omitted because of collinearity
    
    Fixed-effects (within) regression               Number of obs     =      5,716
    Group variable: xwaveid                         Number of groups  =      3,094
    
    R-sq:                                           Obs per group:
         within  = 0.0068                                         min =          1
         between = 0.0016                                         avg =        1.8
         overall = 0.0002                                         max =          2
    
                                                    F(2,2620)         =       8.90
    corr(u_i, Xb)  = -0.0783                        Prob > F          =     0.0001
    
    ---------------------------------------------------------------------------------
            outcome |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ----------------+----------------------------------------------------------------
              treat |
           Treated  |          0  (omitted)
                    |
               wave |
            Wave 2  |    .055467   .0146629     3.78   0.000     .0267148    .0842191
                    |
         treat#wave |
    Treated#Wave 2  |  -.0708387   .0168099    -4.21   0.000    -.1038008   -.0378766
                    |
              _cons |    .171515   .0047576    36.05   0.000     .1621859     .180844
    ----------------+----------------------------------------------------------------
            sigma_u |  .21878326
            sigma_e |  .25962105
                rho |    .415255   (fraction of variance due to u_i)
    ---------------------------------------------------------------------------------
    F test that all u_i=0: F(3093, 2620) = 1.23                  Prob > F = 0.0000
    So far so good. However, in a prior study that did NOT have panel data, I estimated DID models and then constructed nifty figures that sought to show the estimated effect, along with the parallel paths assumption. Those figures look like this:

    [Figure: didplot.png]
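
    For reference, a figure of this kind can be assembled from four cell means plus a counterfactual path for the treated group. The block below is a minimal sketch, not the code behind didplot.png. As placeholders it uses the four `margins` estimates from the pooled regression reported later in this post. The variable names (`untreated`, `treated`, `counterfactual`) are made up for illustration.

```stata
* Sketch only: cell means typed in by hand from the pooled-model margins
clear
input wave untreated treated
1 .1241663 .1877527
2 .1830587 .1697253
end

* Parallel-paths counterfactual: treated wave-1 level plus the untreated change
generate counterfactual = treated[1] + (untreated - untreated[1])

* Observed paths as solid lines, counterfactual path dashed
twoway (connected untreated wave)                          ///
       (connected treated wave)                            ///
       (connected counterfactual wave, lpattern(dash)),    ///
       xlabel(1 "Wave 1" 2 "Wave 2")                       ///
       ytitle("Predicted outcome")                         ///
       legend(order(1 "Untreated" 2 "Treated" 3 "Treated, counterfactual"))
```

    With these inputs, the wave-2 gap between the treated and counterfactual lines equals the pooled interaction coefficient (about -.0769), not the fixed-effects one (about -.0708). The two presumably differ because the panel is unbalanced.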

    I'd like to produce similar charts in the current analysis; however, with the fixed-effect estimation the wave-1 point estimates are not identified. My question is, would the following be a reasonable (or semi-reasonable) way to recover estimates of those wave-1 levels in order to produce the figure:

    Code:
    . predict xbu, xbu
    
    . table treat if wave==1, c(mean xbu)
    
    ----------------------
        treat |  mean(xbu)
    ----------+-----------
    Untreated |   .1241663
      Treated |   .1877528
    ----------------------
    This produces the same wave-1 estimates that I get if I simply ignore the panel nature of the data:

    Code:
    . reg outcome treat##wave, cluster(xwaveid )
    
    Linear regression                               Number of obs     =      5,716
                                                    F(3, 3093)        =      13.16
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.0057
                                                    Root MSE          =     .27429
    
                                   (Std. Err. adjusted for 3,094 clusters in xwaveid)
    ---------------------------------------------------------------------------------
                    |               Robust
            outcome |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ----------------+----------------------------------------------------------------
              treat |
           Treated  |   .0635864   .0104371     6.09   0.000     .0431221    .0840507
                    |
               wave |
            Wave 2  |   .0588923   .0138013     4.27   0.000     .0318317     .085953
                    |
         treat#wave |
    Treated#Wave 2  |  -.0769198   .0159763    -4.81   0.000    -.1082451   -.0455946
                    |
              _cons |   .1241663   .0085618    14.50   0.000      .107379    .1409536
    ---------------------------------------------------------------------------------
    
    . margins treat#wave
    
    Adjusted predictions                            Number of obs     =      5,716
    Model VCE    : Robust
    
    Expression   : Linear prediction, predict()
    
    -----------------------------------------------------------------------------------
                      |            Delta-method
                      |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ------------------+----------------------------------------------------------------
           treat#wave |
    Untreated#Wave 1  |   .1241663   .0085618    14.50   0.000      .107379    .1409536
    Untreated#Wave 2  |   .1830587    .011172    16.39   0.000     .1611534    .2049639
      Treated#Wave 1  |   .1877527    .005969    31.45   0.000     .1760492    .1994563
      Treated#Wave 2  |   .1697253   .0060542    28.03   0.000     .1578546     .181596
    -----------------------------------------------------------------------------------
    But this feels like cheating. More specifically, is this, in general, a valid (or semi-valid) approach?
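
    One way to probe why the two sets of wave-1 numbers coincide is to repeat the exercise on simulated data. The block below is a sketch under assumptions: the data-generating process, sample size, seed and attrition rate are all invented and are not taken from the data above. It fits the same fixed-effects specification on an unbalanced two-wave panel and compares the cell means of `xbu` with the raw cell means of the outcome.

```stata
* Simulated two-wave panel (all parameter values are illustrative assumptions)
clear
set seed 12345
set obs 500
generate id    = _n
generate treat = runiform() < 0.6          // time-invariant group indicator
generate u     = rnormal()                 // person-level effect
expand 2
bysort id: generate wave = _n              // waves 1 and 2
generate outcome = 0.2 + 0.1*treat + 0.05*(wave==2) ///
                 - 0.07*treat*(wave==2) + u + rnormal()

* Drop observations at random so the panel is unbalanced
drop if runiform() < 0.1

xtset id wave
xtreg outcome i.treat##i.wave, fe
predict double xbu, xbu                    // xb plus the estimated u_i

* Compare mean(xbu) with the raw mean of the outcome in each treat-by-wave cell
forvalues g = 0/1 {
    forvalues w = 1/2 {
        quietly summarize xbu if treat==`g' & wave==`w', meanonly
        local m_xbu = r(mean)
        quietly summarize outcome if treat==`g' & wave==`w', meanonly
        display "treat=`g' wave=`w'  mean(xbu)=" %9.6f `m_xbu' ///
                "  mean(outcome)=" %9.6f r(mean)
    }
}
```

    In this specification the within residuals sum to zero inside every treat-by-wave cell, so the cell means of `xbu` reproduce the raw cell means. Those raw means are also what the saturated pooled regression reports through `margins`. The match is therefore mechanical rather than extra information recovered from the fixed-effects fit.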

    Thanks!