  • Difference-in-Differences / Event study coefficient plot

    Dear community,

    I ran a ppmlhdfe regression for a triple difference-in-differences strategy. To see how the treatment effects unfold over time and to check for pre-trends, I included treatment dummies for the whole event window, i.e. t-5, ..., t, t+1, ..., t+5. For the triple DiD, I interacted each treatment time dummy with a dummy variable (=1 for financial sectors). Now I want to plot my results as in, e.g., Keller & Utar (2022), Fig. 4 (see attached): each treatment time dummy coefficient with its confidence interval, so that the x-axis is a timeline. Additionally, I want to plot the coefficients on the financial interaction terms in the same plot, on a shared x-axis. With the command coefplot, I am only able to plot the pure event time dummies, and I don't see how I could layer the interaction term coefficients and use the same x-axis for both. Does anyone have an idea?

    Thank you in advance.

    Best
    Noemi
    [Attached image: Keller_Utar.png]

  • #2
    You will increase your chances of obtaining a helpful reply by providing a reproducible example. If you cannot share your dataset or if it is too large, consider looking at the implementations of DDD in Stata's documentation and ask your question using one of Stata's sample datasets.


    • #3
      Dear Andrew Musau

      thank you very much for your answer. The regression command I used is:

      Code:
      local gravity_sectorlevel lngdp_o lngdp_d lndistw lnsumgdp comcol col45 comlang_off lnsmp_dest t t_plus_1 t_plus_2 t_plus_3 t_plus_4 t_minus_2 t_minus_3 t_minus_4 t_minus_5 financial_t financial_t_plus_1 financial_t_plus_2 financial_t_plus_3  financial_t_plus_4 financial_t_minus_2 financial_t_minus_3 financial_t_minus_4 financial_t_minus_5
      ppmlhdfe TotalassetsthUSD `gravity_sectorlevel', absorb(year iso3_d iso3_o country_pair_encode naics2 naics2__*) d cluster(country_pair_encode)
      My output is:

      HDFE PPML regression                              No. of obs      =    254,702
      Absorbing 280 HDFE groups                         Residual df     =      3,423
      Statistics robust to heteroskedasticity           Wald chi2(22)   =     126.95
      Deviance             =  7.83285e+11               Prob > chi2     =     0.0000
      Log pseudolikelihood = -3.91644e+11               Pseudo R2       =     0.8283

      Number of clusters (country_pair_encode)= 3,424
      (Std. err. adjusted for 3,424 clusters in country_pair_encode)
      -----------------------------------------------------------------------------------
                          |               Robust
         TotalassetsthUSD | Coefficient  std. err.      z   P>|z|    [95% conf. interval]
      --------------------+--------------------------------------------------------------
                  lngdp_o |   .6765901    .457133    1.48   0.139   -.2193741    1.572554
                  lngdp_d |   1.648271   .3677026    4.48   0.000    .9275873    2.368955
                  lndistw |          0  (omitted)
                 lnsumgdp |   .2850671   .5995587    0.48   0.634   -.8900464    1.460181
                   comcol |          0  (omitted)
                    col45 |          0  (omitted)
              comlang_off |          0  (omitted)
               lnsmp_dest |  -.4195915    .558553   -0.75   0.453   -1.514335    .6751524
                        t |  -.2460435   .1905135   -1.29   0.197    -.619443    .1273561
                 t_plus_1 |  -.2669294    .178963   -1.49   0.136   -.6176903    .0838316
                 t_plus_2 |  -.1795388   .1410546   -1.27   0.203   -.4560008    .0969232
                 t_plus_3 |  -.2099838    .148699   -1.41   0.158   -.5014284    .0814609
                 t_plus_4 |   -.059756   .1373693   -0.44   0.664   -.3289948    .2094828
                t_minus_2 |   .2546932   .1470864    1.73   0.083   -.0335909    .5429773
                t_minus_3 |   .1684684   .1827159    0.92   0.357   -.1896482     .526585
                t_minus_4 |   .0784418    .244383    0.32   0.748     -.40054    .5574237
                t_minus_5 |   .0225643   .2182389    0.10   0.918   -.4051762    .4503047
              financial_t |   .2458805    .333316    0.74   0.461   -.4074069    .8991678
       financial_t_plus_1 |    .174059   .3044075    0.57   0.567   -.4225687    .7706866
       financial_t_plus_2 |  -.2302709   .3156371   -0.73   0.466   -.8489081    .3883664
       financial_t_plus_3 |   .1033957   .2743018    0.38   0.706    -.434226    .6410174
       financial_t_plus_4 |   .0790977   .2657346    0.30   0.766   -.4417325    .5999279
      financial_t_minus_2 |  -.1150269    .283329   -0.41   0.685   -.6703416    .4402878
      financial_t_minus_3 |  -.0931899   .3020419   -0.31   0.758   -.6851811    .4988013
      financial_t_minus_4 |   .0428981   .3582052    0.12   0.905   -.6591712    .7449675
      financial_t_minus_5 |   .2522656   .3699683    0.68   0.495   -.4728589    .9773901
                    _cons |  -45.87423   21.27649   -2.16   0.031   -87.57539   -4.173076
      -----------------------------------------------------------------------------------

      Does this information already help?

      Best
      Noemi



      • #4
        No, I need to be able to reproduce the regression results. See if you can create an example from the dataset below.

        Code:
        use "http://fmwww.bc.edu/RePEc/bocode/e/EXAMPLE_TRADE_FTA_DATA" if category=="TOTAL", clear
        egen imp = group(isoimp)
        egen exp = group(isoexp)
        ppmlhdfe trade fta, a(imp#year exp#year imp#exp) cluster(imp#exp)


        • #5
          Dear Andrew Musau

          thank you for the data example. With your data I did the following:

          Code:
          use "http://fmwww.bc.edu/RePEc/bocode/e/EXAMPLE_TRADE_FTA_DATA" if category=="TOTAL", clear
          egen imp = group(isoimp)
          egen exp = group(isoexp)
          ppmlhdfe trade fta, a(imp#year exp#year imp#exp) cluster(imp#exp)
          
          
          gen t = (isoimp == "GBR" & year == 1996)
          gen t_plus_1 = ( isoimp == "GBR" & year == 2000)
          gen t_plus_2 = ( isoimp == "GBR" & year == 2004)
          gen t_minus_1 = (isoimp == "GBR" & year == 1992)
          gen t_minus_2 = (isoimp == "GBR" & year == 1988)
          
          gen fta_t = (fta == 1 & t == 1)
          gen fta_t_plus_1 = (fta == 1 & t_plus_1 == 1)
          gen fta_t_plus_2 = (fta == 1 & t_plus_2 == 1)
          gen fta_t_minus_1 = (fta == 1 & t_minus_1 == 1)
          gen fta_t_minus_2 = (fta == 1 & t_minus_2 == 1)
          
          ppmlhdfe trade ln_distw contig colony comlang_off comleg  t_minus_2 t_minus_1 t t t_plus_1 t_plus_2 fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2 ,  cluster(imp#exp)
          I did not include fixed effects as that led to collinearity omission of the event time dummies. As the data set only captured 5 years, it is a model with very few event time dummies.

          The regression yields:


          . ppmlhdfe trade ln_distw contig colony comlang_off comleg t_minus_2 t_minus_1 t t t_plus_1 t_plus_2 fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2 , cluster(imp#exp)
          note: 1 variable omitted because of collinearity: t
          Iteration 1: deviance = 5.4676e+10 eps = . iters = 1 tol = 1.0e-04 min(eta) = -1.59 P
          Iteration 2: deviance = 4.7290e+10 eps = 1.56e-01 iters = 1 tol = 1.0e-04 min(eta) = -1.82
          Iteration 3: deviance = 4.6565e+10 eps = 1.56e-02 iters = 1 tol = 1.0e-04 min(eta) = -1.86
          Iteration 4: deviance = 4.6542e+10 eps = 4.76e-04 iters = 1 tol = 1.0e-04 min(eta) = -1.94
          Iteration 5: deviance = 4.6542e+10 eps = 1.04e-06 iters = 1 tol = 1.0e-04 min(eta) = -1.94
          Iteration 6: deviance = 4.6542e+10 eps = 7.84e-12 iters = 1 tol = 1.0e-05 min(eta) = -1.94 S O
          ------------------------------------------------------------------------------------------------------------
          (legend: p: exact partial-out s: exact solver h: step-halving o: epsilon below tolerance)
          Converged in 6 iterations and 6 HDFE sub-iterations (tol = 1.0e-08)

          PPML regression                                   No. of obs      =      5,950
                                                            Residual df     =      1,189
          Statistics robust to heteroskedasticity           Wald chi2(15)   =     504.10
          Deviance             =  4.65423e+10               Prob > chi2     =     0.0000
          Log pseudolikelihood = -2.32712e+10               Pseudo R2       =     0.2407

          Number of clusters (imp#exp)= 1,190
          (Std. err. adjusted for 1,190 clusters in imp#exp)
          -----------------------------------------------------------------------------
                        |               Robust
                  trade | Coefficient  std. err.      z   P>|z|    [95% conf. interval]
          --------------+--------------------------------------------------------------
               ln_distw |  -.1280395   .1178388   -1.09   0.277   -.3589993    .1029202
                 contig |   1.861247   .3870988    4.81   0.000    1.102547    2.619947
                 colony |  -.1748891   .3871272   -0.45   0.651   -.9336445    .5838663
            comlang_off |   .2923782   .2398907    1.22   0.223   -.1777989    .7625553
                 comleg |  -.0969236   .1542019   -0.63   0.530   -.3991538    .2053066
              t_minus_2 |   .2464763   .4263148    0.58   0.563   -.5890854    1.082038
              t_minus_1 |   .3643847   .4272125    0.85   0.394   -.4729363    1.201706
                      t |   .8321231   .4087671    2.04   0.042    .0309542    1.633292
                      t |          0  (omitted)
               t_plus_1 |   1.362596   .3649508    3.73   0.000     .647306    2.077887
               t_plus_2 |   1.513096   .3393175    4.46   0.000    .8480462    2.178146
          fta_t_minus_2 |   .4993242   .5660291    0.88   0.378   -.6100724    1.608721
          fta_t_minus_1 |   .3745097   .5654629    0.66   0.508   -.7337772    1.482797
                  fta_t |   .0743812   .5473484    0.14   0.892   -.9984019    1.147164
           fta_t_plus_1 |  -.3461088   .4899957   -0.71   0.480   -1.306483    .6142651
           fta_t_plus_2 |  -.1024637   .4757111   -0.22   0.829    -1.03484    .8299129
                  _cons |   15.76353   .9989984   15.78   0.000    13.80553    17.72154
          -----------------------------------------------------------------------------




          What I would like to do is a coefficient plot of the coefficients of t-2, t-1, t, t+1, t+2. The x-axis should be the event timeline with a pre-event window (t-2, t-1) and a post-event window (t+1, t+2), showing for each event time the coefficient with its confidence interval, as with:

          Code:
          coefplot, vertical drop(_cons ln_distw contig colony comlang_off comleg fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2)

          which gives me this plot:

          [Attached image: event plot.png]


          But what I want is to overlay this plot with the coefficients of the interaction terms: next to the coefficients of t_minus_2, t_minus_1, etc., I want the coefficients of fta_t_minus_2, fta_t_minus_1, etc., so that the x-axis (event time) applies to both the event time dummies and the interaction terms. Does that make sense?

          I appreciate your help!


          Best
          Noemi


          • #6
            You just need the coefficients to be named consistently. Consider a renaming "trick":

            Code:
            use "http://fmwww.bc.edu/RePEc/bocode/e/EXAMPLE_TRADE_FTA_DATA" if category=="TOTAL", clear
            egen imp = group(isoimp)
            egen exp = group(isoexp)
            ppmlhdfe trade fta, a(imp#year exp#year imp#exp) cluster(imp#exp)
            
            
            gen t = (isoimp == "GBR" & year == 1996)
            gen t_plus_1 = ( isoimp == "GBR" & year == 2000)
            gen t_plus_2 = ( isoimp == "GBR" & year == 2004)
            gen t_minus_1 = (isoimp == "GBR" & year == 1992)
            gen t_minus_2 = (isoimp == "GBR" & year == 1988)
            
            gen fta_t = (fta == 1 & t == 1)
            gen fta_t_plus_1 = (fta == 1 & t_plus_1 == 1)
            gen fta_t_plus_2 = (fta == 1 & t_plus_2 == 1)
            gen fta_t_minus_1 = (fta == 1 & t_minus_1 == 1)
            gen fta_t_minus_2 = (fta == 1 & t_minus_2 == 1)
            
            ppmlhdfe trade ln_distw contig colony comlang_off comleg  t_minus_2 t_minus_1 t t_plus_1 t_plus_2 fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2 ,  cluster(imp#exp)
            estimates store m1

            *RENAME
            rename (t_* fta_t_*) (fta_t_* t_*)
            ppmlhdfe trade ln_distw contig colony comlang_off comleg  t_minus_2 t_minus_1 t t_plus_1 t_plus_2 fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2 ,  cluster(imp#exp)
            estimates store m2
            
            *REVERSE RENAME
            rename (t_* fta_t_*) (fta_t_* t_*)
            coefplot m1 m2, keep(t*) msy(Oh) msize(1.5) vert recast(connected) ciopts(recast(rcap)) offset(0) leg(order(2 "Wanted 1" 4 "Wanted 2"))
            [Attached image: Graph.png]
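
            A possible alternative to renaming the variables and re-estimating, sketched here under assumptions rather than taken from the thread: coefplot can plot the same stored model twice and relabel the second series with its rename() option. The regex pattern and the legend labels below are illustrative choices, and the sketch assumes that keep() is matched against the original coefficient names before rename() is applied. Because keep(t t_*) and keep(fta_t*) list the event-time-0 terms explicitly, the pair t / fta_t is included as well; the wildcard patterns t_* and fta_t_* used in the variable rename do not match those two names.

            ```stata
            * Sketch only: requires the community-contributed ppmlhdfe and coefplot packages.
            use "http://fmwww.bc.edu/RePEc/bocode/e/EXAMPLE_TRADE_FTA_DATA" if category=="TOTAL", clear
            egen imp = group(isoimp)
            egen exp = group(isoexp)

            * Event-time dummies as defined earlier in the thread
            gen t         = (isoimp == "GBR" & year == 1996)
            gen t_plus_1  = (isoimp == "GBR" & year == 2000)
            gen t_plus_2  = (isoimp == "GBR" & year == 2004)
            gen t_minus_1 = (isoimp == "GBR" & year == 1992)
            gen t_minus_2 = (isoimp == "GBR" & year == 1988)

            * Interactions with fta, one per event-time dummy
            foreach v in t t_plus_1 t_plus_2 t_minus_1 t_minus_2 {
                gen fta_`v' = (fta == 1 & `v' == 1)
            }

            ppmlhdfe trade ln_distw contig colony comlang_off comleg ///
                t_minus_2 t_minus_1 t t_plus_1 t_plus_2 ///
                fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2, cluster(imp#exp)
            estimates store m1

            * Series 1: event-time dummies. Series 2: interactions, with the "fta_" prefix
            * stripped at plotting time so both series share the same x-axis categories.
            coefplot (m1, keep(t t_*) msymbol(Oh) msize(1.5) ciopts(recast(rcap)) recast(connected)) ///
                     (m1, keep(fta_t*) rename(^fta_(.*)$ = \1, regex) msymbol(Sh) msize(1.5) ///
                          ciopts(lpattern(dash) recast(rcap)) recast(connected) lpattern(dash)), ///
                     vertical offset(0) legend(order(2 "Event-time dummies" 4 "FTA interactions"))
            ```

            With this variant the model is estimated only once, so both series come from the same set of estimates.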


            • #7
              Dear Andrew Musau

              thank you so much, that was exactly what I wanted to do and it works also with my own regression. One question left: Do you know by any chance whether it is possible to make the line that connects the coefficients of m2 dotted while leaving the line for m1 continuous? By "lpattern(dot)", I can just make both lines dotted...

              Best
              Noemi


              • #8
                I have a dashed line instead of a dotted line below. You can change this as you please.

                Code:
                use "http://fmwww.bc.edu/RePEc/bocode/e/EXAMPLE_TRADE_FTA_DATA" if category=="TOTAL", clear
                egen imp = group(isoimp)
                egen exp = group(isoexp)
                ppmlhdfe trade fta, a(imp#year exp#year imp#exp) cluster(imp#exp)
                
                
                gen t = (isoimp == "GBR" & year == 1996)
                gen t_plus_1 = ( isoimp == "GBR" & year == 2000)
                gen t_plus_2 = ( isoimp == "GBR" & year == 2004)
                gen t_minus_1 = (isoimp == "GBR" & year == 1992)
                gen t_minus_2 = (isoimp == "GBR" & year == 1988)
                
                gen fta_t = (fta == 1 & t == 1)
                gen fta_t_plus_1 = (fta == 1 & t_plus_1 == 1)
                gen fta_t_plus_2 = (fta == 1 & t_plus_2 == 1)
                gen fta_t_minus_1 = (fta == 1 & t_minus_1 == 1)
                gen fta_t_minus_2 = (fta == 1 & t_minus_2 == 1)
                
                ppmlhdfe trade ln_distw contig colony comlang_off comleg  t_minus_2 t_minus_1 t t_plus_1 t_plus_2 fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2 ,  cluster(imp#exp)
                estimates store m1

                *RENAME
                rename (t_* fta_t_*) (fta_t_* t_*)
                ppmlhdfe trade ln_distw contig colony comlang_off comleg  t_minus_2 t_minus_1 t t_plus_1 t_plus_2 fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2 ,  cluster(imp#exp)
                estimates store m2
                
                *REVERSE RENAME
                rename (t_* fta_t_*) (fta_t_* t_*)
                
                coefplot (m1, keep(t*) msy(Oh) msize(1.5) ciopts(recast(rcap)) recast(connected)) ///
                (m2, keep(t*) ms(Sh) msize(1.5) ciopts(lp(dash) recast(rcap)) recast(connected) lp(dash)), ///
                vert offset(0) leg(order(2 "Wanted 1" 4 "Wanted 2"))
                [Attached image: Graph.png]

                Last edited by Andrew Musau; 28 Aug 2024, 13:34.


                • #9
                  Dear Andrew Musau ,

                  awesome, thank you so much. I just need the confidence intervals to be 90% instead of 95%. Do you have an idea how this can be done in this plot?

                  Best and thank you very much
                  Noemi


                  • #10
                    If you look at
                    Code:
                    h coefplot
                    you will see that one of the options, levels(), is for exactly that purpose.
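
                    As a minimal sketch of that option, using one of Stata's sample datasets rather than the thread's model (the regression itself is only a placeholder): levels() takes the confidence level in percent and can be added as a global option to any of the coefplot calls above.

                    ```stata
                    * Sketch on a Stata sample dataset; requires the coefplot package.
                    sysuse auto, clear
                    regress mpg weight length foreign, robust

                    * 90% confidence intervals instead of the default level
                    coefplot, drop(_cons) vertical levels(90)

                    * Several levels at once: 95% and 90% intervals in the same plot
                    coefplot, drop(_cons) vertical levels(95 90)
                    ```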


                    • #11
                      Dear Rich Goldstein thank you very much for the hint!

                      Maybe also linking Andrew Musau on this post:

                      I have a very tricky adjustment to make to the plot. The coefficient of t_minus_1 has to be omitted from the regression since it should serve as the baseline. In the plot, however, t_minus_1 should remain on the x-axis, with its coefficient shown as zero without a confidence interval. It is important that t_minus_1 does not disappear completely from the plot (as it does if I simply omit the term from my regression), because then the timeline on the x-axis would jump from t-2 to t. There needs to be a dot at 0 on the y-axis for t-1.

                      I tried to solve my problem including a variable in the regression (in the order of regressors between t_minus_2 and t) which is omitted (hoping that this would appear as a zero coefficient in coefplot), but the coefficient did not appear in the plot and instead, there is now the jump from t-2 to t which I had hoped to eliminate.

                      Does anyone have an idea? Thank you in advance!

                      Best
                      Noemi


                      • #12
                        Code:
                        use "http://fmwww.bc.edu/RePEc/bocode/e/EXAMPLE_TRADE_FTA_DATA" if category=="TOTAL", clear
                        egen imp = group(isoimp)
                        egen exp = group(isoexp)
                        ppmlhdfe trade fta, a(imp#year exp#year imp#exp) cluster(imp#exp)
                        
                        
                        gen t = (isoimp == "GBR" & year == 1996)
                        gen t_plus_1 = ( isoimp == "GBR" & year == 2000)
                        gen t_plus_2 = ( isoimp == "GBR" & year == 2004)
                        gen t_minus_1 = (isoimp == "GBR" & year == 1992)
                        gen t_minus_2 = (isoimp == "GBR" & year == 1988)
                        
                        gen fta_t = (fta == 1 & t == 1)
                        gen fta_t_plus_1 = (fta == 1 & t_plus_1 == 1)
                        gen fta_t_plus_2 = (fta == 1 & t_plus_2 == 1)
                        gen fta_t_minus_1 = (fta == 1 & t_minus_1 == 1)
                        gen fta_t_minus_2 = (fta == 1 & t_minus_2 == 1)
                        
                        ppmlhdfe trade ln_distw contig colony comlang_off comleg  t_minus_2 t_minus_1 t t_plus_1 t_plus_2 fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2 ,  cluster(imp#exp)
                        estimates store m1

                        *RENAME
                        rename (t_* fta_t_*) (fta_t_* t_*)
                        ppmlhdfe trade ln_distw contig colony comlang_off comleg  t_minus_2 t_minus_1 t t_plus_1 t_plus_2 fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2 ,  cluster(imp#exp)
                        estimates store m2
                        
                        *REVERSE RENAME
                        rename (t_* fta_t_*) (fta_t_* t_*)
                        
                        coefplot (m1, keep(t*) msy(Oh) msize(1.5) ciopts(recast(rcap)) recast(connected)) ///
                        (m2, keep(t*) ms(Th) msize(1.5) ciopts(lp(dash) recast(rcap)) recast(connected) lp(dash)), ///
                        vert offset(0) leg(order(2 "Wanted 1" 4 "Wanted 2")) transform(*t_minus_1= @*0) 
                        [Attached image: Graph.png]


                        • #13
                          Dear Andrew Musau

                          thank you for your response. Although the code works, I see that now, when I include t_minus_1 in the regression, one of the variables is omitted by Stata due to collinearity. It automatically omits t_plus_2 in my case. It is really tricky because the reason why I want to force the coefficient of t_minus_1 to be zero is exactly because t-1 should be omitted from the regression to serve as the reference category. But it seems that I need to include it to be able to have it in the coefplot. Do you have any other idea?

                          Thank you and best wishes
                          Noemi


                          • #14
                            You can use the "o." operator to omit a level, even if including the indicators manually. The following illustrates:

                            Code:
                            sysuse auto, clear
                            tab rep78, gen(repair)
                            regress mpg repair1 repair2 o.repair3 repair4 repair5, robust
                            Res.:

                            Code:
                            . tab rep78, gen(repair)
                            
                                 Repair |
                            record 1978 |      Freq.     Percent        Cum.
                            ------------+-----------------------------------
                                      1 |          2        2.90        2.90
                                      2 |          8       11.59       14.49
                                      3 |         30       43.48       57.97
                                      4 |         18       26.09       84.06
                                      5 |         11       15.94      100.00
                            ------------+-----------------------------------
                                  Total |         69      100.00
                            
                            .
                            . regress mpg repair1 repair2 o.repair3 repair4 repair5, robust
                            
                            Linear regression                               Number of obs     =         69
                                                                            F(4, 64)          =       2.72
                                                                            Prob > F          =     0.0370
                                                                            R-squared         =     0.2348
                                                                            Root MSE          =     5.2897
                            
                            ------------------------------------------------------------------------------
                                         |               Robust
                                     mpg | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                            -------------+----------------------------------------------------------------
                                 repair1 |   1.566667   2.333959     0.67   0.504    -3.095953    6.229287
                                 repair2 |  -.3083333   1.503803    -0.21   0.838    -3.312525    2.695858
                                 repair3 |          0  (omitted)
                                 repair4 |   2.233333    1.40478     1.59   0.117    -.5730381    5.039705
                                 repair5 |   7.930303   2.718488     2.92   0.005     2.499498    13.36111
                                   _cons |   19.43333   .7718833    25.18   0.000     17.89132    20.97535
                            ------------------------------------------------------------------------------
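
                            To carry the omitted level over into a plot, one option worth trying is coefplot's omitted option, which asks coefplot to include omitted coefficients instead of dropping them. The sketch below continues the auto example; it has not been checked against ppmlhdfe output, so whether an o.-prefixed event-time dummy behaves the same way there is an assumption.

                            ```stata
                            * Sketch only: requires the coefplot package.
                            sysuse auto, clear
                            tab rep78, gen(repair)
                            regress mpg repair1 repair2 o.repair3 repair4 repair5, robust

                            * omitted keeps the omitted coefficient, so repair3 should appear at zero
                            * between repair2 and repair4 rather than vanish from the axis
                            coefplot, drop(_cons) omitted vertical yline(0)
                            ```

                            In an event-study setting the same idea would mean entering the baseline period as o.t_minus_1 and adding omitted to the coefplot call.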


                            • #15
                              Dear Andrew Musau

                              okay, thank you, good to know. However, the variable I omit with o. does then disappear from the coefplot. But I still want to include it with a coefficient value of 0...

                              Best
                              Noemi
