Difference-in-Differences / Event study coefficient plot

Noemi Seng

Join Date: Jan 2024

Posts: 90
#1


25 Aug 2024, 06:38

Dear community,

I ran a ppmlhdfe regression for a triple difference-in-differences strategy. To see how the treatment effects unfold over time and to check the pre-trend, I included treatment dummies for the whole event window, i.e. t-5, ..., t, t+1, ..., t+5. For the triple DiD, I interacted each treatment time dummy with a dummy variable (=1 for financial sectors). Now I want to plot my results as in, e.g., Keller & Utar (2022), Fig. 4 (see attached): each treatment time dummy coefficient with its confidence interval, so that the x-axis is a timeline. Additionally, I want to plot the coefficients on the financial interaction terms in the same plot, on a shared x-axis. With the command coefplot, I am only able to plot the pure event time dummies; I don't see how I could layer the interaction term coefficients and use the same x-axis for both. Does someone have an idea?

Thank you in advance.

Best
Noemi
Andrew Musau

Join Date: Oct 2014

Posts: 10190
#2

25 Aug 2024, 13:38

You will increase your chances of obtaining a helpful reply by providing a reproducible example. If you cannot share your dataset or if it is too large, consider looking at the implementations of DDD in Stata's documentation and ask your question using one of Stata's sample datasets.
Noemi Seng

Join Date: Jan 2024

Posts: 90
#3

27 Aug 2024, 01:04

Dear Andrew Musau

thank you very much for your answer. The regression command I used is:

Code:

local gravity_sectorlevel lngdp_o lngdp_d lndistw lnsumgdp comcol col45 comlang_off lnsmp_dest ///
    t t_plus_1 t_plus_2 t_plus_3 t_plus_4 t_minus_2 t_minus_3 t_minus_4 t_minus_5 ///
    financial_t financial_t_plus_1 financial_t_plus_2 financial_t_plus_3 financial_t_plus_4 ///
    financial_t_minus_2 financial_t_minus_3 financial_t_minus_4 financial_t_minus_5
ppmlhdfe TotalassetsthUSD `gravity_sectorlevel', absorb(year iso3_d iso3_o country_pair_encode naics2 naics2__*) d cluster(country_pair_encode)

My output is:

HDFE PPML regression                              No. of obs      =    254,702
Absorbing 280 HDFE groups                         Residual df     =      3,423
Statistics robust to heteroskedasticity           Wald chi2(22)   =     126.95
Deviance             =  7.83285e+11               Prob > chi2     =     0.0000
Log pseudolikelihood = -3.91644e+11               Pseudo R2       =     0.8283

Number of clusters (country_pair_encode)=      3,424
                       (Std. err. adjusted for 3,424 clusters in country_pair_encode)
-------------------------------------------------------------------------------------
                    |               Robust
   TotalassetsthUSD | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
--------------------+----------------------------------------------------------------
            lngdp_o |   .6765901    .457133     1.48   0.139    -.2193741    1.572554
            lngdp_d |   1.648271   .3677026     4.48   0.000     .9275873    2.368955
            lndistw |          0  (omitted)
           lnsumgdp |   .2850671   .5995587     0.48   0.634    -.8900464    1.460181
             comcol |          0  (omitted)
              col45 |          0  (omitted)
        comlang_off |          0  (omitted)
         lnsmp_dest |  -.4195915    .558553    -0.75   0.453    -1.514335    .6751524
                  t |  -.2460435   .1905135    -1.29   0.197     -.619443    .1273561
           t_plus_1 |  -.2669294    .178963    -1.49   0.136    -.6176903    .0838316
           t_plus_2 |  -.1795388   .1410546    -1.27   0.203    -.4560008    .0969232
           t_plus_3 |  -.2099838    .148699    -1.41   0.158    -.5014284    .0814609
           t_plus_4 |   -.059756   .1373693    -0.44   0.664    -.3289948    .2094828
          t_minus_2 |   .2546932   .1470864     1.73   0.083    -.0335909    .5429773
          t_minus_3 |   .1684684   .1827159     0.92   0.357    -.1896482     .526585
          t_minus_4 |   .0784418    .244383     0.32   0.748      -.40054    .5574237
          t_minus_5 |   .0225643   .2182389     0.10   0.918    -.4051762    .4503047
        financial_t |   .2458805    .333316     0.74   0.461    -.4074069    .8991678
 financial_t_plus_1 |    .174059   .3044075     0.57   0.567    -.4225687    .7706866
 financial_t_plus_2 |  -.2302709   .3156371    -0.73   0.466    -.8489081    .3883664
 financial_t_plus_3 |   .1033957   .2743018     0.38   0.706     -.434226    .6410174
 financial_t_plus_4 |   .0790977   .2657346     0.30   0.766    -.4417325    .5999279
financial_t_minus_2 |  -.1150269    .283329    -0.41   0.685    -.6703416    .4402878
financial_t_minus_3 |  -.0931899   .3020419    -0.31   0.758    -.6851811    .4988013
financial_t_minus_4 |   .0428981   .3582052     0.12   0.905    -.6591712    .7449675
financial_t_minus_5 |   .2522656   .3699683     0.68   0.495    -.4728589    .9773901
              _cons |  -45.87423   21.27649    -2.16   0.031    -87.57539   -4.173076
-------------------------------------------------------------------------------------

Does this information already help?

Best
Noemi

Andrew Musau

Join Date: Oct 2014
Posts: 10190

27 Aug 2024, 02:52

No, I need to be able to reproduce the regression results. See if you can create an example from the dataset below.

Code:

use "http://fmwww.bc.edu/RePEc/bocode/e/EXAMPLE_TRADE_FTA_DATA" if category=="TOTAL", clear
egen imp = group(isoimp)
egen exp = group(isoexp)
ppmlhdfe trade fta, a(imp#year exp#year imp#exp) cluster(imp#exp)


Noemi Seng

Join Date: Jan 2024

Posts: 90
#5

28 Aug 2024, 02:02

Dear Andrew Musau

thank you for the data example. With your data I did the following:

Code:

use "http://fmwww.bc.edu/RePEc/bocode/e/EXAMPLE_TRADE_FTA_DATA" if category=="TOTAL", clear
egen imp = group(isoimp)
egen exp = group(isoexp)
ppmlhdfe trade fta, a(imp#year exp#year imp#exp) cluster(imp#exp)

gen t = (isoimp == "GBR" & year == 1996)
gen t_plus_1 = ( isoimp == "GBR" & year == 2000)
gen t_plus_2 = ( isoimp == "GBR" & year == 2004)
gen t_minus_1 = (isoimp == "GBR" & year == 1992)
gen t_minus_2 = (isoimp == "GBR" & year == 1988)

gen fta_t = (fta == 1 & t == 1)
gen fta_t_plus_1 = (fta == 1 & t_plus_1 == 1)
gen fta_t_plus_2 = (fta == 1 & t_plus_2 == 1)
gen fta_t_minus_1 = (fta == 1 & t_minus_1 == 1)
gen fta_t_minus_2 = (fta == 1 & t_minus_2 == 1)

ppmlhdfe trade ln_distw contig colony comlang_off comleg t_minus_2 t_minus_1 t t t_plus_1 t_plus_2 fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2 , cluster(imp#exp)

I did not include fixed effects because they caused the event time dummies to be omitted due to collinearity. As the data set only covers 5 years, the model has very few event time dummies.

The regression yields:

. ppmlhdfe trade ln_distw contig colony comlang_off comleg t_minus_2 t_minus_1 t t t_plus_1 t_plus_2 fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2 , cluster(imp#exp)
note: 1 variable omitted because of collinearity: t
Iteration 1: deviance = 5.4676e+10 eps = . iters = 1 tol = 1.0e-04 min(eta) = -1.59 P
Iteration 2: deviance = 4.7290e+10 eps = 1.56e-01 iters = 1 tol = 1.0e-04 min(eta) = -1.82
Iteration 3: deviance = 4.6565e+10 eps = 1.56e-02 iters = 1 tol = 1.0e-04 min(eta) = -1.86
Iteration 4: deviance = 4.6542e+10 eps = 4.76e-04 iters = 1 tol = 1.0e-04 min(eta) = -1.94
Iteration 5: deviance = 4.6542e+10 eps = 1.04e-06 iters = 1 tol = 1.0e-04 min(eta) = -1.94
Iteration 6: deviance = 4.6542e+10 eps = 7.84e-12 iters = 1 tol = 1.0e-05 min(eta) = -1.94 S O
------------------------------------------------------------------------------------------------------------
(legend: p: exact partial-out s: exact solver h: step-halving o: epsilon below tolerance)
Converged in 6 iterations and 6 HDFE sub-iterations (tol = 1.0e-08)

PPML regression                                   No. of obs      =      5,950
                                                  Residual df     =      1,189
Statistics robust to heteroskedasticity           Wald chi2(15)   =     504.10
Deviance             =  4.65423e+10               Prob > chi2     =     0.0000
Log pseudolikelihood = -2.32712e+10               Pseudo R2       =     0.2407

Number of clusters (imp#exp)=      1,190
                             (Std. err. adjusted for 1,190 clusters in imp#exp)
-------------------------------------------------------------------------------
              |               Robust
        trade | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
--------------+----------------------------------------------------------------
     ln_distw |  -.1280395   .1178388    -1.09   0.277    -.3589993    .1029202
       contig |   1.861247   .3870988     4.81   0.000     1.102547    2.619947
       colony |  -.1748891   .3871272    -0.45   0.651    -.9336445    .5838663
  comlang_off |   .2923782   .2398907     1.22   0.223    -.1777989    .7625553
       comleg |  -.0969236   .1542019    -0.63   0.530    -.3991538    .2053066
    t_minus_2 |   .2464763   .4263148     0.58   0.563    -.5890854    1.082038
    t_minus_1 |   .3643847   .4272125     0.85   0.394    -.4729363    1.201706
            t |   .8321231   .4087671     2.04   0.042     .0309542    1.633292
            t |          0  (omitted)
     t_plus_1 |   1.362596   .3649508     3.73   0.000      .647306    2.077887
     t_plus_2 |   1.513096   .3393175     4.46   0.000     .8480462    2.178146
fta_t_minus_2 |   .4993242   .5660291     0.88   0.378    -.6100724    1.608721
fta_t_minus_1 |   .3745097   .5654629     0.66   0.508    -.7337772    1.482797
        fta_t |   .0743812   .5473484     0.14   0.892    -.9984019    1.147164
 fta_t_plus_1 |  -.3461088   .4899957    -0.71   0.480    -1.306483    .6142651
 fta_t_plus_2 |  -.1024637   .4757111    -0.22   0.829     -1.03484    .8299129
        _cons |   15.76353   .9989984    15.78   0.000     13.80553    17.72154
-------------------------------------------------------------------------------

What I would like to do is a coefficient plot of the coefficients of t-2, t-1, t, t+1, t+2. The x-axis should be the event timeline with a pre-event window (t-2, t-1) and a post-event window (t+1, t+2), showing the coefficient with its confidence interval at each event time, as with:

Code:

coefplot, vertical drop(_cons ln_distw contig colony comlang_off comleg fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2)

which gives me this plot:

But what I want is to overlay this plot with the coefficients of the interaction terms: next to the coefficients of t_minus_2, t_minus_1, etc., I want the coefficients of fta_t_minus_2, fta_t_minus_1, etc., such that the x-axis (the event time) applies to both the event time dummies and the interaction terms. Does that make sense?

I appreciate your help!

Best
Noemi

Andrew Musau

Join Date: Oct 2014
Posts: 10190

28 Aug 2024, 07:22

You just need the coefficients to be named consistently. Consider a renaming "trick":

Code:

use "http://fmwww.bc.edu/RePEc/bocode/e/EXAMPLE_TRADE_FTA_DATA" if category=="TOTAL", clear
egen imp = group(isoimp)
egen exp = group(isoexp)
ppmlhdfe trade fta, a(imp#year exp#year imp#exp) cluster(imp#exp)


gen t = (isoimp == "GBR" & year == 1996)
gen t_plus_1 = ( isoimp == "GBR" & year == 2000)
gen t_plus_2 = ( isoimp == "GBR" & year == 2004)
gen t_minus_1 = (isoimp == "GBR" & year == 1992)
gen t_minus_2 = (isoimp == "GBR" & year == 1988)

gen fta_t = (fta == 1 & t == 1)
gen fta_t_plus_1 = (fta == 1 & t_plus_1 == 1)
gen fta_t_plus_2 = (fta == 1 & t_plus_2 == 1)
gen fta_t_minus_1 = (fta == 1 & t_minus_1 == 1)
gen fta_t_minus_2 = (fta == 1 & t_minus_2 == 1)

ppmlhdfe trade ln_distw contig colony comlang_off comleg  t_minus_2 t_minus_1 t t_plus_1 t_plus_2 fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2 ,  cluster(imp#exp)
estimates store m1

*RENAME (t_* does not match t itself, so t and fta_t are swapped separately)
rename (t fta_t) (fta_t t)
rename (t_* fta_t_*) (fta_t_* t_*)
ppmlhdfe trade ln_distw contig colony comlang_off comleg  t_minus_2 t_minus_1 t t_plus_1 t_plus_2 fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2 ,  cluster(imp#exp)
estimates store m2

*REVERSE RENAME
rename (t fta_t) (fta_t t)
rename (t_* fta_t_*) (fta_t_* t_*)
coefplot m1 m2, keep(t*) msy(Oh) msize(1.5) vert recast(connected) ciopts(recast(rcap)) offset(0) leg(order(2 "Wanted 1" 4 "Wanted 2"))

[Image: Graph.png]

Noemi Seng

Join Date: Jan 2024

Posts: 90
#7

28 Aug 2024, 10:05

Dear Andrew Musau

thank you so much, that was exactly what I wanted to do, and it also works with my own regression. One question left: do you know by any chance whether it is possible to make the line that connects the coefficients of m2 dotted while leaving the line for m1 solid? With "lpattern(dot)", I can only make both lines dotted...

Best
Noemi

Andrew Musau

Join Date: Oct 2014
Posts: 10190

28 Aug 2024, 13:22

I have a dashed line instead of a dotted line below. You can change this as you please.

Code:

use "http://fmwww.bc.edu/RePEc/bocode/e/EXAMPLE_TRADE_FTA_DATA" if category=="TOTAL", clear
egen imp = group(isoimp)
egen exp = group(isoexp)
ppmlhdfe trade fta, a(imp#year exp#year imp#exp) cluster(imp#exp)


gen t = (isoimp == "GBR" & year == 1996)
gen t_plus_1 = ( isoimp == "GBR" & year == 2000)
gen t_plus_2 = ( isoimp == "GBR" & year == 2004)
gen t_minus_1 = (isoimp == "GBR" & year == 1992)
gen t_minus_2 = (isoimp == "GBR" & year == 1988)

gen fta_t = (fta == 1 & t == 1)
gen fta_t_plus_1 = (fta == 1 & t_plus_1 == 1)
gen fta_t_plus_2 = (fta == 1 & t_plus_2 == 1)
gen fta_t_minus_1 = (fta == 1 & t_minus_1 == 1)
gen fta_t_minus_2 = (fta == 1 & t_minus_2 == 1)

ppmlhdfe trade ln_distw contig colony comlang_off comleg  t_minus_2 t_minus_1 t t_plus_1 t_plus_2 fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2 ,  cluster(imp#exp)
estimates store m1

*RENAME (t_* does not match t itself, so t and fta_t are swapped separately)
rename (t fta_t) (fta_t t)
rename (t_* fta_t_*) (fta_t_* t_*)
ppmlhdfe trade ln_distw contig colony comlang_off comleg  t_minus_2 t_minus_1 t t_plus_1 t_plus_2 fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2 ,  cluster(imp#exp)
estimates store m2

*REVERSE RENAME
rename (t fta_t) (fta_t t)
rename (t_* fta_t_*) (fta_t_* t_*)

coefplot (m1, keep(t*) msy(Oh) msize(1.5) ciopts(recast(rcap)) recast(connected)) ///
(m2, keep(t*) ms(Sh) msize(1.5) ciopts(lp(dash) recast(rcap)) recast(connected) lp(dash)), ///
vert offset(0) leg(order(2 "Wanted 1" 4 "Wanted 2"))

[Image: Graph.png]

Last edited by Andrew Musau; 28 Aug 2024, 13:34.


Noemi Seng

Join Date: Jan 2024

Posts: 90
#9

03 Sep 2024, 08:31

Dear Andrew Musau ,

awesome, thank you so much. I just need the confidence intervals to be 90% instead of 95%. Do you have an idea how this can be done in this plot?

Best and thank you very much
Noemi
Rich Goldstein

Join Date: Mar 2014

Posts: 4462
#10

03 Sep 2024, 08:33

If you look at

Code:

h coefplot

you will see that one of the options, levels(), is for exactly that purpose.
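
A minimal sketch of that hint, using Stata's auto dataset rather than the data from this thread (the model itself is purely illustrative): levels() sets the confidence level that coefplot draws.

```stata
* Illustrative model on a built-in dataset (not the thread's data)
sysuse auto, clear
regress mpg weight length foreign, robust

* levels(90) draws 90% confidence intervals instead of the default 95%
coefplot, drop(_cons) vertical levels(90) ciopts(recast(rcap))
```

In the two-model call from #8, levels(90) would presumably go among the global options, next to vert and offset(0).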
Noemi Seng

Join Date: Jan 2024

Posts: 90
#11

09 Sep 2024, 05:09

Dear Rich Goldstein, thank you very much for the hint!

Maybe also linking Andrew Musau on this post:

I have a very tricky adjustment to make to the plot. The coefficient of t_minus_1 has to be omitted from the regression since it should serve as the baseline. In the plot, t_minus_1 should remain on the x-axis; however, the coefficient should just be zero, without a confidence interval. It is important that t_minus_1 does not disappear completely from the plot (as it does if I simply omit the term from my regression), because then my timeline on the x-axis would jump from t-2 to t. It is important that for t-1 there is a dot at 0 on the y-axis.

I tried to solve my problem by including a variable in the regression (in the order of regressors, between t_minus_2 and t) which is omitted, hoping that this would appear as a zero coefficient in coefplot. However, the coefficient did not appear in the plot, and there is still the jump from t-2 to t which I had hoped to eliminate.

Does anyone have an idea? Thank you in advance!

Best
Noemi

Andrew Musau

Join Date: Oct 2014
Posts: 10190

#12

09 Sep 2024, 06:48

Code:

use "http://fmwww.bc.edu/RePEc/bocode/e/EXAMPLE_TRADE_FTA_DATA" if category=="TOTAL", clear
egen imp = group(isoimp)
egen exp = group(isoexp)
ppmlhdfe trade fta, a(imp#year exp#year imp#exp) cluster(imp#exp)


gen t = (isoimp == "GBR" & year == 1996)
gen t_plus_1 = ( isoimp == "GBR" & year == 2000)
gen t_plus_2 = ( isoimp == "GBR" & year == 2004)
gen t_minus_1 = (isoimp == "GBR" & year == 1992)
gen t_minus_2 = (isoimp == "GBR" & year == 1988)

gen fta_t = (fta == 1 & t == 1)
gen fta_t_plus_1 = (fta == 1 & t_plus_1 == 1)
gen fta_t_plus_2 = (fta == 1 & t_plus_2 == 1)
gen fta_t_minus_1 = (fta == 1 & t_minus_1 == 1)
gen fta_t_minus_2 = (fta == 1 & t_minus_2 == 1)

ppmlhdfe trade ln_distw contig colony comlang_off comleg  t_minus_2 t_minus_1 t t_plus_1 t_plus_2 fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2 ,  cluster(imp#exp)
estimates store m1

*RENAME (t_* does not match t itself, so t and fta_t are swapped separately)
rename (t fta_t) (fta_t t)
rename (t_* fta_t_*) (fta_t_* t_*)
ppmlhdfe trade ln_distw contig colony comlang_off comleg  t_minus_2 t_minus_1 t t_plus_1 t_plus_2 fta_t_minus_2 fta_t_minus_1 fta_t fta_t_plus_1 fta_t_plus_2 ,  cluster(imp#exp)
estimates store m2

*REVERSE RENAME
rename (t fta_t) (fta_t t)
rename (t_* fta_t_*) (fta_t_* t_*)

coefplot (m1, keep(t*) msy(Oh) msize(1.5) ciopts(recast(rcap)) recast(connected)) ///
(m2, keep(t*) ms(Th) msize(1.5) ciopts(lp(dash) recast(rcap)) recast(connected) lp(dash)), ///
vert offset(0) leg(order(2 "Wanted 1" 4 "Wanted 2")) transform(*t_minus_1= @*0)

[Image: Graph.png]

Noemi Seng

Join Date: Jan 2024

Posts: 90
#13

13 Sep 2024, 10:18

Dear Andrew Musau

thank you for your response. Although the code works, I see that now, when I include t_minus_1 in the regression, one of the variables is omitted by Stata due to collinearity. It automatically omits t_plus_2 in my case. It is really tricky because the reason why I want to force the coefficient of t_minus_1 to be zero is exactly because t-1 should be omitted from the regression to serve as the reference category. But it seems that I need to include it to be able to have it in the coefplot. Do you have any other idea?

Thank you and best wishes
Noemi

Andrew Musau

Join Date: Oct 2014
Posts: 10190

#14

15 Sep 2024, 12:36

You can use the "o." operator to omit a level, even if including the indicators manually. The following illustrates:

Code:

sysuse auto, clear
tab rep78, gen(repair)
regress mpg repair1 repair2 o.repair3 repair4 repair5, robust

Res.:

Code:

. tab rep78, gen(repair)

     Repair |
record 1978 |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |          2        2.90        2.90
          2 |          8       11.59       14.49
          3 |         30       43.48       57.97
          4 |         18       26.09       84.06
          5 |         11       15.94      100.00
------------+-----------------------------------
      Total |         69      100.00

.
. regress mpg repair1 repair2 o.repair3 repair4 repair5, robust

Linear regression                               Number of obs     =         69
                                                F(4, 64)          =       2.72
                                                Prob > F          =     0.0370
                                                R-squared         =     0.2348
                                                Root MSE          =     5.2897

------------------------------------------------------------------------------
             |               Robust
         mpg | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     repair1 |   1.566667   2.333959     0.67   0.504    -3.095953    6.229287
     repair2 |  -.3083333   1.503803    -0.21   0.838    -3.312525    2.695858
     repair3 |          0  (omitted)
     repair4 |   2.233333    1.40478     1.59   0.117    -.5730381    5.039705
     repair5 |   7.930303   2.718488     2.92   0.005     2.499498    13.36111
       _cons |   19.43333   .7718833    25.18   0.000     17.89132    20.97535
------------------------------------------------------------------------------


Noemi Seng

Join Date: Jan 2024

Posts: 90
#15

16 Sep 2024, 00:44

Dear Andrew Musau

okay, thank you, good to know. However, the variable I omit with o. does then disappear from the coefplot. But I still want to include it with a coefficient value of 0...

Best
Noemi
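
One possible way to keep the reference period on the axis, sketched here with the auto data from #14 rather than the trade data: coefplot has an omitted option that includes omitted coefficients in the plot, so a term dropped with the o. operator should be drawn at zero instead of disappearing. Whether this carries over unchanged to the ppmlhdfe models in this thread is an assumption that has not been checked here, and the axis label and the look of the zero point may need adjusting.

```stata
* Same illustrative setup as in #14 (built-in auto data)
sysuse auto, clear
tab rep78, gen(repair)

* repair3 is the reference category, omitted via the o. operator
regress mpg repair1 repair2 o.repair3 repair4 repair5, robust

* omitted asks coefplot to keep omitted coefficients in the plot
coefplot, drop(_cons) omitted vertical recast(connected) ciopts(recast(rcap)) yline(0)
```

If this works as intended, the event-study version would use o.t_minus_1 in the regression and add omitted to the coefplot call, so that no other event-time dummy has to be dropped for collinearity.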