Hi Statalisters
I am evaluating a training program for GPs intended to reduce the suicide rate, and I have some difference-in-differences results that seem counterintuitive. Descriptive statistics suggest a reduction in the suicide rate in the intervention regions, but the DD results show no statistically significant effect. Since this looks like a null finding, I want to be completely sure that it is based on a well-specified model. I've been working on the model for some time and can't find any misspecification other than, potentially, how the standard errors are handled. I would be very grateful for input on the specification of the SEs and on whether there are other possible misspecifications.
Here's info on the model: I follow six country regions from 2012 to 2017, of which two are control regions ("intervention" = 0) and four are intervention regions ("intervention" = 1). The intervention was rolled out between January 2016 and July 2016, so the pre-post variable ("pre_post_all") is defined as 0 for months before July 2016 and 1 for months following July 2016. The main outcome is the monthly suicide rate per 100 000. I have checked the common trend assumption visually and formally with a test following the post by Ricardo Cavalho here. I've included fixed effects, a month variable to control for time, and standard errors clustered at the region level.
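To make the coding of the pre-post indicator concrete, here is a minimal sketch of how such a variable can be generated. It assumes that monthly_date is a Stata monthly date (%tm) and that July 2016 itself counts as post-period; the single toy region below is made up purely so the block runs on its own.

```stata
* Sketch only: one toy region observed monthly from 2012m1 to 2017m12
clear
set obs 72
gen monthly_date = tm(2012m1) + _n - 1
format %tm monthly_date

* Assumption: July 2016 onward is coded as post-intervention
gen byte pre_post_all = monthly_date >= tm(2016m7) if !missing(monthly_date)

tabulate pre_post_all
```

Under that assumption the 72 months split into 54 pre-period and 18 post-period months per region.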
Here are some descriptive statistics:
Yearly suicide rate per 100 000 | Control region | Intervention region |
2012 | 11.39 | 11.02 |
2013 | 10.75 | 8.94 |
2014 | 11.03 | 7.29 |
2015 | 8.94 | 8.89 |
2016 | 7.81 | 8.20 |
2017 | 10.70 | 6.27 |
How the panel data are set up, and the model:
Code:
xtset region monthly_date
xtreg suiciderate i.intervention##i.pre_post_all monthly_date, fe cluster(region)
And the output:
Code:
Fixed-effects (within) regression               Number of obs     =        432
Group variable: region                          Number of groups  =          6

R-sq:                                           Obs per group:
     within  = 0.0592                                         min =         72
     between = 0.1034                                         avg =       72.0
     overall = 0.0565                                         max =         72

                                                F(3,5)            =      16.75
corr(u_i, Xb)  = 0.0311                         Prob > F          =     0.0049

                                         (Std. Err. adjusted for 6 clusters in region)
-------------------------------------------------------------------------------------------
                          |               Robust
              suiciderate |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------------------+----------------------------------------------------------------
             intervention |
            Intervention  |          0  (omitted)
           1.pre_post_all |   .1059561   .2775733     0.38   0.718    -.6075687    .8194809
                          |
intervention#pre_post_all |
          Intervention#1  |  -.0922136   .3400909    -0.27   0.797    -.9664452     .782018
                          |
             monthly_date |  -.0048216   .0016342    -2.95   0.032    -.0090225   -.0006206
                    _cons |   3.938659   1.092318     3.61   0.015     1.130765    6.746552
--------------------------+----------------------------------------------------------------
                  sigma_u |   .1652054
                  sigma_e |  .35753605
                      rho |  .17594101   (fraction of variance due to u_i)
-------------------------------------------------------------------------------------------
Following Wing et al. (2018), "Designing Difference in Difference Studies: Best Practices for Public Health Policy Research", who point out that studies with a small number of clusters should account for this, I've tried a cluster bootstrap approach to the standard errors:
Code:
xtreg suiciderate i.intervention##i.pre_post_all monthly_date, fe vce(bootstrap) cluster(region)
I've also tried different follow-up periods to examine whether there is a time-limited effect that vanishes 6 or 12 months after the intervention, e.g.:
Code:
// Post 12 m
gen byte pre_post12 = monthly_date > tm(2016m6) & monthly_date < tm(2017m6) if !missing(monthly_date)
Here's a data example:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(suiciderate intervention) long region float(monthly_date year)
 .4736469 0 1 625 2012
 .3724667 0 1 673 2016
1.1757903 1 4 652 2014
 .2885867 1 5 684 2017
 .9631134 1 4 642 2013
 .9401236 1 5 639 2013
 .6916414 1 4 648 2014
  .376433 1 6 648 2014
 1.990496 1 2 638 2013
 .4268652 1 6 685 2017
 .3213006 1 2 684 2017
 .7600408 1 5 663 2015
  .635285 1 2 663 2015
.19118175 1 5 673 2016
 .2885867 1 5 695 2017
end
format %tmMCY monthly_date
format %ty year
label values intervention intervention
label def intervention 0 "Control", modify
label def intervention 1 "Intervention", modify
label values region region
label def region 1 "North central region", modify
label def region 2 "North east region", modify
label def region 4 "South central region", modify
label def region 5 "South east region", modify
label def region 6 "South west region", modify
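As a rough cross-check on the descriptive statistics, the yearly table implies a crude 2x2 difference-in-differences. This is only a back-of-the-envelope sketch: it assumes 2012-2015 as the pre-period and 2017 as the post-period, leaves out the roll-out year 2016, and uses unweighted means of the yearly rates, so it is not directly comparable to the monthly model. The scalar names are purely illustrative.

```stata
* Pre-period (2012-2015): unweighted means of the yearly rates from the table
scalar pre_control = (11.39 + 10.75 + 11.03 + 8.94) / 4
scalar pre_interv  = (11.02 +  8.94 +  7.29 + 8.89) / 4

* Post-period: 2017 only (2016 is left out because the roll-out happened that year)
scalar post_control = 10.70
scalar post_interv  = 6.27

* Crude DiD: change in intervention regions minus change in control regions
scalar did = (scalar(post_interv) - scalar(pre_interv)) ///
           - (scalar(post_control) - scalar(pre_control))
display "Crude DiD (yearly rate per 100 000): " %6.2f scalar(did)
```

With these inputs the crude estimate is roughly -2.9 per 100 000 per year, which is the kind of descriptive gap that made the null DD result look counterintuitive to me.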
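For anyone who wants to experiment with the cluster bootstrap without my data, here is a minimal sketch on a simulated panel of the same shape (6 regions, 72 months). Everything about the data is made up, including which regions are treated and the absence of any true effect. The sketch also rests on two assumptions of mine: that declaring only the panel variable with xtset is enough for vce(bootstrap) to resample whole regions, and that 200 replications with a fixed seed are adequate for illustration.

```stata
* Sketch only: simulated panel with the same shape as the real data
clear
set seed 2018
set obs 6
gen region = _n
gen byte intervention = region > 2          // hypothetical: regions 3-6 treated
expand 72
bysort region: gen monthly_date = tm(2012m1) + _n - 1
format %tm monthly_date
gen byte pre_post_all = monthly_date >= tm(2016m7)
gen suiciderate = 1 + rnormal(0, 0.35)      // made-up outcome, no true effect

* Declare only the panel variable so that whole regions are resampled
xtset region
xtreg suiciderate i.intervention##i.pre_post_all monthly_date, fe ///
    vce(bootstrap, reps(200) seed(12345))
```

With only six clusters, some bootstrap samples will contain intervention regions only, in which case the interaction term is not identified in that replication. That is part of why I am unsure how far to trust the bootstrapped standard errors.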