  • How to plot differences in means for diff-in-diff common trend (from Bleakley & Chin (2004))

    Dear Stata community,

    This is my first post on the forum. I'm struggling to recreate figure 1 from Bleakley & Chin (2004) [see attached picture].

    I have managed to plot panel A with the raw data (not regression-adjusted), which I think is fine for my project. However, it would be interesting to see the regression-adjusted results as well. I have tried the following code:
    Code:
    reg var1 var2 var3 var4, r
    
    * predict once for all observations: a second -predict- into var1_hat
    * would fail because the variable already exists
    predict var1_hat, xb
    
    twoway (lowess var1_hat var2 if treatment == 1) (lowess var1_hat var2 if treatment == 0)
    However, the results look weird (both lines are almost on top of each other - maybe I have over-regression-adjusted?).

    More importantly: can someone point me in the right direction on how to approach recreating panel B? I'm really not sure how to get started with that one. Should I just calculate the difference in means between the two groups manually? And if so, how do I graph confidence intervals around these differences (important because I suspect the differences might not be statistically significant)?

    Unfortunately I can't show my data or output because it is confidential and on a remote server. However, I hope I have provided enough information for someone to help me in the right direction.



    [Attached image: Skærmbillede 2020-01-05 kl. 11.58.34.png]

  • #2
    Hi,
    Here is a rough, off-the-top-of-my-head suggestion for how you could solve this, although it might not be the most parsimonious approach:
    I would start by running this regression separately for each age-at-arrival group: Y = b0 + b1*1[treatment] + b2*Age + b3*1[race] + b4*1[Hispanic] + b5*1[Female] + e if group==x
    Each estimate of b1 would correspond to one of the solid black dots in panel B, since it captures the difference in means due to the treatment after accounting for the other covariates.
    You should extract each of the resulting b1s and their confidence intervals from the Stata output. Here is a quick univariate example of how to achieve this:

    Code:
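    sysuse lifeexp, clear
    * empty variables to hold the estimate and its confidence bounds
    gen betax=.
    gen rbetax=.
    gen lbetax=.
    reg lexp gnppc
    * e(b) is the coefficient vector; column 1 is gnppc
    matrix define B=e(b)
    replace betax=matrix(B[1,1]) in 1
    * rows 5 and 6 of r(table) are the lower and upper confidence limits
    matrix C=r(table)
    replace rbetax=matrix(C[5,1]) in 1
    replace lbetax=matrix(C[6,1]) in 1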
    In a multivariate regression, you need to check where b1 and its confidence limits sit. Run matrix list on both e(b) and r(table) to see which column holds the coefficient you are trying to capture. You should also loop over the "in 1" at the end of each "replace ... in 1" so that it runs from 1 to 17, filling one row per regression.
    In summary, this creates three variables, betax, rbetax and lbetax, which hold the point estimate and the lower and upper bounds of its confidence interval. After this you can plot all three variables, and you should essentially get panel B.
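
    A minimal sketch of that loop and the final plot, under stated assumptions: it uses the built-in nlsw88 data as a stand-in for the confidential data, with wage as the outcome, collgrad as a hypothetical treatment dummy, age as the grouping variable, and ttl_exp and race as illustrative covariates. None of these come from the original question. Instead of reading positions in r(table), it picks the coefficient by name with _b[] and _se[] and rebuilds the 95% interval with invttail(), which should match the limits regress reports.

    ```stata
    sysuse nlsw88, clear

    * one row per group: group value, point estimate, lower and upper bound
    gen group_id = .
    gen betax  = .
    gen rbetax = .   // lower bound, same naming as the example above
    gen lbetax = .   // upper bound

    levelsof age, local(groups)
    local i = 0
    foreach g of local groups {
        * separate regression for each group; skip groups where it fails
        capture quietly regress wage collgrad ttl_exp i.race if age == `g'
        if _rc != 0 {
            continue
        }
        local ++i
        * critical t value for a two-sided 95% interval
        local tcrit = invttail(e(df_r), 0.025)
        quietly replace group_id = `g' in `i'
        quietly replace betax  = _b[collgrad] in `i'
        quietly replace rbetax = _b[collgrad] - `tcrit' * _se[collgrad] in `i'
        quietly replace lbetax = _b[collgrad] + `tcrit' * _se[collgrad] in `i'
    }

    * capped spikes for the intervals, dots for the point estimates
    twoway (rcap rbetax lbetax group_id) (scatter betax group_id), ///
        yline(0) legend(off) ///
        xtitle("Group") ytitle("Adjusted difference in means")
    ```

    With the real data, swap in the actual outcome, treatment, covariates and age-at-arrival variable. Intervals that cross the yline(0) reference line are the groups where the adjusted difference is not statistically distinguishable from zero at the 5% level.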


    Please let me know if any of this is helpful in any way.
    Last edited by Bruno Jimenez; 07 Jan 2020, 16:20.
