
  • Inclusion of covariates in non-linear DiD estimation

    I have two questions concerning the inclusion of control variables in non-linear difference-in-differences models. For example: I want to estimate the impact of a (non-experimental) training programme for adult workers on the likelihood of being employed. The two groups differ in their level of education. As skill demands in the labour market might change asymmetrically over time between skill levels, these differences might cause diverging trends even in the absence of the treatment, thereby violating the common trends assumption. I therefore want to allow for diverging trends w.r.t. level of education.

    Jeff Wooldridge has recently suggested new approaches to non-linear DiD (https://doi.org/10.1093/ectj/utad016). They are based on an imputation approach, although estimation is also possible via pooled estimation, with or without covariates. With covariates, the pooled estimation includes a full set of interactions: between the time-varying treatment variable (treat_dyn) and the demeaned covariates (educ_dm), between the time-invariant treatment group dummy (treat) and the covariates, and between the post-treatment dummy (after) and the covariates. I can replicate the equivalence between the pooled estimation and the imputation approach for the models both with and without covariates.
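
    For reference, the pooled specification with a single covariate can be sketched as the index model below. The notation is introduced here for illustration and is not taken from the paper: w_i stands for treat, f_t for after, D_it = w_i f_t for treat_dyn, x_i for educ_dm, and \Lambda for the logistic cdf. The second line is the average effect on the treated in the post period, assuming it is computed as the average difference in predicted probabilities with D switched on and off.

    ```latex
    \[
    \Pr(y_{it}=1 \mid w_i, f_t, x_i)
      = \Lambda\bigl(\alpha + \beta w_i + \gamma f_t + \kappa x_i
        + \xi\, w_i x_i + \pi\, f_t x_i
        + \tau D_{it} + \delta\, D_{it} x_i \bigr)
    \]
    \[
    \widehat{ATT}
      = \frac{1}{N_1}\sum_{i:\,w_i=1}
        \Bigl[
          \Lambda\bigl(\hat\alpha+\hat\beta+\hat\gamma
            +(\hat\kappa+\hat\xi+\hat\pi)x_i+\hat\tau+\hat\delta x_i\bigr)
          -\Lambda\bigl(\hat\alpha+\hat\beta+\hat\gamma
            +(\hat\kappa+\hat\xi+\hat\pi)x_i\bigr)
        \Bigr]
    \]
    ```

    Here N_1 is the number of treated units. The model without covariates corresponds to dropping every term that involves x_i.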

    Now my questions: in the dataset I am actually using, I have a rather small treatment group (about 60 persons). I would therefore prefer a more parsimonious model that only includes an interaction between education and time. If I include the full set of interactions, the coefficients do not change but the standard errors get very large. However, I cannot replicate the equivalence between the imputation approach and the pooled estimation when I only include the time x education interaction (though the differences are minor with only one covariate). Therefore, my questions are:
    1) does it make sense at all in the non-linear context to only include the education x time interaction, and
    2) how could one replicate the pooled estimation with the imputation approach.

    The following example illustrates my point. I use the LaLonde dataset that is shipped with the ebalance package (I cannot use my own data and results for data protection reasons). I first replicate the equivalence between pooled estimation and imputation without and with covariates (full set of interactions). I then do the same with only the time interaction, where the equivalence breaks down. I would be very happy if someone could help!

    Code:
    ssc install ebalance, replace
    use cps1re74.dta, clear
    
    *Prepare data: reshape to long, generate the after dummy, the de-meaned covariate and the binary outcome
    gen id     = _n
    reshape long re, i(id) j(year)
    gen     after = year == 78
    gen     treat_dyn = treat*after
    gen     employed = re > 0 & re !=.
    sum    educ 
    gen     educ_dm = educ - `r(mean)'
    xtset     id year
    
    *a) without covariates
    logit         employed i.treat_dyn treat after, vce(cluster id)
    margins, dydx(treat_dyn) at(after == 1) subpop(if treat == 1) noestimcheck vce(uncond) 
    scalar     logit_pooled_nocovs = r(table)[1,2]
    
    logit       employed treat after if treat_dyn==0, vce(cluster id) 
    predict  employed_hat if treat_dyn == 1
    gen       treat_ind = employed - employed_hat
    sum      treat_ind 
    scalar   logit_imputation_no_covs = r(mean)
    
    scalar     list logit_pooled_nocovs logit_imputation_no_covs //result:  identical
    
    *b) with covariates - full set of interactions
    logit         employed i.treat_dyn i.treat_dyn#c.educ_dm educ_dm treat c.treat#c.educ_dm after c.after#c.educ_dm, vce(cluster id)
    margins, dydx(treat_dyn) at(after == 1) subpop(if treat == 1) noestimcheck vce(uncond) 
    scalar      logit_pooled_covs = r(table)[1,2]
    
    logit         employed treat after educ_dm c.treat#c.educ_dm c.after#c.educ_dm if treat_dyn==0, vce(cluster id) 
    predict    employed_hat1 if treat_dyn == 1
    gen         treat_ind1 = employed - employed_hat1
    sum        treat_ind1
    scalar     logit_imputation_covs = r(mean)
    
    scalar      list logit_pooled_covs logit_imputation_covs //result:  identical
    
    *c) only interacted with time
    logit         employed i.treat_dyn educ_dm treat after c.after#(c.educ_dm), vce(cluster id)
    margins, dydx(treat_dyn) at(after == 1) subpop(if treat == 1) noestimcheck vce(uncond) 
    scalar     logit_pooled_covs_time_int = r(table)[1,2]
    
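    *Cluster on id as in a) and b); this only affects standard errors, not the point estimates
    logit        employed treat after educ_dm c.after#c.educ_dm if treat_dyn==0, vce(cluster id)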
    predict   employed_hat2 if treat_dyn==1, pr
    gen        treat_ind2 = employed - employed_hat2    if treat_dyn == 1
    sum        treat_ind2
    scalar    logit_imputation_covs_time_int = r(mean)
    
    scalar     list logit_pooled_covs_time_int logit_imputation_covs_time_int //result: not identical

  • #2
    I added some interactions in the last block (c).

    Code:
    *c) only interacted with time
    
    logit  employed i.treat_dyn educ_dm treat after c.after#c.educ_dm c.treat#c.educ_dm i.treat_dyn#c.educ_dm, vce(cluster id)
    margins, dydx(treat_dyn) at(after == 1) subpop(if treat == 1) noestimcheck vce(uncond) 
    scalar     logit_pooled_covs_time_int = r(table)[1,2]
    
    logit  employed treat after educ_dm c.after#c.educ_dm c.treat#c.educ_dm  if treat_dyn==0, robust
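    *Drop earlier versions one at a time: a single -drop- with a pattern that matches nothing (tre*ind_2) fails as a whole and drops nothing
    capture drop employed_hat2
    capture drop treat_ind2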
    predict   employed_hat2 if treat_dyn==1, pr
    gen   treat_ind2 = employed - employed_hat2    if treat_dyn == 1
    sum   treat_ind2
    scalar logit_imputation_covs_time_int = r(mean)
    
    scalar     list logit_pooled_covs_time_int logit_imputation_covs_time_int //result: not identical


    • #3
      Thank you very much for your help, George!

      But if I see it correctly, your suggestion is now identical to my version b), the full set of interactions.

      What I am aiming at is to only include an interaction between education and time. I know that this is possible in the linear case: both the jwdid and xthdidregress commands have such an option. However, I am not sure whether it is equally possible in the non-linear case (at least, it does not seem to be identical to the imputation approach, which is why I am a bit unsure).
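
      One way to see why the two estimates can differ in version c) is through the first-order conditions of the pooled logit. The following is only a sketch under the assumption that both logits have a unique maximum; the notation is introduced here for illustration: D_it is treat_dyn, x_i is educ_dm, z_it is any regressor in the model, and \hat p_it is the fitted probability.

      ```latex
      % Score equations of the pooled logit, one per regressor z:
      \[
      \sum_{i,t} z_{it}\,\bigl(y_{it}-\hat p_{it}\bigr) = 0 .
      \]
      % If both D and D*x are included, the treated post-period residuals satisfy
      \[
      \sum_{D_{it}=1}\bigl(y_{it}-\hat p_{it}\bigr)=0,
      \qquad
      \sum_{D_{it}=1} x_i\,\bigl(y_{it}-\hat p_{it}\bigr)=0 .
      \]
      % On D = 1 observations every other regressor equals 1 or x_i, so for all z
      \[
      \sum_{D_{it}=0} z_{it}\,\bigl(y_{it}-\hat p_{it}\bigr)=0 ,
      \]
      % which are the score equations of the logit fitted on D = 0 only.
      % If D*x is omitted, as in c), the second condition is not imposed, and
      \[
      \sum_{D_{it}=0} x_i\,\bigl(y_{it}-\hat p_{it}\bigr)
        = -\sum_{D_{it}=1} x_i\,\bigl(y_{it}-\hat p_{it}\bigr)
      \]
      % is generally nonzero, so the control coefficients differ from the imputation fit.
      ```

      If this argument is right, the equivalence hinges on the treat_dyn x educ_dm term rather than on the treat x educ_dm term: a pooled model that keeps i.treat_dyn#c.educ_dm but drops c.treat#c.educ_dm would be expected to match the imputation step in c). I have not checked this on the data.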


      • #4
        Why do you want to modify the procedure by using fewer interactions? They are there for good reason.


        • #5
          I do see the merit of the full set of interactions, as they allow the covariates to affect the two groups differently and the treatment effect to vary with the covariates. However, in the dataset I am actually using (which I cannot post here for data protection reasons), the treatment group is very small (about 60 persons). With all interactions included, the confidence intervals become so wide that the estimates are useless. I would therefore prefer a more parsimonious specification.
