  • Question on heterogeneous effects

    Dear Statalisters,

    I hope I can get your opinion on this topic.

    The literature usually handles heterogeneity of treatment effects in one of two ways: split the sample and run a separate regression for each sub-population, or keep the full sample and interact the treatment variable with the covariate that defines the dimension of heterogeneity.

    I am interested in the second method, but I wonder if it's at all necessary to interact the treatment variable with all covariates, or if it's valid to just do so with one. So:

    reg Y Treatment cov1 cov2 cov3 cov1xTreatment, vce(robust)

    If I go this route, would the same hold for a logit regression with a binary outcome variable?

    Looking forward to your answers,


    Thank you.

  • #2
    I wonder if it's at all necessary to interact the treatment variable with all covariates, or if it's valid to just do so with one.
    It depends.

    Let's back up a minute and look at what happens when you split the sample and run two regressions. The effects of the covariates will be estimated separately for each of the subsamples, and those results will almost always be different in the subsamples.

    If you do an interaction model which includes interactions of the subsample-indicator variable with all of the covariates, then you are faithfully emulating the split-sample process. The interaction terms will provide separate estimates of the covariates' effects in each subsample, and, in fact, if you calculate them out from the results, those effect estimates will be identical (except perhaps for minuscule rounding errors) to those you would get in a split sample analysis. (The standard errors, however, will be different because you get a boosted sample size in the interaction approach. But if there is no secondary research aim of testing hypotheses about the effects of the covariates, then this is of no importance.)
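
    A minimal Stata sketch of that equivalence, using simulated data. Every variable name (y, treatment, group, cov2, cov3) and every coefficient in the data-generating line is made up for illustration and is not taken from the original question:

    ```stata
    * Simulated data: all names and effect sizes are hypothetical.
    clear
    set seed 2718
    set obs 2000
    generate byte group     = runiform() < 0.5   // subsample indicator
    generate byte treatment = runiform() < 0.5   // randomized treatment
    generate cov2 = rnormal()
    generate cov3 = rnormal()
    generate y = 1 + 0.5*treatment + 0.4*group*treatment ///
        + 0.2*cov2 + 0.6*group*cov2 - 0.1*cov3 + rnormal()

    * Split-sample approach: one regression per subsample.
    regress y i.treatment cov2 cov3 if group == 0
    regress y i.treatment cov2 cov3 if group == 1

    * Fully interacted model: group is interacted with treatment
    * and with every covariate.
    regress y i.group##(i.treatment c.cov2 c.cov3)

    * Treatment effect in each subsample, recovered from the interacted model.
    lincom 1.treatment                         // group == 0
    lincom 1.treatment + 1.group#1.treatment   // group == 1
    ```

    The point estimates from the two lincom lines should match the treatment coefficients from the two split-sample regressions; the standard errors need not.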

    If you do an interaction model which only interacts the treatment variable with the variable that identifies the subsamples, then each covariate effect is going to get only one estimate, a pooled estimate based on the entire sample. It is equivalent to doing separate regressions but imposing a constraint that the coefficients of the covariates must be equal across the subsamples. And, of course, this will in turn influence the estimates of the treatment variable's effect to a greater or lesser extent, depending on how strong the correlations between the covariates and the treatment and subsample-identifying variables are.
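
    As a rough sketch of this restricted specification, again on simulated data with hypothetical variable names (y, treatment, group, cov2, cov3) and made-up effect sizes:

    ```stata
    * Simulated data: all names and effect sizes are hypothetical.
    clear
    set seed 2718
    set obs 2000
    generate byte group     = runiform() < 0.5   // subsample indicator
    generate byte treatment = runiform() < 0.5   // randomized treatment
    generate cov2 = rnormal()
    generate cov3 = rnormal()
    generate y = 1 + 0.5*treatment + 0.4*group*treatment ///
        + 0.2*cov2 + 0.6*group*cov2 - 0.1*cov3 + rnormal()

    * Only the treatment is interacted with the subsample indicator;
    * cov2 and cov3 each get a single pooled coefficient.
    regress y i.group##i.treatment cov2 cov3, vce(robust)

    * Treatment effect in each subsample under the pooled-covariate constraint.
    margins group, dydx(treatment)

    * Test of whether the treatment effect differs between the subsamples.
    test 1.group#1.treatment
    ```

    In this simulation the effect of cov2 was deliberately made to differ by group, so the single cov2 coefficient reported here is a pooled compromise between the two subsample values.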

    So the answer to your question depends on a number of pragmatic considerations:
    • Is there any reason to believe that the covariate effects differ across the subsamples? If you think that, apart from sampling variation, they don't, then there is no need to interact with the covariates.
    • If you do believe that the covariate effects differ across the subsamples, but all the covariates are only weakly associated with the treatment variable (e.g. if treatment was randomly assigned), then it may not matter which way you do it: the treatment effect estimates will be similar either way.
    • If you do believe that the covariate effects differ substantially across the subsamples and some of them are strongly associated with the treatment or subgroup variables, then it could matter greatly, and the estimates based on interactions that include the covariates would be more valid. But if there are a large number of covariate degrees of freedom, the number of interaction terms might be more than you can reasonably include in a regression analysis with the sample size available to you. In that case, you need a bigger data set to accommodate them.
    • A "hybrid" approach is possible in which you interact the subsample-indicator with some, but not all, of the covariates. You would interact it with those covariates that are most likely to have different effects in the subgroups and are more than trivially correlated with the treatment variable.
    And, no, it doesn't matter which type of regression we're talking about. The considerations are the same regardless.
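
    To make the hybrid option and the logit case concrete, here is a hedged sketch on simulated data. The variable names (ybin, treatment, group, cov2, cov3), the choice of cov2 as the one covariate to interact, and all effect sizes are assumptions made for illustration only:

    ```stata
    * Simulated binary outcome: all names and effect sizes are hypothetical.
    clear
    set seed 2718
    set obs 2000
    generate byte group     = runiform() < 0.5   // subsample indicator
    generate byte treatment = runiform() < 0.5   // randomized treatment
    generate cov2 = rnormal()
    generate cov3 = rnormal()
    generate xb = -0.5 + 0.5*treatment + 0.4*group*treatment ///
        + 0.2*cov2 + 0.6*group*cov2 - 0.1*cov3
    generate byte ybin = runiform() < invlogit(xb)

    * Hybrid specification: group is interacted with treatment and cov2 only;
    * cov3 keeps a single pooled coefficient.
    logit ybin i.group##(i.treatment c.cov2) c.cov3, vce(robust)

    * The logit coefficients are on the log-odds scale; margins reports the
    * treatment effect on the probability scale within each subsample.
    margins group, dydx(treatment)
    ```

    Swapping logit for regress (with a continuous outcome) leaves the right-hand side of the command unchanged, which is the sense in which the specification choice does not depend on the type of regression.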

    • #3
      Clyde,

      Thank you for such a thorough answer.

      I will add some details of the research that would be helpful in deciding which approach to follow:
      1. The treatment was randomized, and specifically there was stratified randomization based on the covariates that are part of the regressions
      2. Although the ITT sample is large, the number of people who actually responded to treatment is quite low, and I do have concerns regarding the number of interaction terms that I can reasonably include in a regression analysis with the sample size of completers.
      3. From a theoretical standpoint, one covariate is of more interest than the rest. The others seem important to control for, but there is not necessarily a reason to think that treatment effects vary across them.
      Hope this helps,
      Thanks!

      • #4
        Based on what you say in #3, it sounds like interacting the treatment variable with that one covariate of interest, but not with the other covariates, would be a sensible approach.
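
        In Stata's factor-variable notation, that model might look like the sketch below. The names Y, Treatment, cov1, cov2 and cov3 come from #1; the simulated data, the assumption that Treatment is coded 0/1, and the assumption that cov1 is continuous are mine (use i.cov1 instead of c.cov1 if it is categorical):

        ```stata
        * Simulated stand-in for the real data: effect sizes are hypothetical.
        clear
        set seed 2718
        set obs 2000
        generate byte Treatment = runiform() < 0.5   // assumed 0/1 coding
        generate cov1 = rnormal()                    // assumed continuous
        generate cov2 = rnormal()
        generate cov3 = rnormal()
        generate Y = 1 + 0.5*Treatment + 0.3*cov1 + 0.4*Treatment*cov1 ///
            + 0.2*cov2 - 0.1*cov3 + rnormal()

        * Treatment interacted with cov1 only; cov2 and cov3 are plain controls.
        regress Y i.Treatment##c.cov1 cov2 cov3, vce(robust)

        * Treatment effect evaluated at a few illustrative values of cov1.
        margins, dydx(Treatment) at(cov1 = (-1 0 1))
        ```

        Letting Stata build the interaction with ## instead of a hand-made cov1xTreatment variable is what allows margins to work out the treatment effect at each value of cov1.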
