  • Collinearity when estimating triple differences

    Dear all,

    I would like to ask whether you can help me resolve a collinearity problem I encountered when estimating a difference-in-differences-in-differences model.

    I would like to see the impact of legislation on companies (I have hundreds of thousands of them over 10 years, in an unbalanced panel). Besides the companies affected, I also have two control groups: companies that were not affected because they belong to another industry (control group 1), and companies in the same industry but in other countries (control group 2).

    I have specified the dummies:

    *all companies from NACE 4 industry (services) = 1
    gen industry = 1 if substr(nace1,1,1) == "4" | substr(nace2,1,1) == "4"
    *all observations after year of implementation = 1
    gen imp = 1 if year >= 2009
    *all companies within countries affected = 1
    gen country = 1 if eu == 1

    mvencode industry imp country, mv(0)


    Therefore 111 stands for a company within the industry and country after treatment, and everything else is either an observation before treatment, a company outside the country (the EU, to be specific), or a company outside the industry.

    After that I tried to see the impact on total factor productivity (dtfp), excluding extremes (2 percent from both sides of the sample):

    qui sum dtfp, detail
    xtreg dtfp imp##industry##country i.year if ( dtfp > r(p1) & dtfp < r(p99) ), fe cluster(company)


    After that, when I checked for collinearity, I discovered that the correlation between the triple-difference term and one of the double-difference terms is 0.98, which can lead to biased estimates of the triple difference. That term is my variable of interest, so it cannot be excluded. Also, all double and single differences are needed for the triple difference to be properly specified.

    I have tried specifying the regression as:

    xtreg dtfp imp#industry#country industry#country imp#country imp#industry i.year if ( dtfp > r(p1) & dtfp < r(p99) ), fe cluster(company)

    in order to exclude the single differences that are already captured by the year dummies (i.year) and the company fixed effects, but the problem stays the same.

    Can anyone please point me in the direction I should look to deal with this problem? Thank you very much.

    Best regards,

    Vojtech

  • #2
    First, collinearity does not lead to bias. You might look at the treatment of collinearity in Art Goldberger's econometrics text. Collinearity can give you very high standard errors (but they are correct standard errors, reflecting the difficulty of estimating the parameters), and it can make the estimates particularly sensitive to a small number of observations.
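
    A small simulation can make this concrete. The sketch below is not the model from post #1: it uses two made-up regressors, x1 and x2, built to correlate at about 0.98, with true coefficients of 1 (the program name collsim and all values are assumptions for illustration). Across replications the estimate of the x1 coefficient should centre on 1, while its reported standard error is large and should roughly match the spread of the estimates.

```stata
clear all
set seed 2024

* Hypothetical helper: one replication with two highly correlated regressors
program define collsim, rclass
    drop _all
    set obs 200
    gen x1 = rnormal()
    * x2 is constructed so that corr(x1, x2) is about 0.98
    gen x2 = 0.98*x1 + sqrt(1 - 0.98^2)*rnormal()
    * True coefficients on x1 and x2 are both 1
    gen y  = x1 + x2 + rnormal()
    regress y x1 x2
    return scalar b1  = _b[x1]
    return scalar se1 = _se[x1]
end

* Repeat 500 times, keeping the x1 estimate and its standard error
simulate b1=r(b1) se1=r(se1), reps(500) nodots: collsim

* Compare the mean of b1 with 1, and the sd of b1 with the mean of se1
summarize b1 se1
```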

    With all the interactions being dummies, you are essentially estimating dummies for each of the conditions created by the intersection of the three variables. The real problem is that you may not have enough observations in any given category to get precise estimates. You don't interpret anything except the full effects (i.e., the effect for each of the final categories created by the three way interaction).

    That is, you can interpret the following:
    var1  var2  var3
    0     0     0
    1     0     0
    0     1     0
    0     0     1
    1     1     0
    ...
    1     1     1
    [Actually, one of these must be dropped to avoid collinearity with the intercept, but Stata does this automatically.]
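
    To check whether each cell has enough observations, one option is to tabulate the eight cells directly. The sketch below runs on simulated data so that it is self-contained. The variable names follow post #1, but the sample size, cell probabilities and effect size are invented, and the variable cell is a hypothetical helper.

```stata
clear
set seed 12345
set obs 8000

* Simulated stand-ins for the three dummies in post #1 (values are made up)
gen byte imp      = runiform() < 0.5
gen byte industry = runiform() < 0.3
gen byte country  = runiform() < 0.4
gen double dtfp   = 0.5*imp*industry*country + rnormal()

* One identifier per imp x industry x country combination
egen cell = group(imp industry country), label

* Observations, mean and standard deviation of the outcome in each cell
tabstat dtfp, by(cell) statistics(n mean sd)
```

    A cell with very few observations would be the first place to look when a standard error is large.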

    You can't naively interpret the coefficients on any of the first-order terms or two-way interactions as general propensities - these are only the effects when the higher-level terms are zero. You might also look at Friedrich, In Defense of Multiplicative Terms... American J of Political Science 1982 - it has a nice discussion of interactions.
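
    As an illustration of that point, here is a minimal sketch on simulated data, using pooled OLS rather than the fixed-effects model from post #1. The variable names follow post #1 and the data-generating values are assumptions. The coefficient on imp alone is the effect of imp only where industry and country are both zero; the effect of imp in the industry = 1, country = 1 cell is the sum of every coefficient that involves imp.

```stata
clear
set seed 12345
set obs 8000

* Simulated data; all effect sizes are made up
gen byte imp      = runiform() < 0.5
gen byte industry = runiform() < 0.5
gen byte country  = runiform() < 0.5
gen double dtfp   = 0.2*imp + 0.5*imp*industry*country + rnormal()

* Fully interacted (saturated) model
regress dtfp i.imp##i.industry##i.country

* Effect of imp where industry = 0 and country = 0
lincom 1.imp

* Effect of imp where industry = 1 and country = 1
lincom 1.imp + 1.imp#1.industry + 1.imp#1.country + 1.imp#1.industry#1.country
```

    In a fixed-effects specification, any term absorbed by the company effects or the year dummies would drop out of such a sum, so the exact set of terms to add depends on which coefficients the model actually reports.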
