
  • Collinearity in variable of interest - DID panel data

    Hello,

    I am having a problem with Stata omitting my variable of interest in my difference-in-differences analysis. I am analysing the effect of a U.S. manufacturing protectionist policy change on manufacturing employment growth in each U.S. state. The policy change was introduced in 2009. I have panel data on 51 U.S. states and their employment growth over the period 2001-2016. To proxy for the policy change, I created a dummy variable for Year>2008. Treatment and control states are distinguished by their level of manufacturing exposure in 2008, measured as the fraction of manufacturing GDP in total state GDP. Based on the distribution of exposure across states, each state is classified as having marginal (1), low (2), medium (3) or high (4) exposure. States with marginal exposure (almost 0% manufacturing exposure) form the control group, and states with low, medium or high exposure form the treatment group.

    I applied the following code:
    Code:
    *generate post treatment period variable:
    gen posttreatment=1 if Year>2008
    
    *generate treatment group variable:
    gen treatmentgroup=1 if exposure>1
    
    *generate interaction term
    gen interaction = treatmentgroup*posttreatment


    I then ran the following regression (without controls):
    Code:
     xtreg employmentgrowth posttreatment treatmentgroup interaction, fe vce(robust)


    I also tried the factor-variable syntax recommended on Statalist:
    Code:
    xtreg employmentgrowth i.treatmentgroup##i.posttreatment, fe vce(robust)


    The problem is that Stata keeps omitting all of the dummy variables because of collinearity. I also tried random effects, but that did not work either.

    Does anyone know what I can do? I am really stressed out since I am not getting any results now except for a constant.

    Are there any solutions to this? What should I change to be able to analyse the effect of the policy change?

    Thank you in advance for the answer!
    Last edited by Esmee de Bruin; 12 Apr 2018, 09:07.

  • #2
    Code:
    *generate post treatment period variable:
    gen posttreatment=1 if Year>2008
    
    *generate treatment group variable:
    gen treatmentmanuhml=1 if Treatmentmanu>1
    These are wrong. A command of the form gen x=1 if condition sets x to 1 where the condition holds and to missing (.) everywhere else, so instead of the 0/1 variables you need, these create ./1 variables. Observations with a missing value in any regressor are then omitted from the estimation sample. Consequently, your estimation sample consists exclusively of observations for which posttreatment and treatmentmanuhml are both 1. Within that sample they are constants, so of course they are omitted from the estimation. What you need is:

    Code:
    gen posttreatment = (Year > 2008)
    gen treatmentmanuhml = (Treatmentmanu > 1)
    Things should work more normally after that.
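
To see the difference between the two forms, here is a minimal sketch on a made-up toy panel. The data and the names state, post_bad and post_safe are hypothetical and only for illustration; exposure, posttreatment and treatmentgroup follow the names in the original post. It also shows one optional guard that may be worth adding: Stata treats a missing value as larger than any number, so (Year > 2008) evaluates to 1 when Year is missing.

```stata
* Toy panel: 4 hypothetical states observed 2007-2010 (made-up data)
clear
set obs 16
gen state = ceil(_n/4)
gen Year = 2006 + _n - 4*(state - 1)
* Hypothetical exposure categories 1-4, one per state
gen exposure = state

* Original approach: 1 where the condition is true, missing (.) otherwise
gen post_bad = 1 if Year > 2008
count if missing(post_bad)

* Corrected approach: the logical expression evaluates to 1 or 0
gen posttreatment = (Year > 2008)
gen treatmentgroup = (exposure > 1)

* Optional guard: leave the dummy missing when Year itself is missing
gen post_safe = (Year > 2008) if !missing(Year)

* All four treatment-by-period cells should now be populated
tabulate treatmentgroup posttreatment, missing
```

Running tabulate with the missing option before estimating is a quick way to confirm that both dummies take the values 0 and 1 and that no cell is empty. Note that with the fe option the main effect of a time-invariant treatment dummy is still expected to be omitted, because it is collinear with the state fixed effects; the interaction term remains estimable.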



    • #3
      Thank you so much, it works normally now!
