  • Dummy variables regression: avoiding multicollinearity

    I am currently running a completely flexible regression, i.e., a regression of y only on dummy variables. It is of very high dimension (more than 200 dummy variables). A snapshot of my data looks like the following:

    Code:
    clear
    input float(Y D1 D2 D3 D4 D5 D6)
     1 1 0 0 0 0 0
     2 0 1 0 0 0 0
     3 1 0 0 0 0 0
     4 0 0 0 0 1 0
    56 1 0 0 0 0 0
     1 0 0 0 0 0 1
    21 0 0 1 0 0 0
    end

    where Y is the regressand and D1 through D200 form a full set of dummy variables. I have thousands of observations. To avoid the dummy variable trap, I have dropped the constant. Something strange happens: when I force the constant to be dropped, the coefficients are identified, since (X'X) has full rank. However, when I estimate the model with FE and no constant, Stata still drops one variable! The thing is, because there are so many observations, the determinant of (X'X) is still close to 0 even after I drop a variable. But a variable should only be dropped when the determinant is exactly 0. Any ideas? Thanks!
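    A minimal sketch of the two estimation approaches being contrasted above (the exact commands were not posted, so these are assumptions; the variable list and the choice of -xtreg, fe- are guesses):

    Code:
    * OLS with the constant suppressed: all 200 dummies can be identified,
    * since without a constant the full dummy set is not collinear
    regress Y D1-D200, noconstant

    * A fixed-effects estimator absorbs its own constant (the mean of the
    * fixed effects), so a full set of dummies is again collinear with it
    * and one dummy is expected to be dropped
    xtset panelvar
    xtreg Y D1-D200, fe

    If this matches what was run, the dropped dummy under -xtreg, fe- would not be a numerical-precision issue but exact collinearity between the dummies and the absorbed constant.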

  • #2
    I can't quite follow from your explanation what is happening. Why don't you show us the exact regression commands you are using and the exact output you are getting from Stata? Please do this by copying directly from Stata's Results window or your log file and pasting between code delimiters. Please do not edit any of it: the details are usually crucial.

    Comment


    • #3
      Thanks for your response. Unfortunately, the output is quite large. I will try to figure out the core of the problem, and then post it. Thanks!

      Comment
