reghdfe and dropping collinear variables

Paul Pelzl

Join Date: May 2016

Posts: 24
#1

21 Jun 2018, 08:41

I am using reghdfe since I want to easily include fixed effects (firm FE to be precise), which I do with the absorb option.

At some point I include an additional fixed effect (bank FE, to be precise). The bank variables in my model, which are constant over time, should then be dropped. However, they are not: I still see them in the regression output, and both their coefficients and their standard errors are huge. The p-values of all variables are equal to one (or 0.9999), and the regression output says:

"WARNING: Missing F statistic (dropped variables due to collinearity or too few clusters)."

I have enough clusters (more than 200), so I think this refers to "dropped variables".

Now, of course I could simply infer that Stata actually drops the variables. However, if I use the noomit option, they are still there. In addition, there are two more things that look odd and that I don't understand.

1) The standard errors of the coefficients on the non-collinear variables (the ones I am ultimately interested in) are not always the same when I run the command several times (without changing anything, just hitting run again). The number of iterations also changes between runs.

2) If I manually drop the variables that have a p-value of one (or 0.9999) and run the regression again, the results for the non-collinear variables (the ones I am ultimately interested in) are again quite different, in both the estimates and the standard errors. Shouldn't dropping the variables manually and having reghdfe drop them automatically produce the same results?

I came across this post: https://www.statalist.org/forums/forum/general-stata-discussion/general/1356821-question-about-reghdfe-and-dropping-a-dummy

I have tried with a different tolerance, but things don't change. Also, it seems like Sergio Correia has not yet implemented the technique of the authors of -ivreg2-.

Can anyone resolve or explain the issues above, or propose a different command that handles this more smoothly but still drops singletons (which is the advantage of reghdfe over the areg command)? I would greatly appreciate that.
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

22 Jun 2018, 11:02

You didn't get a quick answer. You'll increase your chances of a useful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex. Actual copies of what happened will be better than your description.

I'm not clear what you mean by bank fixed effects. If you include firm fixed effects, they'll take care of any industry differences (assuming firms don't switch industries). So I'm not sure why you have a bank FE or exactly what a bank FE means. As for getting a strange estimate: if even one or two observations have variation within panel over time, you can get estimates. So my suspicion is that the bank variables are not absolutely constant within panels.
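The suspicion above can be checked directly in the data. As a minimal sketch (in Python rather than Stata, with a made-up toy panel; the names `bank_id`, `bank_size` and `groups_with_variation` are hypothetical and not from the original post), the idea is to list every panel in which a supposedly constant variable takes more than one value:

```python
from collections import defaultdict


def groups_with_variation(group_ids, values):
    """Return the sorted group ids in which `values` takes more than one distinct value."""
    seen = defaultdict(set)
    for g, v in zip(group_ids, values):
        seen[g].add(v)
    return sorted(g for g, vals in seen.items() if len(vals) > 1)


# Hypothetical toy panel: three banks observed in several years.
bank_id = [1, 1, 1, 2, 2, 2, 3, 3]
bank_size = [5.0, 5.0, 5.0, 7.5, 7.5, 7.6, 2.0, 2.0]  # bank 2 changes once

varying = groups_with_variation(bank_id, bank_size)
print(varying)  # prints [2]
```

If the list comes back empty for every bank variable, the variables really are constant within bank and the explanation has to lie elsewhere; if not, the listed panels are the observations that identify the coefficients.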

1) Not clear what is happening. Sometimes "hitting run again" does not run exactly the same thing: for example, you might be losing something in a local that is not defined in the part you rerun. If you really get such results when you run the entire program, you may need to contact the authors. I have no idea about the details of their optimization process.
2) If you have variables with a p-value, that means the model ran with those variables. So dropping them is running a different model.
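For intuition on why a variable that is constant within bank should carry no information once bank fixed effects are absorbed, here is a small sketch in plain Python. It assumes the textbook within transformation (subtracting each group's mean); this is an illustration only, not a description of reghdfe's actual algorithm, and the data and the function name `within_demean` are made up:

```python
def within_demean(group_ids, values):
    """Subtract each group's mean from its observations (the within transformation)."""
    totals, counts = {}, {}
    for g, v in zip(group_ids, values):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for g, v in zip(group_ids, values)]


# Hypothetical toy data: two banks, three observations each.
bank_id = [1, 1, 1, 2, 2, 2]
bank_var = [0.1, 0.1, 0.1, 0.7, 0.7, 0.7]  # constant within bank
x = [1.0, 2.0, 4.0, 3.0, 5.0, 4.0]         # varies within bank

bank_var_within = within_demean(bank_id, bank_var)
x_within = within_demean(bank_id, x)

# Largest remaining deviation after demeaning, for each variable.
print(max(abs(r) for r in bank_var_within))
print(max(abs(r) for r in x_within))
```

After demeaning, the within-bank-constant variable is zero up to floating-point rounding, while `x` keeps real variation. If software retains such a variable instead of flagging it as collinear, any coefficient it reports would rest on rounding noise, which is one possible (unconfirmed) reading of the huge estimates and standard errors described in #1.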