Omitted dummy variables in panel data regression

Tessa Koning

#1

15 Jun 2022, 09:54

Dear experts,

Using Stata, I have estimated a fixed-effects model for my panel data (7 years, 1000+ obs). The model includes 9 dummy variables indicating the industry, but 8 of those 9 get omitted.

To give a bit more information about my regression:
- The dependent variable is CEO compensation
- The independent and control variables include, among others, % females on board, % board independence, board size, tenure, age, a gender dummy, and industry dummies (10 industry groups, so 9 dummies).

Industry is an important dummy variable, as it can affect the amount of bonuses, and thus the total compensation, that can be given to the CEO (at least in the Netherlands). However, is including it redundant, since the model already accounts for individual effects? Anyway, I used the following code:

global id id
global year year
global ylist lncompensation
global xlist fob independence fib boardsize focc age ceotenure tenure firstyear female d2019 lnrevenue roa d1 d2 d3 d4 d5 d6 d7 d8 d9

* Set data as panel data
sort $id $year
xtset $id $year
xtdescribe
xtsum $id $year $ylist $xlist

* Fixed effects (-eststo- comes from the community-contributed -estout- package)
xtreg $ylist $xlist, fe
eststo fe

* Random effects (-re- is the -xtreg- default; stated explicitly here)
xtreg $ylist $xlist, re
eststo re

* Hausman test for fixed versus random effects model
hausman fe re

The output of the Hausman test:
chi2(12) = (b-B)'[(V_b-V_B)^(-1)](b-B)
= 18.57
Prob > chi2 = 0.0995
(V_b-V_B is not positive definite)

Would you have any suggestion for getting these dummy variables included in the FE regression? Or is it better to use the RE regression?

Thanks,

Tessa
Carlo Lazzaro

#2

15 Jun 2022, 10:25

Tessa:
welcome to this forum.
Some comments about your query:
1) as per FAQ, you're kindly requested to show (within CODE delimiters, please) what you typed and what Stata gave you back;
2) the way you created (by hand, I suppose) the dummy variables is far from efficient; please take a look at -fvvarlist- notation;
3) as far as -industry- is concerned, as you know the -fe- estimator wipes out all time-invariant variables;
4) the -hausman- outcome that you reported leans toward -re- but it is not decisive. You may want to add the option -sigmaless- or -sigmamore- and see if the matrix becomes positive definite.
That said, there are other (more helpful) ways to test which specification fits your data better: the community-contributed modules -xtoverid- and -mundlak- are two cases in point.
The Mundlak approach is easy to implement by hand (see https://blog.stata.com/tag/mundlak/), too. I prefer this way vs. the already programmed one because it allows you to understand how the Mundlak correction actually works.
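
A minimal sketch of the two suggestions above (-hausman- with -sigmamore-, then the Mundlak approach by hand). It reuses the globals from post #1 and assumes the data are already -xtset-. The local -tv- is hypothetical: it should hold only the regressors that vary over time within a firm, and which ones do is an assumption here, not something shown in the thread.

```stata
* Refit both models and store them with the built-in -estimates store-
xtreg $ylist $xlist, fe
estimates store fe
xtreg $ylist $xlist, re
estimates store re

* Hausman test using a common estimate of the error variance
hausman fe re, sigmamore

* Mundlak by hand: add the panel means of the time-varying regressors
* (ASSUMPTION: the variables listed in -tv- vary within firm over time)
local tv fob independence fib boardsize focc age ceotenure tenure lnrevenue roa
local means
foreach v of local tv {
    egen double mean_`v' = mean(`v'), by($id)
    local means `means' mean_`v'
}

* Random-effects model augmented with the panel means
xtreg $ylist $xlist `means', re vce(cluster $id)

* Joint test that all panel-mean coefficients are zero
test `means'
```

If the joint test rejects, the panel means are correlated with the outcome beyond the regressors themselves, which points toward -fe-; otherwise -re- is not contradicted. Note that the means above are computed over all available observations, so with missing values they may not match the estimation sample exactly.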

Kind regards,
Carlo
(Stata 19.0)
Tessa Koning

#3

16 Jun 2022, 08:02

Hi Carlo,

Thank you very much for your response. To come back to your points:
1) I'm sorry, I will do that from now on!
2) I created the dummies by hand indeed. The sample isn't too big, only 248, so it was fine. I looked at fvvarlist notation, but couldn't really find how to do this.
3) That is true. However, multiple papers include FE as well as the industry dummies.

Thanks again for your help!

Kind regards,
Tessa
Carlo Lazzaro

#4

16 Jun 2022, 08:39

Tessa:
2) it is easy. Just create a categorical variable with the different levels of -industry- or -whatever-. Give each level a number and then -label- it. Then, you're ready to use -fvvarlist- notation;
3) but why include a predictor that you know a priori will be wiped out if you go -fe-? Probably you are referring to papers that compare -fe- vs. -re- specifications (as -re- gives back a coefficient for time-invariant variables, too).
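
A minimal sketch of point 2), under assumptions about how the dummies in post #1 were coded: d1-d9 are taken to be 0/1 indicators for industries 2-10, with the reference industry having all nine equal to 0. The variable name -industry-, the label name and the label text are hypothetical.

```stata
* Rebuild one categorical variable from the hand-made dummies
* (ASSUMPTION: d1-d9 are 0/1 and never missing; all zero = reference industry)
generate byte industry = 1
forvalues k = 1/9 {
    replace industry = `k' + 1 if d`k' == 1
}
label define industry_lbl 1 "Reference industry"   // hypothetical label text
label values industry industry_lbl
tabulate industry                                  // check the coding

* Regressors from post #1 without d1-d9
local xbase fob independence fib boardsize focc age ceotenure tenure ///
    firstyear female d2019 lnrevenue roa

* Factor-variable notation: i.industry creates the indicators on the fly
xtreg $ylist `xbase' i.industry, re

* Under -fe-, any industry level that never changes within a firm is omitted
xtreg $ylist `xbase' i.industry, fe
```

With -i.industry- the base level can be changed without recreating variables, for example -ib3.industry-.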

Kind regards,
Carlo
(Stata 19.0)