
  • Omitted dummy variables in panel data regression

    Dear experts,

    Using Stata, I have estimated a fixed-effects model on my panel data (7 years, 1000+ obs). The model includes 9 dummy variables indicating the industry, but 8 of those 9 get omitted.

    To give a bit more information about my regression:
    - The dependent variable is CEO compensation
    - The independent and control variables include, among others, %females on board, %board independence, board size, tenure, age, gender dummy, industry dummies (10 groups of industries, so 9 dummies).

    Industry is an important control, as it can affect the size of the bonuses, and thus the total compensation, that can be given to the CEO (at least in the Netherlands). However, is including it redundant, since I'm already controlling for individual effects? Anyway, I used the following code:

    global id id
    global year year
    global ylist lncompensation
    global xlist fob independence fib boardsize focc age ceotenure tenure firstyear female d2019 lnrevenue roa d1 d2 d3 d4 d5 d6 d7 d8 d9

    * Set data as panel data*
    sort $id $year
    xtset $id $year
    xtdescribe
    xtsum $id $year $ylist $xlist

    * Fixed effects*
    xtreg $ylist $xlist, fe
    eststo fe

    * Random effects (-re- is the -xtreg- default; stated explicitly here)*
    xtreg $ylist $xlist, re
    eststo re

    * Hausman test for fixed versus random effects model*
    hausman fe re

    The output of the Hausman test:
    chi2(12) = (b-B)'[(V_b-V_B)^(-1)](b-B)
    = 18.57
    Prob > chi2 = 0.0995
    (V_b-V_B is not positive definite)

    Do you have any suggestions for keeping these dummy variables in the FE regression? Or is it better to use the RE regression?

    Thanks,

    Tessa

  • #2
    Tessa:
    welcome to this forum.
    Some comments about your query:
    1) as per FAQ, you're kindly requested to show (within CODE delimiters, please) what you typed and what Stata gave you back;
    2) the way you created (by hand, I suppose) the dummy variables is far from efficient; please take a look at -fvvarlist- notation;
    3) as far as -industry- is concerned, as you know the -fe- estimator wipes out all time-invariant variables;
    4) the -hausman- outcome that you reported leans toward -re- but it is not conclusive. You may want to add the option -sigmaless- or -sigmamore- and see if the matrix becomes positive definite.
    That said, there are other (more helpful) ways to test which specification fits your data better: the community-contributed modules -xtoverid- and -mundlak- are two cases in point.
    The Mundlak approach is easy to implement by hand (see https://blog.stata.com/tag/mundlak/), too. I prefer this way vs. the already programmed one because it allows you to understand how the Mundlak correction actually works.
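    As a rough sketch of point 4) and of the by-hand Mundlak approach, using the variable names and globals from the original post: the split between time-varying and other regressors below is an assumption (only the data can confirm it), and the m_ prefix is just an illustrative naming choice.

    ```stata
    * --- Hausman test with a common variance estimate ---
    quietly xtreg $ylist $xlist, fe
    estimates store fe
    quietly xtreg $ylist $xlist, re
    estimates store re
    hausman fe re, sigmamore

    * --- Mundlak approach by hand ---
    * Assumption: these regressors vary within firm over time
    local tv fob independence fib boardsize focc lnrevenue roa
    local means
    foreach v of local tv {
        * Panel-level mean of each time-varying regressor
        * (computed over all nonmissing observations of that variable)
        bysort id: egen double m_`v' = mean(`v')
        local means `means' m_`v'
    }
    * Assumption: the remaining regressors enter without panel means
    local other age ceotenure tenure firstyear female d2019 d1 d2 d3 d4 d5 d6 d7 d8 d9
    xtreg $ylist `tv' `means' `other', re vce(cluster id)

    * Joint test of the panel means: rejecting the null points toward -fe-
    testparm `means'
    ```

    Because the model is still estimated with -re-, the industry dummies keep their coefficients, while the panel means absorb the correlation between the firm effect and the time-varying regressors.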
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Hi Carlo,

      Thank you very much for your response. To come back to your points:
      1) I'm sorry, I will do that from now on!
      2) I created the dummies by hand indeed. The sample isn't too big, only 248, so it was fine. I looked at fvvarlist notation, but couldn't really find how to do this.
      3) That is true. However, multiple papers include FE as well as the dummy indicator.

      Thanks again for your help!

      Kind regards,
      Tessa


      • #4
        Tessa:
        2) it is easy. Just create a categorical variable with the different levels of -industry- or -whatever-. Give each level a number and then -label- it. Then, you're ready to use -fvvarlist- notation;
        3) but why include a predictor that you know a priori will be wiped out if you go -fe-? You are probably referring to papers that compare -fe- vs. -re- specifications (as -re- gives back a coefficient for time-invariant variables, too).
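        A minimal sketch of point 2), assuming d1-d9 are the hand-made 0/1 dummies from the first post, that none of them is missing, and that firms with all nine equal to 0 belong to the tenth industry. The variable name industry, the label name industry_lbl and the label texts are placeholders, and the regressor list is shortened for illustration.

        ```stata
        * Collapse the nine dummies into one categorical variable
        generate byte industry = 10
        forvalues k = 1/9 {
            replace industry = `k' if d`k' == 1
        }
        label variable industry "Industry group"

        * Placeholder value labels: replace with the actual industry names
        label define industry_lbl 1 "Industry 1" 2 "Industry 2" 10 "Industry 10"
        label values industry industry_lbl

        * Check the coding before using it
        tabulate industry

        * Factor-variable notation creates the indicators on the fly
        xtreg lncompensation fob independence boardsize i.industry, re

        * Use ib10.industry instead of i.industry to make group 10 the base level
        ```

        Under -fe-, the i.industry terms will still be omitted whenever industry does not change within a firm.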
        Kind regards,
        Carlo
        (Stata 19.0)
