  • Regression with high dimension fixed effects

    Hi All,

    I am in the midst of running a regression with many fixed effects. As such, I am using the reghdfe command.
    My specific question is as follows:

    I am estimating an equation of employer wages as a function of employer ratings and employer fixed effects. The dataset is by sector-year: the sector of the employee is specified (say, coal production), along with the year the wages were recorded. I am considering augmenting the regression to include the effect of education (separate dummy variables for High School, College, and None), fully interacted with year and sector effects. In essence, I am considering a fully flexible model that allows for heterogeneous effects by sector-year-education, so that, for example, the effect of high school on wages in 2007 in the coal sector differs from its effect in other years and other sectors, and from the effect of other schooling levels. As you can imagine, this saturated model contains a lot of dummy variables, and I am having trouble with the code because it generates sets of superfluous variables.

    Thus far, this is what I have:

    Code:
    egen fe=group(employer year sector) // This generates the employer fixed effects by year, by sector
    reghdfe wages i.education##i.year##schooling*, absorb(beta=fe) // The schooling variables are Schooling1, Schooling2 and Schooling3 respectively
    Although the estimation procedure works, it also produces *separate* dummy variables for the year effects and the sector effects, in addition to the interacted variables. These are not needed; all I care about are the interactions themselves. Is there any way to avoid creating these additional variables?

    Thank you so much for all your help!!



  • #2
    Hi Chinmay,
    A couple of comments. If your intention is to create a fully interacted fixed effect (employer by year by sector), you could do just as well using Stata's standard areg command. reghdfe is more appropriate if the idea is to absorb the employer effect, the year effect, and the sector effect separately.
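
    As a rough illustration of that distinction, the two setups might look like the sketch below. It assumes a regressor named ratings (a hypothetical name; the thread does not show it) and that employer, year, and sector are numeric identifiers.

    ```stata
    * One combined fixed effect per employer-year-sector cell, absorbed by areg
    * (ratings is a hypothetical variable name)
    egen fe = group(employer year sector)
    areg wages ratings, absorb(fe)

    * Separate employer, year, and sector effects, absorbed by reghdfe
    reghdfe wages ratings, absorb(employer year sector)
    ```

    The first model has one intercept per cell; the second has three additive sets of intercepts, which is the case areg cannot handle with a single absorb() variable.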
    Regarding your question on the interactions, you could get what you want by using a single "#" sign instead of "##".
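
    To make the difference concrete: in Stata's factor-variable notation, "##" expands to the main effects plus the interactions, while "#" gives the interactions alone. A minimal sketch, assuming education and sector are single categorical variables (hypothetical names; the original post uses separate Schooling dummies) and fe is the group identifier created in the first post:

    ```stata
    * i.year##i.sector is shorthand for: i.year i.sector i.year#i.sector
    * i.year#i.sector  keeps only the interaction cells

    * Interactions only, with no separate year or sector dummies
    * (education and sector are assumed variable names)
    reghdfe wages i.education#i.year#i.sector, absorb(fe)
    ```

    Some interaction cells may still be dropped as collinear with the absorbed effects or as base categories.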
    Also, be sure that you have enough observations per combination of employer, year, and sector. Otherwise, it may be difficult to find any meaningful results.
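
    One way to check this before estimating is to count observations per cell. The sketch below is only a suggestion; cell_n and cell_tag are hypothetical variable names introduced for illustration.

    ```stata
    * Number of observations in each employer-year-sector cell
    bysort employer year sector: gen long cell_n = _N

    * Flag one observation per cell so each cell is summarized once
    egen byte cell_tag = tag(employer year sector)
    summarize cell_n if cell_tag, detail

    * Observations that sit alone in their cell
    count if cell_n == 1
    ```

    Cells with a single observation carry no within-cell variation, so a large count in the last line is a warning sign.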
    HTH
    Fernando



    • #3
      Can you try running reghdfe version 4, from github? https://github.com/sergiocorreia/reghdfe/

      It should fix all or most of the issues with superfluous variables.
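
      For reference, installing from GitHub usually looks something like the sketch below. The URLs follow the pattern given in the repository's README, but they are an assumption here and may have changed, and the master branch may now point to a later release than version 4, so check the README for current instructions. Recent reghdfe versions also depend on the ftools package.

      ```stata
      * Remove any previously installed copies (capture ignores "not installed" errors)
      cap ado uninstall ftools
      cap ado uninstall reghdfe

      * Install ftools first, then reghdfe (URLs assumed from the README)
      net install ftools, from("https://raw.githubusercontent.com/sergiocorreia/ftools/master/src/")
      net install reghdfe, from("https://raw.githubusercontent.com/sergiocorreia/reghdfe/master/src/")

      * Confirm which version is now on the ado path
      which reghdfe
      ```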



      • #4
        Thanks a lot Fernando and Sergio! Much appreciated.
