
  • Check if coefficient from two different reghdfe regressions is the same

    Hi all,

    I am using Stata 15 on Windows.

    I have data on houses and study the effect of mixed use on house prices with a hedonic fixed effects model, using the reghdfe command.
    In short, I regress log house price on a MIX dummy and a number of structural house characteristics like number of rooms etc.
    With fixed effects I control for time trends (through year) and location (through street).

    I got the tip to check whether my fixed effects are doing their job by regressing log house price on MIX only, and then a second time with the other house variables included (both regressions with year and location fixed effects). If the coefficient for MIX stays roughly the same, this should indicate that the fixed effects effectively control for unobserved variables.
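
    For reference, the two specifications described above might look like the following sketch. The variable names lnprice, mix, rooms, and area are hypothetical placeholders, not taken from the attached output, and reghdfe is a community-contributed command that has to be installed first.

    ```stata
    * Hypothetical variable names: lnprice, mix, rooms, area, year, street.
    * reghdfe is community-contributed: ssc install reghdfe

    * Specification 1: MIX only, with year and street fixed effects absorbed
    reghdfe lnprice mix, absorb(year street)
    estimates store base

    * Specification 2: MIX plus structural house characteristics
    reghdfe lnprice mix rooms area, absorb(year street)
    estimates store full

    * Show the MIX coefficient and its standard error side by side
    estimates table base full, keep(mix) b(%9.4f) se(%9.4f)
    ```

    This only lines the two estimates up next to each other; it is not a formal test of equality.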

    I attached the results of the two regressions. My question is: is there a way to see how similar the coefficients for the MIX dummy are? I read something about suest, but also that it cannot be used in combination with reghdfe. Or is it something I can deduce from the standard errors? Also, the R² is of course much lower for the model that includes only MIX, so I was wondering whether this approach makes sense in the first place. I am fairly new to econometrics, so thoughts are welcome.


    Thank you for reading,
    Lonneke





  • #2
    You can get a rough idea of whether the coefficients are the same by checking whether the confidence intervals overlap (coefficient ± 1.96 * se). If there is overlap, you cannot reject that the coefficients are equal. In this case, the coefficients are clearly different. Since suest does not work after reghdfe, you can carry out a formal test by using reg with year and street dummies instead of reghdfe. So
    Code:
    reg depvar indepvars i.year i.street
    instead of
    Code:
    reghdfe depvar indepvars, absorb(year street)
    This should get you exactly the same coefficients and standard errors, but you can then use suest to formally test whether the coefficients are different or not.
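
    A minimal sketch of that formal test, assuming hypothetical variable names (lnprice, mix, rooms, area) and a numeric street identifier with few enough levels for reg to handle:

    ```stata
    * Hypothetical variable names: lnprice, mix, rooms, area, year, street.
    * Fit both models with plain regress and no vce(robust) or vce(cluster)
    * option, because suest computes its own robust variance afterwards.
    regress lnprice mix i.year i.street
    estimates store m1

    regress lnprice mix rooms area i.year i.street
    estimates store m2

    * Combine the two sets of estimates into one joint variance matrix
    suest m1 m2

    * After regress, suest names the coefficient equations <name>_mean
    test [m1_mean]mix = [m2_mean]mix

    * Size of the difference, with its standard error and confidence interval
    lincom [m2_mean]mix - [m1_mean]mix
    ```

    The test reports whether the difference between the two MIX coefficients is statistically distinguishable from zero, which is more reliable than eyeballing whether the two confidence intervals overlap.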

    On a more general note, your fixed effects are at the street level, so they control for street-specific effects such as location. However, within a street, houses may differ significantly with regard to size and all the other variables that you control for. Therefore, you cannot expect street-level fixed effects to control for house-specific effects. All your control variables are highly significant, and to the extent that they are correlated with MIX, including them will affect the coefficient of MIX.

    If your control variables like garage and parking don't change over time, which is probably the case in general, house-specific fixed effects will indeed perfectly control for all these variables. Thus, using fixed effects for each house may be better, but this is only possible if MIX varies over time. Otherwise, the above model with the controls is better, but there may be other unobserved variables that are correlated with MIX, so omitted variable bias comes into play here.



    • #3
      Hi Wouter,

      Thanks for your reply!
      I tried to control for house-specific fixed effects, but as you already thought, this omits the variable of interest, MIX, because it is time-invariant.
      As for your first suggestion, I get the following error from Stata:

      maxvar too small
      You have attempted to use an interaction with too many levels or attempted to fit a model with too many variables. You
      need to increase maxvar; it is currently 5000. Use set maxvar; see help maxvar.

      If you are using factor variables and included an interaction that has lots of missing cells, either increase maxvar or set
      emptycells drop to reduce the required matrix size; see help set emptycells.

      If you are using factor variables, you might have accidentally treated a continuous variable as a categorical, resulting in
      lots of categories. Use the c. operator on such variables.

      My dataset contains 200,000 observations with many different streets, so I think I cannot use i.street.
      Would it be good practice if I explain what the fixed effects do and why I include them in my model, without giving formal proof of this reasoning?



      • #4
        As for your first suggestion, I get the following error from Stata: maxvar too small
        You can use - distinct street - (a community-contributed command, available from SSC) to check how many streets there are. Then set maxvar above that number and try to run the regression again. If it still does not work, your street variable may not be in the right format. A data example with - dataex - would be helpful in this case.
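
        The same check can be sketched with built-in commands only. Here street_id is a hypothetical new variable, and 10000 is a placeholder rather than a recommended value:

        ```stata
        * Assumption: street holds the street identifier (string or numeric).
        * group() maps it to consecutive integers 1, 2, ..., which also gives
        * a numeric variable that can be used with the i. prefix.
        egen long street_id = group(street)

        * The largest group number equals the number of distinct streets
        summarize street_id, meanonly
        display "Number of streets: " r(max)

        * Raise maxvar above that count (Stata/SE and Stata/MP only).
        * Some Stata versions refuse to change maxvar while data are in
        * memory; in that case run this line before loading the dataset.
        set maxvar 10000
        ```

        If the number of streets exceeds what your Stata flavor allows, dummies via i.street are not feasible and absorbing the fixed effects is the practical route.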

        Would it be good practice if I explain what the fixed effects do and why I include them in my model, without giving formal proof of this reasoning?
        Generally, you should explain why you chose a certain model, i.e. why the model is suitable for answering your question. Formal proof of what fixed effects do can be found in older papers and econometrics textbooks, so you can build on that; there is no need to provide the proof again. However, every model rests on certain assumptions, and you may have to provide good reasoning (or tests) to show that your model satisfies them.

        I tried to control for house-specific fixed effects, but as you already thought, this omits the variable of interest, MIX, because it is time-invariant.
        Fixed effects may not be the right model in this case. Fixed effects models are used to control for time-invariant unobservables when you estimate the effect of a time-varying variable. Using fixed effects at the street level is not equivalent to using fixed effects at the house level, because you control for different things. Normally, you use fixed effects at the level of the unit of observation, which in this case is the house, so a true fixed effects model would include these house-specific effects. - regress - with control variables may be a better option.
