  • areg and xtreg,fe with cluster option: which one is better?

    Dear all,

    I am running a FE regression with year and firm fixed effects and tried both:

    -xtreg, fe vce(cluster ID)
    -areg, absorb(ID) vce(cluster ID)

    The two give different t-values, due to the different number of degrees of freedom, as I understand it.
    But which one is more accurate to use?

    Thanks in advance,

    Ines

  • #2
    Ines: I just tried an example on a standard balanced panel data set and was surprised that areg and xtreg give clustered standard errors that are notably different, just as you report. I know that there are cases with nonnested clusters where this occurs, but I thought they would coincide on a standard panel data set. I know the option "dfadj" used with xtreg makes the standard errors coincide with those from areg, but I'm not sure why it's necessary for standard panel data applications (or cases where each unit belongs to a unique cluster). I've always thought the xtreg clustered standard errors are fine, so I admit to being a little puzzled.

    I tried bootstrapping with 1,000 replications and the bootstrap standard errors are much closer to the xtreg standard errors. Assuming you have "large N, small T" you might want to bootstrap and see what happens. What remains unclear to me is why areg clustered standard errors seem conservative. I suspect someone will soon post the answer.


    • #3
      I'd recommend reading the Statalist response by Ryan Kessler (who actually cites Prof. Wooldridge): http://www.stata.com/statalist/archi.../msg00596.html


      • #4
        This is interesting. Following the discussion mentioned by Sophie, I have prepared a little example:

        Code:
        clear all
        set more off
        
        sysuse auto
        keep mpg price rep78
        gen time = _n
        
        * demeaning variables for later regressions
        foreach v in mpg price {
            egen m`v' = mean(`v'), by(rep78)
            gen dm`v' = `v' - m`v'
            drop m`v'
        }
        
        * generating the cluster dummies for later regressions
        quietly tabulate rep78, generate(cid)
        
        * setting group and time dimensions for later estimations
        xtset rep78 time
        
        * Estimations with clustered errors
        areg mpg price, absorb(rep78) vce(cluster rep78)
        xtreg mpg price, fe vce(cluster rep78)
        xtreg mpg price, fe vce(cluster rep78) dfadj
        reg dmmpg dmprice, nocons vce(cluster rep78)
        reg mpg price cid*, nocons vce(cluster rep78)
        
        * Estimations with nonclustered errors
        areg mpg price, absorb(rep78)
        xtreg mpg price, fe
        xtreg mpg price, fe dfadj
        reg dmmpg dmprice, nocons
        reg mpg price cid*, nocons
        It seems that the degrees of freedom are not adjusted when using xtreg, fe with clustered errors, but they are when using xtreg, fe with nonclustered errors. Notice how the two xtreg, fe estimations with nonclustered errors produce the same results as areg, so adding the option dfadj makes no difference there. As has been pointed out, this is not the case with clustered errors.
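
        As a rough sketch of what that degrees-of-freedom difference amounts to: the block below writes out the usual finite-sample correction for clustered errors as I understand it. This is an assumption on my part, not something checked against the areg or xtreg source, and the symbols N, M, G, k and K are notation introduced here for illustration.

        ```latex
        % Assumed notation (not from the thread):
        %   N = observations, M = clusters, G = absorbed groups,
        %   k = slope coefficients (k = 1 in the example: price)
        \[
        \widehat{V}_{\text{cluster}}
          = c\,(X'X)^{-1}\Big(\sum_{g=1}^{M} X_g'\hat{u}_g\hat{u}_g'X_g\Big)(X'X)^{-1},
        \qquad
        c = \frac{M}{M-1}\cdot\frac{N-1}{N-K}
        \]
        % Assumed parameter counts K:
        %   areg, and xtreg, fe with dfadj : K = k + G  (absorbed effects counted)
        %   xtreg, fe without dfadj        : K = k + 1  (slopes and the constant only)
        %   reg on demeaned data, nocons   : K = k
        \[
        \frac{\mathrm{se}_{\text{areg}}}{\mathrm{se}_{\text{xtreg}}}
          = \sqrt{\frac{N-k-1}{N-k-G}}
        \]
        ```

        If that reading is right, the coefficient estimates and residuals are identical across the commands and only the scalar c changes. With the 69 nonmissing observations, 5 groups and one slope in the example above, the ratio would be sqrt(67/63), or about 1.03.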

        I have included the dummy variable and demeaned regressions for the sole purpose of comparison. Notice how, when using clustered standard errors, the standard errors of xtreg, fe (without dfadj) do not match those of the demeaned regression without the dummies. I believe that is because xtreg, fe includes a constant. So it really seems that xtreg, fe is messing up the degrees of freedom when using clustered errors.
        Last edited by Alfonso Sánchez-Peñalver; 29 Jul 2014, 18:31.
        Alfonso Sanchez-Penalver
