
  • Equal numbers of clusters and independent variables in reghdfe

    This is a question about clustering with the user-written command reghdfe (available from SSC; I am running Stata 12.1 on a Mac). I'm attempting to run a regression with -reghdfe- that has an equal number of independent variables and clusters, and I'm getting an error. When I use -reg- or -areg- there is no error. My understanding is that this kind of regression is not fundamentally unsound (right?), but then I don't know why I'm getting an error with -reghdfe-. Any help is appreciated. My code is below:

    Code:
    set seed 1212
    global o = 200 //number of observations
    global s = 20 //number of states
    set obs $o

    *assign a state and year to every observation
    gen state = floor((_n-1)/($o/$s))+1
    gen year = _n - floor((_n-1)/($o/$s))*($o/$s)

    * state-specific linear time trend
    levelsof(state), local(sl)
    foreach s in `sl' {
        gen year_`s' = 0
        replace year_`s' = year if state == `s'
    }

    * generate the independent and dependent variables
    qui gen indvar = state*year/100 + runiform()/100
    qui gen depvar = indvar + state*year/100 + runiform()/100

    * these run and give the same results
    reg depvar indvar i.(year state) year_*, vce(cluster state)
    areg depvar indvar i.year year_*, absorb(state) vce(cluster state)

    * this does not run: error message is "insufficient observations (N_clust=20, K=20)"
    reghdfe depvar indvar year_*, absorb(state year) vce(cluster state)

  • #2
    Yes, the -reg- and -areg- commands run, but if you look at the output carefully, you will see that the omnibus F tests are skipped. That is because the number of variables is too large for the number of clusters. -reg- and -areg- are being kind to you and giving you the results for the individual independent variables, because you can still draw inferences about those. But none of these commands gives you an overall test of the model, because you don't have enough degrees of freedom left to do that. -reghdfe- is "playing hardball" here. I don't know whether that is because its algorithm can't produce any results at all in this circumstance, or because the author is just taking a hard line on this kind of regression. He (Sergio Correia) is a regular contributor to this Forum and I imagine he will have something to say about this.
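
    One way to check this without reading the output table is to look at the results -regress- stores. This is only a minimal sketch and assumes the simulated data from post #1 are still in memory; e(N_clust), e(df_r), e(df_m) and e(F) are standard stored results of -regress-. If the omnibus test was skipped, e(F) should display as missing (.).

    ```stata
    * Refit the clustered model from post #1 (output suppressed)
    quietly reg depvar indvar i.(year state) year_*, vce(cluster state)

    * Compare the number of clusters with the degrees of freedom
    display "clusters    = " e(N_clust)
    display "residual df = " e(df_r)
    display "model df    = " e(df_m)

    * The overall model test; missing if it could not be computed
    display "omnibus F   = " e(F)
    ```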

    But no matter what, you can't do a test of 20 variables when you only have 20 clusters, no matter what program you use.
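
    The reason can be sketched with the usual cluster-robust variance formula. The notation here is introduced only for illustration and is not from the thread: $G$ is the number of clusters, $X_g$ and $\hat{u}_g$ are the regressors and OLS residuals in cluster $g$, and $c$ is a finite-sample scaling constant.

    ```latex
    \hat{V} = c\,(X'X)^{-1}\Bigl(\sum_{g=1}^{G} s_g s_g'\Bigr)(X'X)^{-1},
    \qquad s_g = X_g'\hat{u}_g .

    % Each s_g s_g' has rank one, so the middle matrix has rank at most G.
    % The OLS normal equations give one linear dependence among the s_g:
    \sum_{g=1}^{G} s_g = X'\hat{u} = 0
    \;\Longrightarrow\;
    \operatorname{rank}(\hat{V}) \le G - 1 .

    % A Wald test of q restrictions R\beta = r inverts the q x q matrix
    % R \hat{V} R', which therefore requires
    q \le G - 1 .
    ```

    With $G = 20$ clusters, at most 19 coefficients can be tested jointly, which is consistent with the K=20 in the error message being one too many.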



    • #3
      Dear Dan, Clyde,

      Yes, -reghdfe- is currently too harsh when dealing with too few clusters, as discussed in an open GitHub issue.

      A quick workaround though is to use a different suboption when calculating the SEs. Thus, you need to run:

      Code:
      reghdfe depvar indvar year_*, absorb(state year) vce(cluster state, suite(mwc))

      That suboption of vce() will use a different method which is more robust to too few clusters.

      Best,
      Sergio



      • #4
        Thanks very much! I appreciate the help.

        Best,
        Dan



        • #5
          Originally posted by Sergio Correia
          Dear Dan, Clyde,

          Yes, -reghdfe- is currently too harsh when dealing with too few clusters, as discussed in an open github issue

          A quick workaround, though, is to use a different suboption when calculating the SEs. Thus, you need to run:

          Code:
          reghdfe depvar indvar year_*, absorb(state year) vce(cluster state, suite(mwc))

          That suboption of vce() will use a different method which is more robust to too few clusters.

          Best,
          Sergio

          Hi Sergio,

          I encountered the same problem as stated in this post, and I tried your suggestions with my own depvar, indvar, controls, fe, and clusters. The code I run is as follows:
          Code:
          reghdfe `disc_var' `endogenous' `controls', absorb(`fe') vce(cluster `cluster', suite(mwc))
          However, it gives me the following error:
          VCE options not supported: suite(mwc)
          Do you know why this is the case? Is it related to a specific version of -reghdfe-? If so, how can I specify which version to install?

          Thank you!



          • #6
            Originally posted by Chengmou Lei
            Is it related to a specific version of -reghdfe-? If so, how can I specify which version to install?
            You can call historical versions of reghdfe (from https://github.com/sergiocorreia/reghdfe) using the -version()- option. See

            Code:
            help reghdfe
            Code:
            webuse grunfeld, clear
            reghdfe invest mvalue kstock, absorb(company) vce(cluster year, suite(mwc)) version(3)
            Result:

            Code:
            . reghdfe invest mvalue kstock, absorb(company) vce(cluster year, suite(mwc)) version(3)
            (running historical version of reghdfe: 3)
            (converged in 1 iterations)
            
            HDFE Linear regression                            Number of obs   =        200
            Absorbing 1 HDFE group                            F(   2,     19) =      98.11
            Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                              R-squared       =     0.9441
                                                              Adj R-squared   =     0.9408
                                                              Within R-sq.    =     0.7668
            Number of clusters (year)    =         20         Root MSE        =    52.7680
            
                                              (Std. err. adjusted for 20 clusters in year)
            ------------------------------------------------------------------------------
                         |               Robust
                  invest | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
            -------------+----------------------------------------------------------------
                  mvalue |   .1101238   .0173279     6.36   0.000     .0738561    .1463915
                  kstock |   .3100653   .0322789     9.61   0.000     .2425049    .3776258
            ------------------------------------------------------------------------------
            
            Absorbed degrees of freedom:
            ---------------------------------------------------------------+
             Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     | 
            -------------+-------------------------------------------------|
                 company |           10              10              0     | 
            ---------------------------------------------------------------+
            
            .
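
            Before relying on -version()-, it may help to confirm which release of reghdfe is actually installed. A minimal sketch, using only Stata's built-in -which- and -ado describe- commands (what they print depends on your installation):

            ```stata
            * Show the location and version line of the installed reghdfe.ado
            which reghdfe

            * Show the installed package record (source and install date)
            ado describe reghdfe
            ```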
