
  • #16
    That is not necessary. And the assumption is that errors are correlated across cross-sectional units.



    • #17
      Sure, thanks Andrew.

      I am now using the stacked samples design, but I think I need to use reghdfe as I have a lot of fixed effects in one of my specifications. The problem is that it seems to always drop one of the constants rather than omitting a category of the fixed effects. I am attaching an example with both reg and reghdfe. Would you know if there is a way to pick the omitted category in reghdfe?

      Code:
      sysuse auto, clear
      rename (mpg price) (depvar1 depvar2)
      reshape long depvar, i(make) j(which)
      gen cons=1
      regress depvar i.which#(c.weight c.displacement c.cons i.foreign), nocons 
      reghdfe depvar i.which#(c.weight c.displacement c.cons), nocons absorb(i.foreign)
      which gives me:

      Code:
      . regress depvar i.which#(c.weight c.displacement c.cons i.foreign), nocons
      note: 2.which#1.foreign omitted because of collinearity
      
            Source |       SS           df       MS      Number of obs   =       148
      -------------+----------------------------------   F(8, 140)       =    179.68
             Model |  3.1419e+09         8   392733056   Prob > F        =    0.0000
          Residual |   306005883       140  2185756.31   R-squared       =    0.9112
      -------------+----------------------------------   Adj R-squared   =    0.9062
             Total |  3.4479e+09       148  23296421.1   Root MSE        =    1478.4
      
      --------------------------------------------------------------------------------------
                    depvar |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      ---------------------+----------------------------------------------------------------
            which#c.weight |
                        1  |  -.0067745   .5027539    -0.01   0.989    -1.000746    .9871969
                        2  |   2.328626   .5027539     4.63   0.000     1.334654    3.322597
                           |
      which#c.displacement |
                        1  |   .0019286   4.339999     0.00   1.000    -8.578483     8.58234
                        2  |   10.25387   4.339999     2.36   0.020     1.673454    18.83428
                           |
              which#c.cons |
                        1  |   41.84795   1013.101     0.04   0.967    -1961.107    2044.803
                        2  |  -148.7127   865.5382    -0.17   0.864    -1859.928    1562.503
                           |
             which#foreign |
                1#Foreign  |  -1.600631   479.9575    -0.00   0.997    -950.5024    947.3011
               2#Domestic  |   -3899.63   479.9575    -8.12   0.000    -4848.532   -2950.729
                2#Foreign  |          0  (omitted)
      --------------------------------------------------------------------------------------
      
      
      
      . reghdfe depvar i.which#(c.weight c.displacement c.cons), nocons absorb(i.foreign)
      (MWFE estimator converged in 1 iterations)
      note: 2.which#c.cons omitted because of collinearity
      
      HDFE Linear regression                            Number of obs   =        148
      Absorbing 1 HDFE group                            F(   5,    141) =     123.23
                                                        Prob > F        =     0.0000
                                                        R-squared       =     0.8138
                                                        Adj R-squared   =     0.8059
                                                        Within R-sq.    =     0.8138
                                                        Root MSE        =  1637.7880
      
      --------------------------------------------------------------------------------------
                    depvar |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      ---------------------+----------------------------------------------------------------
            which#c.weight |
                        1  |    .245959   .5548105     0.44   0.658    -.8508635    1.342781
                        2  |   2.075892   .5548105     3.74   0.000     .9790696    3.172714
                           |
      which#c.displacement |
                        1  |    4.08701   4.742891     0.86   0.390    -5.289359    13.46338
                        2  |   6.168784   4.742891     1.30   0.196    -3.207586    15.54515
                           |
              which#c.cons |
                        1  |  -207.8225   1353.837    -0.15   0.878    -2884.265     2468.62
                        2  |          0  (omitted)
      --------------------------------------------------------------------------------------
      Also the estimates look quite different?



      • #18
        You just have to take care in how you specify the absorb vars in reghdfe (SSC). The intercepts are unnecessary here (specify -nocons- or do not interpret the reported intercept), but you need to interact the absorb vars with their respective group indicators. Don't forget to specify the option -robust- for the standard errors.

        Code:
        gen which1= 1.which
        gen which2= 2.which
        reghdfe depvar i.which#(c.weight c.disp), absorb(i.foreign#which1 i.foreign#which2) vce(robust)
        Res.:

        Code:
        . reghdfe depvar i.which#(c.weight c.disp), absorb(i.foreign#which1 i.foreign#which2) vce(robust)
        (MWFE estimator converged in 2 iterations)
        
        HDFE Linear regression                            Number of obs   =        148
        Absorbing 2 HDFE groups                           F(   4,    140) =      48.76
                                                          Prob > F        =     0.0000
                                                          R-squared       =     0.8494
                                                          Adj R-squared   =     0.8419
                                                          Within R-sq.    =     0.5170
                                                          Root MSE        =  1478.4304
        
        --------------------------------------------------------------------------------------
                             |               Robust
                      depvar |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        ---------------------+----------------------------------------------------------------
              which#c.weight |
                          1  |  -.0067745   .0008357    -8.11   0.000    -.0084267   -.0051222
                          2  |   2.328626   .6200661     3.76   0.000     1.102721     3.55453
                             |
        which#c.displacement |
                          1  |   .0019286   .0073599     0.26   0.794    -.0126223    .0164796
                          2  |   10.25387     5.3106     1.93   0.056    -.2454767    20.75321
                             |
                       _cons |  -1423.811   699.1867    -2.04   0.044     -2806.14   -41.48112
        --------------------------------------------------------------------------------------
        
        Absorbed degrees of freedom:
        ----------------------------------------------------------+
              Absorbed FE | Categories  - Redundant  = Num. Coefs |
        ------------------+---------------------------------------|
           foreign#which1 |         4           0           4     |
           foreign#which2 |         4           4           0     |
        ----------------------------------------------------------+
        
        .
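
        A possibly simpler way to write the same model, offered as a sketch rather than a tested recommendation: the table above reports all four categories of foreign#which2 as redundant, because which1 and which2 carry the same information as which itself. Absorbing the single interaction foreign#which should therefore span the same fixed effects and give the same slopes.

        ```stata
        * Assumes reghdfe is installed (ssc install reghdfe; it also needs ftools)
        sysuse auto, clear
        rename (mpg price) (depvar1 depvar2)
        reshape long depvar, i(make) j(which)

        * One absorbed interaction: a separate foreign effect within each stacked sample
        reghdfe depvar i.which#(c.weight c.displacement), absorb(i.foreign#i.which) vce(robust)
        ```

        If the slopes differ from the regress results in #17, the absorb() specification is the first thing to recheck.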
        Last edited by Andrew Musau; 01 Sep 2021, 11:56.



        • #19
          Hi Andrew, yes, sorry, I forgot about the interactions of the absorbed variables in the example. The thing is, I like having the constants: in my regressions the main slope is the "female" coefficient, so the constant gives me the value for "male" alone. That was fine as long as I could use reg, but this one specification seems to have too many controls for reg.



          • #20
            The constant is meaningless in a fixed effects model. It is collinear with the fixed effects and therefore its coefficient is not identified. So you should not look to interpret it. reghdfe partials it out. Below, in regress, you see that its coefficient depends on the base category of the firm indicators.

            Code:
            webuse grunfeld
            regress invest mvalue kstock i.company
            regress invest mvalue kstock ib2.company
            Res.:

            Code:
            . webuse grunfeld
            
            . regress invest mvalue kstock i.company
            
                  Source |       SS           df       MS      Number of obs   =       200
            -------------+----------------------------------   F(11, 188)      =    288.50
                   Model |   8836465.8        11  803315.073   Prob > F        =    0.0000
                Residual |  523478.114       188  2784.45805   R-squared       =    0.9441
            -------------+----------------------------------   Adj R-squared   =    0.9408
                   Total |  9359943.92       199  47034.8941   Root MSE        =    52.768
            
            ------------------------------------------------------------------------------
                  invest |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                  mvalue |   .1101238   .0118567     9.29   0.000     .0867345    .1335131
                  kstock |   .3100653   .0173545    17.87   0.000     .2758308    .3442999
                         |
                 company |
                      2  |   172.2025   31.16126     5.53   0.000     110.7319    233.6732
                      3  |  -165.2751   31.77556    -5.20   0.000    -227.9576   -102.5927
                      4  |    42.4874   43.90987     0.97   0.334    -44.13197    129.1068
                      5  |  -44.32013   50.49225    -0.88   0.381    -143.9243    55.28406
                      6  |   47.13539   46.81068     1.01   0.315    -45.20629    139.4771
                      7  |   3.743212   50.56493     0.07   0.941    -96.00433    103.4908
                      8  |   12.75103   44.05263     0.29   0.773    -74.14994      99.652
                      9  |  -16.92558   48.45326    -0.35   0.727    -112.5075    78.65636
                     10  |   63.72884   50.33023     1.27   0.207    -35.55572    163.0134
                         |
                   _cons |  -70.29669   49.70796    -1.41   0.159    -168.3537    27.76035
            ------------------------------------------------------------------------------
            
            . regress invest mvalue kstock ib2.company
            
                  Source |       SS           df       MS      Number of obs   =       200
            -------------+----------------------------------   F(11, 188)      =    288.50
                   Model |   8836465.8        11  803315.073   Prob > F        =    0.0000
                Residual |  523478.114       188  2784.45805   R-squared       =    0.9441
            -------------+----------------------------------   Adj R-squared   =    0.9408
                   Total |  9359943.92       199  47034.8941   Root MSE        =    52.768
            
            ------------------------------------------------------------------------------
                  invest |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                  mvalue |   .1101238   .0118567     9.29   0.000     .0867345    .1335131
                  kstock |   .3100653   .0173545    17.87   0.000     .2758308    .3442999
                         |
                 company |
                      1  |  -172.2025   31.16126    -5.53   0.000    -233.6732   -110.7319
                      3  |  -337.4777   16.80518   -20.08   0.000    -370.6286   -304.3267
                      4  |  -129.7151   21.97637    -5.90   0.000    -173.0671   -86.36315
                      5  |  -216.5226   27.69626    -7.82   0.000     -271.158   -161.8873
                      6  |  -125.0671   24.12802    -5.18   0.000    -172.6636   -77.47068
                      7  |  -168.4593   27.40332    -6.15   0.000    -222.5168   -114.4018
                      8  |  -159.4515    22.0766    -7.22   0.000    -203.0012   -115.9018
                      9  |  -189.1281   25.62201    -7.38   0.000    -239.6717   -138.5845
                     10  |  -108.4737   26.95322    -4.02   0.000    -161.6433   -55.30405
                         |
                   _cons |   101.9058   24.93832     4.09   0.000     52.71093    151.1007
            ------------------------------------------------------------------------------
            
            .
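
            To see how the two intercepts are related, one can recover the second from the first fit. This is a minimal sketch using only the commands above plus lincom: with company 1 as the base, _cons plus the coefficient on 2.company is the intercept for company 2, which is what _cons becomes once company 2 is the base (-70.29669 + 172.2025 = 101.9058 in the output above).

            ```stata
            webuse grunfeld, clear

            * Base category is company 1, so _cons is the intercept for company 1
            regress invest mvalue kstock i.company

            * Intercept for company 2 = _cons + 2.company;
            * this is the _cons reported when the base is switched with ib2.company
            lincom _cons + 2.company
            ```

            The slopes on mvalue and kstock are unaffected by the choice of base; only the reported constant moves.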



            • #21
              Predict the DV after suest, get the correlation of yfit and y, square it, and call it a pseudo-R2?
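
              A rough sketch of that idea, under assumptions not stated in this thread: the two equations are fitted by regress on the auto data, and the stored names m1 and m2 are placeholders. After suest, the equation holding the coefficients of a regress model is assumed to be named m1_mean; check the suest output header if your version labels it differently.

              ```stata
              sysuse auto, clear

              * Fit and store each equation separately (names m1, m2 are illustrative)
              regress mpg weight displacement
              estimates store m1
              regress price weight displacement
              estimates store m2

              * Combine the stored results
              suest m1 m2

              * Linear prediction for the first equation, then squared correlation with y
              predict double yfit1, equation(m1_mean) xb
              correlate yfit1 mpg
              display "pseudo-R2, equation 1 = " r(rho)^2
              ```

              For an OLS equation with an intercept, this squared correlation should coincide with the usual R-squared, so it is mainly useful when the component models do not report one.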



              • #22
                Thanks George!

                And thanks Andrew for the clarification, that makes sense. Although I guess I am still a bit confused by the fact that even in the simplest model without fixed effects reghdfe still seems to exclude a constant while reg doesn't. So excluding the foreign fixed effects in my example in #17, I now get

                Code:
                . regress depvar i.which#(c.weight c.displacement c.cons), nocons
                
                      Source |       SS           df       MS      Number of obs   =       148
                -------------+----------------------------------   F(6, 142)       =    157.55
                       Model |  2.9976e+09         6   499595356   Prob > F        =    0.0000
                    Residual |   450298194       142  3171114.04   R-squared       =    0.8694
                -------------+----------------------------------   Adj R-squared   =    0.8639
                       Total |  3.4479e+09       148  23296421.1   Root MSE        =    1780.8
                
                --------------------------------------------------------------------------------------
                              depvar |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                ---------------------+----------------------------------------------------------------
                      which#c.weight |
                                  1  |  -.0065671   .6009143    -0.01   0.991    -1.194461    1.181327
                                  2  |   1.823366   .6009143     3.03   0.003     .6354719     3.01126
                                     |
                which#c.displacement |
                                  1  |   .0052808   5.085375     0.00   0.999    -10.04755    10.05811
                                  2  |   2.087054   5.085375     0.41   0.682    -7.965772    12.13988
                                     |
                        which#c.cons |
                                  1  |   40.08452   1040.877     0.04   0.969    -2017.533    2097.702
                                  2  |    247.907   1040.877     0.24   0.812    -1809.711    2305.525
                --------------------------------------------------------------------------------------
                
                . reghdfe depvar i.which#(c.weight c.displacement c.cons), nocons noabsorb
                (MWFE estimator converged in 1 iterations)
                note: 2.which#c.cons omitted because of collinearity
                
                HDFE Linear regression                            Number of obs   =        148
                Absorbing 1 HDFE group                            F(   5,    142) =      99.74
                                                                  Prob > F        =     0.0000
                                                                  R-squared       =     0.7784
                                                                  Adj R-squared   =     0.7706
                                                                  Within R-sq.    =     0.7784
                                                                  Root MSE        =  1780.7622
                
                --------------------------------------------------------------------------------------
                              depvar |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                ---------------------+----------------------------------------------------------------
                      which#c.weight |
                                  1  |  -.0065671   .6009143    -0.01   0.991    -1.194461    1.181327
                                  2  |   1.823366   .6009143     3.03   0.003     .6354719     3.01126
                                     |
                which#c.displacement |
                                  1  |   .0052808   5.085375     0.00   0.999    -10.04755    10.05811
                                  2  |   2.087054   5.085375     0.41   0.682    -7.965772    12.13988
                                     |
                        which#c.cons |
                                  1  |  -207.8225   1472.023    -0.14   0.888    -3117.733    2702.088
                                  2  |          0  (omitted)
                --------------------------------------------------------------------------------------
                And again, I understand why I should not interpret the constant in my models, but I am afraid I might be missing something about how reghdfe works.

                I should add I am working with version 5.7.3 (nov2019) of reghdfe.
                Last edited by Maria Ventura; 02 Sep 2021, 01:39.



                • #23
                  As reghdfe was written mainly for efficient estimation of linear regression with multiple levels of fixed effects, the author did not anticipate that it would be used for standard linear regression. Therefore, it has no true -nocons- option. Its -nocons- option is more like -hidecons-.
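
                  The output in #22 is consistent with this. Because reghdfe still partials out an overall constant, one of the two group constants is collinear with it and is dropped, and the remaining one is measured relative to the dropped one: 40.08452 - 247.907 = -207.8225, the value reghdfe reports for 1.which#c.cons. A minimal sketch that checks this from the regress side, reusing the setup from #17:

                  ```stata
                  sysuse auto, clear
                  rename (mpg price) (depvar1 depvar2)
                  reshape long depvar, i(make) j(which)
                  gen cons = 1

                  * regress with a true -nocons-: both group intercepts are reported
                  regress depvar i.which#(c.weight c.displacement c.cons), nocons

                  * Difference between the two group intercepts;
                  * compare with the 1.which#c.cons row in the reghdfe output of #22
                  lincom 1.which#c.cons - 2.which#c.cons
                  ```

                  The slopes are identical across the two commands, so only the parameterization of the intercepts differs.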



                  • #24
                    I see, I did not know that, it all makes sense now. Thanks!



                    • #25
                      Hey Andrew Musau, sorry to come back to this, but I figured out a few more things while working on it.

                      First, it seems that the constant in reghdfe is chosen such that the sum of the fixed effects is zero (apparently it is the same in xtreg), which would mean that it can actually be interpreted?

                      Secondly, I was wondering if you had references for the equivalence between the stacked samples estimation of a few posts ago and SUR models. I have seen this https://journals.sagepub.com/doi/pdf...867X0400400407 but it seems to require a few other steps (I am actually not sure why the author is using a whole different command rather than doing what you suggested above?).

                      Thanks!
                      Last edited by Maria Ventura; 15 Sep 2021, 07:39.



                      • #26
                        Have a look at the following FAQ by Bill Gould. The intercept in a fixed effects model just reflects the kind of constraint you place on the model. There is no way to separately identify the intercept and the fixed effects, and therefore as I said, it is meaningless.

                        https://www.stata.com/support/faqs/s...effects-model/

                        For references on stacked regressions, have a look at the references within the Stata manual entry of suest as the procedure is exactly what suest does.
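
                        A small sketch of the constraint in question, using the Grunfeld data from #20 (the variable name u_i is illustrative). With xtreg, fe the reported constant corresponds to normalizing the estimated fixed effects to average zero, so the predicted effects should have mean zero up to rounding; a different normalization would shift the constant and the effects by offsetting amounts and leave the slopes unchanged.

                        ```stata
                        webuse grunfeld, clear
                        xtset company year

                        * Within estimator; _cons is reported under a sum-to-zero normalization
                        xtreg invest mvalue kstock, fe

                        * Estimated company effects under that normalization
                        predict double u_i, u
                        summarize u_i

                        * Same slopes, different normalization: company 1 is the base here
                        regress invest mvalue kstock i.company
                        ```

                        Averaging the ten company intercepts implied by the regress output in #20 (taking company 1 as -70.29669) gives about -58.74, which is the kind of quantity the xtreg constant reports.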



                        • #27
                          The reference backing #26 is on page 15 of the suest manual entry:


                          The stacking method can be applied not only to the testing of cross-model hypotheses for logit models but also to any estimation command that supports the vce(cluster clustvar) option. The stacking approach clearly generalizes to stacking more than two logit or other models, testing more general linear hypotheses, and testing nonlinear cross-model hypotheses (see [R] testnl). In all of these cases, suest would yield identical statistical results but at smaller costs in terms of data management, computer storage, and computer time.
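
                          A hedged side-by-side sketch of the two routes on the auto data. The names id, m1 and m2 are illustrative, and the equation name m1_mean is an assumption about how suest labels regress results. The point estimates should agree exactly; the test statistics may differ slightly because of finite-sample adjustments and the chi-squared versus F reference distributions.

                          ```stata
                          sysuse auto, clear
                          gen id = _n

                          * Route 1: separate fits combined with suest
                          regress mpg weight displacement
                          estimates store m1
                          regress price weight displacement
                          estimates store m2
                          suest m1 m2
                          test [m1_mean]weight = [m2_mean]weight

                          * Route 2: stack the two outcomes and cluster on the original observation
                          rename (mpg price) (depvar1 depvar2)
                          reshape long depvar, i(id) j(which)
                          gen cons = 1
                          regress depvar i.which#(c.weight c.displacement c.cons), nocons vce(cluster id)
                          test 1.which#c.weight = 2.which#c.weight
                          ```

                          Clustering on id is what allows the errors of the two equations to be correlated within a car, which is the role suest plays in the first route.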



                          • #28
                            Thanks Andrew, that's incredibly useful!

