  • Number of Estimated Coefficients in a Regression

    Is there a way to display the number of coefficients estimated in a regression? I am estimating different models by adding various covariates and would like to store the number of regressors from each regression in a local. I thought I could use the number of columns of e(b), but I don't know how to obtain it.

  • #2
    Code:
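    sysuse auto, clear
    cls
    * headroom spelled out in full (the abbreviation "head" also works)
    qui reg price weight headroom
    * list the coefficient vector
    mat l e(b)
    * number of columns of e(b), constant included
    loc cols = colsof(e(b))
    di `cols'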
    Or, if you want to know the number of predictors aside from the constant:
    Code:
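    sysuse auto, clear
    cls
    * headroom spelled out in full (the abbreviation "head" also works)
    qui reg price weight headroom
    * list the coefficient vector
    mat l e(b)
    * subtract 1 to leave out the constant
    loc cols = colsof(e(b))-1
    di `cols'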


    • #3
      A perhaps simpler way to get the number of predictors other than the constant is
      Code:
      regress whatever
      display e(df_m)
      If you want to include the constant term, just add 1.
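
      For the original goal of storing the count from each model in a local, the idea can be sketched as a loop. This is only an illustration: the three specifications and the local names k1, k2, k3 are made up for the example and are not from the question.
      Code:
      sysuse auto, clear
      * hypothetical nested specifications, one per pass of the loop
      local i = 0
      foreach spec in "weight" "weight headroom" "weight headroom i.foreign" {
          local ++i
          quietly regress price `spec'
          * e(df_m) is the model degrees of freedom, constant excluded
          local k`i' = e(df_m)
          display "model `i': `k`i'' predictors (excluding the constant)"
      }
      After the loop, `k1', `k2' and `k3' hold the counts and can be used later in the do-file.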

      Added: An advantage of this approach is that it works correctly after, as far as I know, all Stata estimation commands. By contrast, with multi-level models, the approach based on the columns of the e(b) matrix will also count the parameters of the random-effects part (the variance components), which is probably not what you want in most situations.
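
      A small sketch of that difference, assuming the pig example dataset from the Stata manuals is available through webuse (the dataset and model are chosen only for illustration):
      Code:
      webuse pig, clear
      * random-intercept model: one covariate plus the constant in the fixed part
      quietly mixed weight week || id:
      * list e(b) to see which columns it contains
      matrix list e(b)
      * column count of e(b) versus the model degrees of freedom
      display colsof(e(b))
      display e(df_m)
      The listing of e(b) should show columns for the variance components next to week and _cons, so the column count is larger than the number of fixed-effects coefficients.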
      Last edited by Clyde Schechter; 03 Jun 2022, 15:28.


      • #4
        Oh yeah, I forgot about e(df_m). Yes, this way is much better than mine.


        • #5
          If you use e(df_m) or any other degrees-of-freedom-based statistic, you need to be careful when the standard errors are clustered, because clustering changes the reported degrees of freedom. After xtreg with the -fe- option and no clustering, e(df_b) gives the number of displayed slope coefficients, that is, the count excluding the absorbed fixed effects and the constant. In the example below, e(df_m) is 30 and e(df_b) is 21 without clustering, but with cluster(company) and only 10 clusters they drop to 8 and 9. Some macros store the names of the regressors, so one could count the number of words, but this is complicated by factor-variable expansion when factor-variable notation is used. The degrees of freedom can always be adjusted for clustering if you know what you are doing.

          Code:
          webuse grunfeld, clear
          xtreg invest mvalue kstock i.time, fe
          di e(df_m)
          di e(df_b)
          xtreg invest mvalue kstock i.time, fe cluster(company)
          di e(df_m)
          di e(df_b)
          Results:

          Code:
          . xtreg invest mvalue kstock i.time, fe
          
          Fixed-effects (within) regression               Number of obs     =        200
          Group variable: company                         Number of groups  =         10
          
          R-sq:                                           Obs per group:
               within  = 0.7985                                         min =         20
               between = 0.8143                                         avg =       20.0
               overall = 0.8068                                         max =         20
          
                                                          F(21,169)         =      31.90
          corr(u_i, Xb)  = -0.3250                        Prob > F          =     0.0000
          
          ------------------------------------------------------------------------------
                invest |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                mvalue |   .1177158   .0137513     8.56   0.000     .0905694    .1448623
                kstock |   .3579163    .022719    15.75   0.000     .3130667    .4027659
                       |
                  time |
                    2  |  -19.19741   23.67586    -0.81   0.419    -65.93593    27.54112
                    3  |  -40.69001   24.69541    -1.65   0.101    -89.44122    8.061213
                    4  |   -39.2264   23.23594    -1.69   0.093    -85.09647    6.643667
                    5  |  -69.47029   23.65607    -2.94   0.004    -116.1698   -22.77083
                    6  |  -44.23507   23.80979    -1.86   0.065      -91.238     2.76785
                    7  |  -18.80446     23.694    -0.79   0.429     -65.5788    27.96987
                    8  |  -21.13979   23.38163    -0.90   0.367    -67.29748    25.01789
                    9  |  -42.97762   23.55287    -1.82   0.070    -89.47334    3.518104
                   10  |  -43.09876    23.6102    -1.83   0.070    -89.70766    3.510134
                   11  |  -55.68303   23.89561    -2.33   0.021    -102.8554   -8.510689
                   12  |  -31.16928   24.11598    -1.29   0.198    -78.77665    16.43809
                   13  |  -39.39223   23.78368    -1.66   0.100    -86.34361    7.559141
                   14  |  -43.71651   23.96965    -1.82   0.070    -91.03501    3.601991
                   15  |   -73.4951   24.18292    -3.04   0.003    -121.2346   -25.75559
                   16  |  -75.89611   24.34553    -3.12   0.002    -123.9566    -27.8356
                   17  |   -62.4809   24.86425    -2.51   0.013    -111.5654   -13.39637
                   18  |  -64.63233    25.3495    -2.55   0.012    -114.6748   -14.58987
                   19  |  -67.71796   26.61108    -2.54   0.012    -120.2509   -15.18501
                   20  |  -93.52622   27.10786    -3.45   0.001    -147.0399   -40.01257
                       |
                 _cons |  -32.83631   18.87533    -1.74   0.084     -70.0981    4.425483
          -------------+----------------------------------------------------------------
               sigma_u |  91.798268
               sigma_e |  51.724523
                   rho |  .75902159   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          F test that all u_i=0: F(9, 169) = 52.36                     Prob > F = 0.0000
          
          . 
          . di e(df_m)
          30
          
          . 
          . di e(df_b)
          21
          
          . 
          . xtreg invest mvalue kstock i.time, fe cluster(company)
          
          Fixed-effects (within) regression               Number of obs     =        200
          Group variable: company                         Number of groups  =         10
          
          R-sq:                                           Obs per group:
               within  = 0.7985                                         min =         20
               between = 0.8143                                         avg =       20.0
               overall = 0.8068                                         max =         20
          
                                                          F(9,9)            =          .
          corr(u_i, Xb)  = -0.3250                        Prob > F          =          .
          
                                         (Std. Err. adjusted for 10 clusters in company)
          ------------------------------------------------------------------------------
                       |               Robust
                invest |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                mvalue |   .1177158   .0108244    10.88   0.000     .0932293    .1422024
                kstock |   .3579163   .0478484     7.48   0.000     .2496757    .4661569
                       |
                  time |
                    2  |  -19.19741   20.69857    -0.93   0.378    -66.02082    27.62601
                    3  |  -40.69001   33.28318    -1.22   0.253    -115.9818    34.60179
                    4  |   -39.2264   15.73649    -2.49   0.034    -74.82481   -3.627994
                    5  |  -69.47029   26.99875    -2.57   0.030    -130.5457   -8.394865
                    6  |  -44.23507    17.3723    -2.55   0.031    -83.53394    -4.93621
                    7  |  -18.80446   17.84747    -1.05   0.320    -59.17824    21.56931
                    8  |  -21.13979   14.16477    -1.49   0.170    -53.18273    10.90314
                    9  |  -42.97762   12.54411    -3.43   0.008    -71.35436   -14.60088
                   10  |  -43.09876   10.99586    -3.92   0.004    -67.97313    -18.2244
                   11  |  -55.68303    15.2019    -3.66   0.005    -90.07212   -21.29394
                   12  |  -31.16928   20.91692    -1.49   0.170    -78.48663    16.14807
                   13  |  -39.39223   26.43707    -1.49   0.170    -99.19704    20.41257
                   14  |  -43.71651   38.87861    -1.12   0.290     -131.666    44.23301
                   15  |   -73.4951    38.2545    -1.92   0.087    -160.0328    13.04259
                   16  |  -75.89611   36.79846    -2.06   0.069      -159.14    7.347783
                   17  |   -62.4809   49.41812    -1.26   0.238    -174.2725    49.31066
                   18  |  -64.63233   51.56208    -1.25   0.242    -181.2739     52.0092
                   19  |  -67.71796   43.74465    -1.55   0.156    -166.6752    31.23932
                   20  |  -93.52622   31.72632    -2.95   0.016    -165.2961   -21.75629
                       |
                 _cons |  -32.83631   19.78259    -1.66   0.131    -77.58765    11.91503
          -------------+----------------------------------------------------------------
               sigma_u |  91.798268
               sigma_e |  51.724523
                   rho |  .75902159   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          
          . 
          . di e(df_m)
          8
          
          . 
          . di e(df_b)
          9


          • #6
            Good point. I didn't think about clustered vce. So it seems there is no simple solution that works for all estimation commands.


            • #7
              Andrew, that is a great point. I was wondering why I got so few coefficients when I used e(df_m) or e(rank) with clustered standard errors. The best solution, if you have clustered standard errors, is to use colsof(e(b)) as Jared suggested. It worked great for me.
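
              One caveat when factor-variable notation is used: e(b) also has columns for base levels (such as 1b.time) and for omitted terms, so colsof(e(b)) alone can overcount. A minimal sketch of one way to adjust for this, assuming the programmer's command _ms_omit_info is available in your Stata version; the matrix name b and the locals k_all and k_est are only for illustration.
              Code:
              webuse grunfeld, clear
              quietly xtreg invest mvalue kstock i.time, fe vce(cluster company)
              * copy the coefficient vector so it can be passed by name
              matrix b = e(b)
              * r(k_omit) is the number of base-level/omitted columns
              _ms_omit_info b
              local k_all = colsof(b)
              local k_est = colsof(b) - r(k_omit)
              display "columns in e(b):          `k_all'"
              display "non-omitted coefficients: `k_est'"
              Both counts include the constant; subtract 1 if you only want the predictors.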
