ivreghdfe cannot show first stage outcome

Yuan Tang

Join Date: Jun 2019

Posts: 6
#1

02 Jul 2019, 19:02

Dear Statalist,

I am using ivreghdfe and want to see the first-stage regression. However, Stata reports that it is unable to store the first-stage regression. My code is as follows. Could someone help? Thanks a lot.

ivreghdfe net_revenue_log if_comment_lag overall_rating_lag (yelp_rating_if_comment_lag = other_bus_rating_if_comment_lag), first absorb(prov_id surveyyear) cl(prov_id)
(MWFE estimator converged in 2 iterations)

Unable to store first-stage regression of yelp_rating_if_comment_lag.

First-stage regressions
-----------------------

Unable to display all first-stage regressions.
There may be insufficient room to store results using -estimates store-,
or names of endogenous regressors may be too long to store the results.
Try dropping one or more estimation results using -estimates drop-,
using the -savefprefix- option, or using shorter variable names.
Tags: fixed effects, ivreg2
Andrew Musau

Join Date: Oct 2014

Posts: 10191
#2

03 Jul 2019, 08:08

ivreghdfe is from SSC, as you are asked to explain in the FAQs. I think that the error message issued by the command clearly illustrates your problem.

Unable to display all first-stage regressions.
There may be insufficient room to store results using -estimates store-,
or names of endogenous regressors may be too long to store the results.
Try dropping one or more estimation results using -estimates drop-,
using the -savefprefix- option, or using shorter variable names.

Try the following:

Code:

rename (net_revenue_log_if_comment_lag overall_rating_lag yelp_rating_if_comment_lag other_bus_rating_if_comment_lag) (n_revenue ov_rating y_rating o_rating)
ivreghdfe n_revenue ov_rating (y_rating = o_rating), first absorb(prov_id surveyyear) cl(prov_id)
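
If you would rather keep most of the original names, a minimal sketch along the following lines may also work. It assumes, as the error message suggests, that the long name of the endogenous regressor is what prevents the first-stage results from being stored; the new name y_rating is only an illustration.

```stata
* length of the endogenous regressor's name (26 characters here)
display strlen("yelp_rating_if_comment_lag")

* shorten only the endogenous regressor, then rerun the command from #1
rename yelp_rating_if_comment_lag y_rating
ivreghdfe net_revenue_log if_comment_lag overall_rating_lag (y_rating = other_bus_rating_if_comment_lag), first absorb(prov_id surveyyear) cl(prov_id)
```

If the message persists, the other remedies it lists (dropping stored results with -estimates drop-, or a shorter prefix via -savefprefix-) would be the next things to try.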

Last edited by Andrew Musau; 03 Jul 2019, 08:13.
Adnan Habib

Join Date: Feb 2018

Posts: 10
#3

25 Apr 2020, 22:19

Originally posted by Andrew Musau

ivreghdfe is from SSC, as you are asked to explain in the FAQs. I think that the error message issued by the command clearly illustrates your problem.

Try the following:

Code:

rename (net_revenue_log_if_comment_lag overall_rating_lag yelp_rating_if_comment_lag other_bus_rating_if_comment_lag) (n_revenue ov_rating y_rating o_rating)
ivreghdfe n_revenue ov_rating (y_rating = o_rating), first absorb(prov_id surveyyear) cl(prov_id)

Hi Andrew,

I have a similar issue. When I use the code you provided, I get the first-stage results, but an invalid syntax error still appears after the Hansen J statistic is displayed. Does it have something to do with the other post-estimation test results?

I have another question that I hope you can answer. I want to create an RTF file that reports both the first-stage and second-stage results, along with the post-estimation test results for the first stage, in a single table using esttab. Do you know how to do that?

Andrew Musau

Join Date: Oct 2014
Posts: 10191
#4

26 Apr 2020, 05:21

I have a similar issue. When I use the code you provided, I get the first-stage results, but an invalid syntax error still appears after the Hansen J statistic is displayed. Does it have something to do with the other post-estimation test results?

I cannot advise on this without a reproducible example as I have no idea what is going on.

I have another question that I hope you can answer. I want to create an RTF file that reports both the first-stage and second-stage results, along with the post-estimation test results for the first stage, in a single table using esttab. Do you know how to do that?

esttab is from Stata Journal / SSC, as you are asked to explain in FAQ Advice #12. Here is a reproducible example. Say we want to report the Cragg-Donald Wald F statistic and the Stock-Wright LM S statistic from the first stage.

Code:

. use http://fmwww.bc.edu/ec-p/data/wooldridge/mroz.dta, clear

. ivreg2 lwage exper expersq (educ=age kidslt6 kidsge6), first

First-stage regressions
-----------------------


First-stage regression of educ:

Statistics consistent for homoskedasticity only
Number of obs =                    428
------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   -.012015   .0178791    -0.67   0.502    -.0471582    .0231281
     kidslt6 |   .7254425   .2979333     2.43   0.015     .1398244    1.311061
     kidsge6 |  -.2219447    .093675    -2.37   0.018    -.4060725   -.0378169
       exper |   .0501414   .0452026     1.11   0.268    -.0387088    .1389916
     expersq |  -.0017657   .0013568    -1.30   0.194    -.0044326    .0009012
       _cons |   13.12194   .8598407    15.26   0.000     11.43184    14.81204
------------------------------------------------------------------------------
F test of excluded instruments:
  F(  3,   422) =     4.34
  Prob > F      =   0.0050
Sanderson-Windmeijer multivariate F test of excluded instruments:
  F(  3,   422) =     4.34
  Prob > F      =   0.0050



Summary results for first-stage regressions
-------------------------------------------

                                           (Underid)            (Weak id)
Variable     | F(  3,   422)  P-val | SW Chi-sq(  3) P-val | SW F(  3,   422)
educ         |       4.34    0.0050 |       13.21   0.0042 |        4.34

Stock-Yogo weak ID F test critical values for single endogenous regressor:
                                    5% maximal IV relative bias    13.91
                                   10% maximal IV relative bias     9.08
                                   20% maximal IV relative bias     6.46
                                   30% maximal IV relative bias     5.39
                                   10% maximal IV size             22.30
                                   15% maximal IV size             12.83
                                   20% maximal IV size              9.54
                                   25% maximal IV size              7.80
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Sanderson-Windmeijer F statistic.

Underidentification test
Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
Ha: matrix has rank=K1 (identified)
Anderson canon. corr. LM statistic       Chi-sq(3)=12.82    P-val=0.0051

Weak identification test
Ho: equation is weakly identified
Cragg-Donald Wald F statistic                                       4.34

Stock-Yogo weak ID test critical values for K1=1 and L1=3:
                                    5% maximal IV relative bias    13.91
                                   10% maximal IV relative bias     9.08
                                   20% maximal IV relative bias     6.46
                                   30% maximal IV relative bias     5.39
                                   10% maximal IV size             22.30
                                   15% maximal IV size             12.83
                                   20% maximal IV size              9.54
                                   25% maximal IV size              7.80
Source: Stock-Yogo (2005).  Reproduced by permission.

Weak-instrument-robust inference
Tests of joint significance of endogenous regressors B1 in main equation
Ho: B1=0 and orthogonality conditions are valid
Anderson-Rubin Wald test           F(3,422)=       0.61     P-val=0.6076
Anderson-Rubin Wald test           Chi-sq(3)=      1.86     P-val=0.6016
Stock-Wright LM S statistic        Chi-sq(3)=      1.85     P-val=0.6033

Number of observations               N  =        428
Number of regressors                 K  =          4
Number of endogenous regressors      K1 =          1
Number of instruments                L  =          6
Number of excluded instruments       L1 =          3

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only

                                                      Number of obs =      428
                                                      F(  3,   424) =     7.49
                                                      Prob > F      =   0.0001
Total (centered) SS     =  223.3274513                Centered R2   =   0.1556
Total (uncentered) SS   =   829.594813                Uncentered R2 =   0.7727
Residual SS             =  188.5780571                Root MSE      =    .6638

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0964002   .0814278     1.18   0.236    -.0631952    .2559957
       exper |    .042193   .0138831     3.04   0.002     .0149827    .0694033
     expersq |  -.0008323   .0004204    -1.98   0.048    -.0016563   -8.33e-06
       _cons |  -.3848718   1.011551    -0.38   0.704    -2.367476    1.597732
------------------------------------------------------------------------------
Underidentification test (Anderson canon. corr. LM statistic):          12.816
                                                   Chi-sq(3) P-val =    0.0051
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):                4.342
Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    13.91
                                         10% maximal IV relative bias     9.08
                                         20% maximal IV relative bias     6.46
                                         30% maximal IV relative bias     5.39
                                         10% maximal IV size             22.30
                                         15% maximal IV size             12.83
                                         20% maximal IV size              9.54
                                         25% maximal IV size              7.80
Source: Stock-Yogo (2005).  Reproduced by permission.
------------------------------------------------------------------------------
Sargan statistic (overidentification test of all instruments):           0.702
                                                   Chi-sq(2) P-val =    0.7042
------------------------------------------------------------------------------
Instrumented:         educ
Included instruments: exper expersq
Excluded instruments: age kidslt6 kidsge6
------------------------------------------------------------------------------

These are available from ereturn list, so we just need to know the names of their corresponding scalars: e(cdf) for the Cragg-Donald Wald F statistic and e(sstat) for the Stock-Wright LM S statistic.

Code:

. ereturn list

scalars:
                e(yyc) =  223.3274512515483
                 e(yy) =  829.594812951225
               e(Fdf2) =  424
               e(Fdf1) =  3
                 e(Fp) =  .0000674027518006
                  e(F) =  7.493900655892364
               e(r2_a) =  .149623860675329
                e(mss) =  34.74939415689639
                e(rss) =  188.5780570946519
               e(rmse) =  .6637792834497301
                 e(r2) =  .155598400295877
             e(center) =  0
           e(dofminus) =  0
          e(sdofminus) =  0
               e(df_m) =  3
         e(partial_ct) =  0
           e(endog_ct) =  1
          e(exexog_ct) =  3
          e(inexog_ct) =  2
                  e(N) =  428
               e(cons) =  1
        e(partialcons) =  0
           e(nocollin) =  0
            e(sstatdf) =  3
             e(sstatp) =  .6033182331591862
              e(sstat) =  1.853705929467335
             e(ardf_r) =  422
               e(ardf) =  3
            e(archi2p) =  .6015863343873346
             e(archi2) =  1.861769417806341
               e(arfp) =  .6075953079368361
                e(arf) =  .6118899488428938
                e(cdf) =  4.342070862428496
            e(widstat) =  4.342070862428496
                 e(cd) =  .0308678023395391
                e(idp) =  .0050523097869339
               e(iddf) =  3
             e(idstat) =  12.81582310684222
            e(sarganp) =  .7041553948545243
           e(sargandf) =  2
             e(sargan) =  .7015124317084412
                 e(jp) =  .7041553948545243
                e(jdf) =  2
                  e(j) =  .7015124317084412
                 e(ll) =  -431.908900030797
              e(rankV) =  4
              e(rankS) =  6
             e(rankxx) =  4
             e(rankzz) =  6
             e(condxx) =  4893604.834519811
             e(condzz) =  8161817.434887111
                e(r2c) =  .155598400295877
                e(r2u) =  .7726865523377626
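
To check just these two scalars rather than scanning the full list, something like the following should work straight after the ivreg2 call above. This is a minimal sketch that relies only on the scalar names shown in the listing.

```stata
* run immediately after the ivreg2 command, while its results are still in e()
* Cragg-Donald Wald F statistic
display e(cdf)
* Stock-Wright LM S statistic
display e(sstat)
```

The displayed values should correspond to the e(cdf) and e(sstat) entries in the listing.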

Once we have these, we create new scalars to hold the values and then include them in the esttab command. Putting it all together:

Code:

use http://fmwww.bc.edu/ec-p/data/wooldridge/mroz.dta, clear
eststo clear
eststo: ivreg2 lwage exper expersq (educ=age kidslt6 kidsge6), first savefirst savefprefix(s1)
estadd scalar cdf1 =  `e(cdf)': s1educ
estadd scalar sstat1 = `e(sstat)': s1educ
esttab s1educ est1, stats(cdf1 sstat1, labels("CD Wald F" "SW S stat."))

To output as RTF:

Code:

esttab s1educ est1 using myfile.rtf, stats(cdf1 sstat1, labels("CD Wald F" "SW S stat."))

Res.:

Code:

. esttab s1educ est1, stats(cdf1 sstat1, labels("CD Wald F" "SW S stat."))

--------------------------------------------
                      (1)             (2)   
                     educ           lwage   
--------------------------------------------
age               -0.0120                   
                  (-0.67)                   

kidslt6             0.725*                  
                   (2.43)                   

kidsge6            -0.222*                  
                  (-2.37)                   

exper              0.0501          0.0422** 
                   (1.11)          (3.04)   

expersq          -0.00177       -0.000832*  
                  (-1.30)         (-1.98)   

educ                               0.0964   
                                   (1.18)   

_cons               13.12***       -0.385   
                  (15.26)         (-0.38)   
--------------------------------------------
CD Wald F           4.342                   
SW S stat.          1.854                   
--------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001


Matthew Williams

Join Date: Feb 2021

Posts: 195
#5

25 Mar 2021, 19:25

Dear Andrew Musau,

Thank you for your code in #4. I have learned a lot from it. However, do you know how to report the adjusted R-squared of the first stage? It seems that e(r2_a) in the ereturn list output in #4 is the adjusted R-squared of the second stage, not the first.

Thank you.

Andrew Musau

Join Date: Oct 2014
Posts: 10191
#6

26 Mar 2021, 01:47

It does not appear that the statistic is stored. You just have to quietly run the first-stage regression and save it.

Code:

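use http://fmwww.bc.edu/ec-p/data/wooldridge/mroz.dta, clear
eststo clear
* second stage is stored as m1; the first stage is saved as s1educ
eststo m1: ivreg2 lwage exper expersq (educ=age kidslt6 kidsge6), first savefirst savefprefix(s1)
estadd scalar cdf1 =  `e(cdf)': s1educ
estadd scalar sstat1 = `e(sstat)': s1educ
* rerun the first stage by OLS on the estimation sample to get its adjusted R-squared
qui reg `e(instd)' `e(insts)' if e(sample)
local r2a= e(r2_a)
* restore the IV results, then attach the adjusted R-squared to the first-stage estimates
est restore m1
estadd scalar r2a =  `r2a': s1educ
esttab s1educ m1, stats(cdf1 sstat1 r2a, labels("CD Wald F" "SW S stat." "Adj. R2"))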

Res.:

Code:

. esttab s1educ m1, stats(cdf1 sstat1 r2a, labels("CD Wald F" "SW S stat." "Adj. R2"))

--------------------------------------------
                      (1)             (2)  
                     educ           lwage  
--------------------------------------------
age               -0.0120                  
                  (-0.67)                  

kidslt6             0.725*                  
                   (2.43)                  

kidsge6            -0.222*                  
                  (-2.37)                  

exper              0.0501          0.0422**
                   (1.11)          (3.04)  

expersq          -0.00177       -0.000832*  
                  (-1.30)         (-1.98)  

educ                               0.0964  
                                   (1.18)  

_cons               13.12***       -0.385  
                  (15.26)         (-0.38)  
--------------------------------------------
CD Wald F           4.342                  
SW S stat.          1.854                  
Adj. R2            0.0233                  
--------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

where

Code:

. reg `e(instd)' `e(insts)' if e(sample)

      Source |       SS           df       MS      Number of obs   =       428
-------------+----------------------------------   F(5, 422)       =      3.04
       Model |  77.4310057         5  15.4862011   Prob > F        =    0.0105
    Residual |  2152.76526       422  5.10133947   R-squared       =    0.0347
-------------+----------------------------------   Adj R-squared   =    0.0233
       Total |  2230.19626       427  5.22294206   Root MSE        =    2.2586

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   -.012015   .0178791    -0.67   0.502    -.0471582    .0231281
     kidslt6 |   .7254425   .2979333     2.43   0.015     .1398244    1.311061
     kidsge6 |  -.2219447    .093675    -2.37   0.018    -.4060725   -.0378169
       exper |   .0501414   .0452026     1.11   0.268    -.0387088    .1389916
     expersq |  -.0017657   .0013568    -1.30   0.194    -.0044326    .0009012
       _cons |   13.12194   .8598407    15.26   0.000     11.43184    14.81204
------------------------------------------------------------------------------


Matthew Williams

Join Date: Feb 2021

Posts: 195
#7

26 Mar 2021, 08:35

Excellent code as always. Many thanks Andrew Musau
Matthew Williams

Join Date: Feb 2021

Posts: 195
#8

26 Mar 2021, 10:06

Dear Andrew Musau,

Sorry for bothering you again, but do you know whether there is an option in the -outreg2- command (SSC) similar to the onecell option in -esttab-? I know that the onecell option in -esttab- can put coefficients and their associated standard errors in one row, but I have not found a similar one in -outreg2-. That would let me transfer estimates between files faster.

Thank you.
Andrew Musau

Join Date: Oct 2014

Posts: 10191
#9

26 Mar 2021, 15:45

I think the -sideway- option of outreg2 does that.
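
A minimal sketch of what that might look like, using the auto dataset shipped with Stata. The file name myfile.doc is only a placeholder, and outreg2 must already be installed from SSC. As far as I know, sideway places the standard errors in a column beside the coefficients rather than in the same cell, so check whether that layout suits your purpose.

```stata
* toy regression on a built-in dataset
sysuse auto, clear
regress price mpg weight

* export with standard errors placed beside the coefficients
outreg2 using myfile.doc, replace sideway
```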