  • ivreghdfe cannot show first stage outcome

    Dear statalist,

    I am using ivreghdfe and want to see the first-stage regression. However, Stata reports that it is unable to store the first-stage regression. My code and the output are as follows. Could someone help? Thanks a lot.

    ivreghdfe net_revenue_log if_comment_lag overall_rating_lag (yelp_rating_if_comment_lag = other_bus_rating_if_comment_lag), first absorb(prov_id surveyyear) cl(prov_id)
    (MWFE estimator converged in 2 iterations)

    Unable to store first-stage regression of yelp_rating_if_comment_lag.


    First-stage regressions
    -----------------------

    Unable to display all first-stage regressions.
    There may be insufficient room to store results using -estimates store-,
    or names of endogenous regressors may be too long to store the results.
    Try dropping one or more estimation results using -estimates drop-,
    using the -savefprefix- option, or using shorter variable names.

  • #2
    ivreghdfe is from SSC, as you are asked to explain in the FAQs. I think that the error message issued by the command clearly illustrates your problem.

    Unable to display all first-stage regressions.
    There may be insufficient room to store results using -estimates store-,
    or names of endogenous regressors may be too long to store the results.
    Try dropping one or more estimation results using -estimates drop-,
    using the -savefprefix- option, or using shorter variable names.
    Try the following:

    Code:
    rename (net_revenue_log if_comment_lag overall_rating_lag yelp_rating_if_comment_lag other_bus_rating_if_comment_lag) (n_revenue if_comment ov_rating y_rating o_rating)
    ivreghdfe n_revenue if_comment ov_rating (y_rating = o_rating), first absorb(prov_id surveyyear) cl(prov_id)
    Last edited by Andrew Musau; 03 Jul 2019, 08:13.
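
    A rough sketch of why a shorter name helps. Stata names are limited to 32 characters, and ivreg2 (on which ivreghdfe builds) stores each first-stage regression under a prefix followed by the name of the endogenous regressor. Assuming the default prefix is _ivreg2_ (8 characters), the 26-character name yelp_rating_if_comment_lag yields a 34-character name, which is too long. If that is the cause, only the endogenous regressor needs a shorter name, or a shorter prefix can be set with savefprefix(), as the error message suggests. The prefix s_ below is an arbitrary choice, and whether ivreghdfe passes the option through unchanged is an assumption. Run this from a do-file, because of the /// line continuations.

    ```stata
    * Length of the name under which the first stage would be stored,
    * assuming the default prefix "_ivreg2_"
    display strlen("_ivreg2_" + "yelp_rating_if_comment_lag")

    * Alternative to renaming: a shorter prefix (s_ is an arbitrary choice)
    ivreghdfe net_revenue_log if_comment_lag overall_rating_lag ///
        (yelp_rating_if_comment_lag = other_bus_rating_if_comment_lag), ///
        first savefprefix(s_) absorb(prov_id surveyyear) cl(prov_id)
    ```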


    • #3
      Originally posted by Andrew Musau
      ivreghdfe is from SSC, as you are asked to explain in the FAQs. I think that the error message issued by the command clearly illustrates your problem.

      Try the following:

      Code:
      rename (net_revenue_log if_comment_lag overall_rating_lag yelp_rating_if_comment_lag other_bus_rating_if_comment_lag) (n_revenue if_comment ov_rating y_rating o_rating)
      ivreghdfe n_revenue if_comment ov_rating (y_rating = o_rating), first absorb(prov_id surveyyear) cl(prov_id)

      Hi Andrew,

      I have a similar issue. When I use the code you provided, I get the first-stage results, but Stata still reports an invalid syntax error after the Hansen J statistic is displayed. Could this have something to do with the other post-estimation test results?


      I have another question that I hope you will be able to answer. I want to create an RTF file that reports both the first-stage and second-stage results, along with the first-stage post-estimation test results, in a single table using esttab. Do you know how to do that?


      • #4
        I have a similar issue. When I use the code you provided, I get the first-stage results, but Stata still reports an invalid syntax error after the Hansen J statistic is displayed. Could this have something to do with the other post-estimation test results?
        I cannot advise on this without a reproducible example, as I have no idea what is going on.

        I have another question that I hope you will be able to answer. I want to create an RTF file that reports both the first-stage and second-stage results, along with the first-stage post-estimation test results, in a single table using esttab. Do you know how to do that?
        esttab is from the Stata Journal / SSC, as you are asked to explain in FAQ Advice #12. Here is a reproducible example. Say we want to report the Cragg-Donald Wald F statistic and the Stock-Wright LM S statistic from the first stage (shown under "Weak identification test" and "Weak-instrument-robust inference" in the output below).

        Code:
        . use http://fmwww.bc.edu/ec-p/data/wooldridge/mroz.dta, clear
        
        . ivreg2 lwage exper expersq (educ=age kidslt6 kidsge6), first
        
        First-stage regressions
        -----------------------
        
        
        First-stage regression of educ:
        
        Statistics consistent for homoskedasticity only
        Number of obs =                    428
        ------------------------------------------------------------------------------
                educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 age |   -.012015   .0178791    -0.67   0.502    -.0471582    .0231281
             kidslt6 |   .7254425   .2979333     2.43   0.015     .1398244    1.311061
             kidsge6 |  -.2219447    .093675    -2.37   0.018    -.4060725   -.0378169
               exper |   .0501414   .0452026     1.11   0.268    -.0387088    .1389916
             expersq |  -.0017657   .0013568    -1.30   0.194    -.0044326    .0009012
               _cons |   13.12194   .8598407    15.26   0.000     11.43184    14.81204
        ------------------------------------------------------------------------------
        F test of excluded instruments:
          F(  3,   422) =     4.34
          Prob > F      =   0.0050
        Sanderson-Windmeijer multivariate F test of excluded instruments:
          F(  3,   422) =     4.34
          Prob > F      =   0.0050
        
        
        
        Summary results for first-stage regressions
        -------------------------------------------
        
                                                   (Underid)            (Weak id)
        Variable     | F(  3,   422)  P-val | SW Chi-sq(  3) P-val | SW F(  3,   422)
        educ         |       4.34    0.0050 |       13.21   0.0042 |        4.34
        
        Stock-Yogo weak ID F test critical values for single endogenous regressor:
                                            5% maximal IV relative bias    13.91
                                           10% maximal IV relative bias     9.08
                                           20% maximal IV relative bias     6.46
                                           30% maximal IV relative bias     5.39
                                           10% maximal IV size             22.30
                                           15% maximal IV size             12.83
                                           20% maximal IV size              9.54
                                           25% maximal IV size              7.80
        Source: Stock-Yogo (2005).  Reproduced by permission.
        NB: Critical values are for Sanderson-Windmeijer F statistic.
        
        Underidentification test
        Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
        Ha: matrix has rank=K1 (identified)
        Anderson canon. corr. LM statistic       Chi-sq(3)=12.82    P-val=0.0051
        
        Weak identification test
        Ho: equation is weakly identified
        Cragg-Donald Wald F statistic                                       4.34
        
        Stock-Yogo weak ID test critical values for K1=1 and L1=3:
                                            5% maximal IV relative bias    13.91
                                           10% maximal IV relative bias     9.08
                                           20% maximal IV relative bias     6.46
                                           30% maximal IV relative bias     5.39
                                           10% maximal IV size             22.30
                                           15% maximal IV size             12.83
                                           20% maximal IV size              9.54
                                           25% maximal IV size              7.80
        Source: Stock-Yogo (2005).  Reproduced by permission.
        
        Weak-instrument-robust inference
        Tests of joint significance of endogenous regressors B1 in main equation
        Ho: B1=0 and orthogonality conditions are valid
        Anderson-Rubin Wald test           F(3,422)=       0.61     P-val=0.6076
        Anderson-Rubin Wald test           Chi-sq(3)=      1.86     P-val=0.6016
        Stock-Wright LM S statistic        Chi-sq(3)=      1.85     P-val=0.6033
        
        Number of observations               N  =        428
        Number of regressors                 K  =          4
        Number of endogenous regressors      K1 =          1
        Number of instruments                L  =          6
        Number of excluded instruments       L1 =          3
        
        IV (2SLS) estimation
        --------------------
        
        Estimates efficient for homoskedasticity only
        Statistics consistent for homoskedasticity only
        
                                                              Number of obs =      428
                                                              F(  3,   424) =     7.49
                                                              Prob > F      =   0.0001
        Total (centered) SS     =  223.3274513                Centered R2   =   0.1556
        Total (uncentered) SS   =   829.594813                Uncentered R2 =   0.7727
        Residual SS             =  188.5780571                Root MSE      =    .6638
        
        ------------------------------------------------------------------------------
               lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                educ |   .0964002   .0814278     1.18   0.236    -.0631952    .2559957
               exper |    .042193   .0138831     3.04   0.002     .0149827    .0694033
             expersq |  -.0008323   .0004204    -1.98   0.048    -.0016563   -8.33e-06
               _cons |  -.3848718   1.011551    -0.38   0.704    -2.367476    1.597732
        ------------------------------------------------------------------------------
        Underidentification test (Anderson canon. corr. LM statistic):          12.816
                                                           Chi-sq(3) P-val =    0.0051
        ------------------------------------------------------------------------------
        Weak identification test (Cragg-Donald Wald F statistic):                4.342
        Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    13.91
                                                 10% maximal IV relative bias     9.08
                                                 20% maximal IV relative bias     6.46
                                                 30% maximal IV relative bias     5.39
                                                 10% maximal IV size             22.30
                                                 15% maximal IV size             12.83
                                                 20% maximal IV size              9.54
                                                 25% maximal IV size              7.80
        Source: Stock-Yogo (2005).  Reproduced by permission.
        ------------------------------------------------------------------------------
        Sargan statistic (overidentification test of all instruments):           0.702
                                                           Chi-sq(2) P-val =    0.7042
        ------------------------------------------------------------------------------
        Instrumented:         educ
        Included instruments: exper expersq
        Excluded instruments: age kidslt6 kidsge6
        ------------------------------------------------------------------------------

        These are available from ereturn list, so we just need to know the names of the corresponding scalars, here e(cdf) and e(sstat).

        Code:
        . ereturn list
        
        scalars:
                        e(yyc) =  223.3274512515483
                         e(yy) =  829.594812951225
                       e(Fdf2) =  424
                       e(Fdf1) =  3
                         e(Fp) =  .0000674027518006
                          e(F) =  7.493900655892364
                       e(r2_a) =  .149623860675329
                        e(mss) =  34.74939415689639
                        e(rss) =  188.5780570946519
                       e(rmse) =  .6637792834497301
                         e(r2) =  .155598400295877
                     e(center) =  0
                   e(dofminus) =  0
                  e(sdofminus) =  0
                       e(df_m) =  3
                 e(partial_ct) =  0
                   e(endog_ct) =  1
                  e(exexog_ct) =  3
                  e(inexog_ct) =  2
                          e(N) =  428
                       e(cons) =  1
                e(partialcons) =  0
                   e(nocollin) =  0
                    e(sstatdf) =  3
                     e(sstatp) =  .6033182331591862
                      e(sstat) =  1.853705929467335
                     e(ardf_r) =  422
                       e(ardf) =  3
                    e(archi2p) =  .6015863343873346
                     e(archi2) =  1.861769417806341
                       e(arfp) =  .6075953079368361
                        e(arf) =  .6118899488428938
                        e(cdf) =  4.342070862428496
                    e(widstat) =  4.342070862428496
                         e(cd) =  .0308678023395391
                        e(idp) =  .0050523097869339
                       e(iddf) =  3
                     e(idstat) =  12.81582310684222
                    e(sarganp) =  .7041553948545243
                   e(sargandf) =  2
                     e(sargan) =  .7015124317084412
                         e(jp) =  .7041553948545243
                        e(jdf) =  2
                          e(j) =  .7015124317084412
                         e(ll) =  -431.908900030797
                      e(rankV) =  4
                      e(rankS) =  6
                     e(rankxx) =  4
                     e(rankzz) =  6
                     e(condxx) =  4893604.834519811
                     e(condzz) =  8161817.434887111
                        e(r2c) =  .155598400295877
                        e(r2u) =  .7726865523377626
        Once we have these, we add them as new scalars to the stored first-stage results with estadd and then include them in the esttab command. Putting it all together:

        Code:
        use http://fmwww.bc.edu/ec-p/data/wooldridge/mroz.dta, clear
        eststo clear
        eststo: ivreg2 lwage exper expersq (educ=age kidslt6 kidsge6), first savefirst savefprefix(s1)
        estadd scalar cdf1 =  `e(cdf)': s1educ
        estadd scalar sstat1 = `e(sstat)': s1educ
        esttab s1educ est1, stats(cdf1 sstat1, labels("CD Wald F" "SW S stat."))
        To write the table to an RTF file:

        Code:
        esttab s1educ est1 using myfile.rtf, stats(cdf1 sstat1, labels("CD Wald F" "SW S stat."))
        Result:

        Code:
        . esttab s1educ est1, stats(cdf1 sstat1, labels("CD Wald F" "SW S stat."))
        
        --------------------------------------------
                              (1)             (2)   
                             educ           lwage   
        --------------------------------------------
        age               -0.0120                   
                          (-0.67)                   
        
        kidslt6             0.725*                  
                           (2.43)                   
        
        kidsge6            -0.222*                  
                          (-2.37)                   
        
        exper              0.0501          0.0422** 
                           (1.11)          (3.04)   
        
        expersq          -0.00177       -0.000832*  
                          (-1.30)         (-1.98)   
        
        educ                               0.0964   
                                           (1.18)   
        
        _cons               13.12***       -0.385   
                          (15.26)         (-0.38)   
        --------------------------------------------
        CD Wald F           4.342                   
        SW S stat.          1.854                   
        --------------------------------------------
        t statistics in parentheses
        * p<0.05, ** p<0.01, *** p<0.001
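
        The esttab call in #4 can be extended along these lines, as a sketch rather than a definitive recipe. It assumes the same mroz.dta example and the stored names s1educ and est1 from the code above. The file name myfile.rtf, the column titles, and the star thresholds are illustrative choices only. Run it from a do-file, because of the /// line continuations.

        ```stata
        use http://fmwww.bc.edu/ec-p/data/wooldridge/mroz.dta, clear
        eststo clear
        * Second stage is stored as est1; the first stage is saved as s1educ
        eststo: ivreg2 lwage exper expersq (educ=age kidslt6 kidsge6), first savefirst savefprefix(s1)
        * Macro expansion evaluates e(cdf) and e(sstat) from the ivreg2 results
        * before estadd switches to s1educ
        estadd scalar cdf1 = `e(cdf)': s1educ
        estadd scalar sstat1 = `e(sstat)': s1educ
        * Standard errors instead of t statistics, labelled columns, RTF output
        esttab s1educ est1 using myfile.rtf, replace se b(3) ///
            star(* 0.10 ** 0.05 *** 0.01) ///
            mtitles("First stage" "Second stage") ///
            stats(cdf1 sstat1, labels("CD Wald F" "SW S stat."))
        ```

        The replace option overwrites an existing myfile.rtf, which is convenient while the table is still being adjusted.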
        


        • #5
          Dear Andrew Musau,

          Thank you for your code in #4; I have learned a lot from it. However, do you know how to report the adjusted R-squared of the first stage? It seems that e(r2_a) in the ereturn list in #4 is the adjusted R-squared of the second stage, not the first.

          Thank you.


          • #6
            It does not appear that the statistic is stored. You just have to quietly run the first-stage regression and save it.

            Code:
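            use http://fmwww.bc.edu/ec-p/data/wooldridge/mroz.dta, clear
            eststo clear
            * Store the second stage as m1; savefirst keeps the first stage as s1educ
            eststo m1: ivreg2 lwage exper expersq (educ=age kidslt6 kidsge6), first savefirst savefprefix(s1)
            * Attach the first-stage diagnostics to the stored first-stage results
            estadd scalar cdf1 =  `e(cdf)': s1educ
            estadd scalar sstat1 = `e(sstat)': s1educ
            * Rerun the first stage by OLS on the estimation sample to get the adjusted R-squared
            qui reg `e(instd)' `e(insts)' if e(sample)
            local r2a= e(r2_a)
            * Restore m1 as the active results, then add the scalar to s1educ
            est restore m1
            estadd scalar r2a =  `r2a': s1educ
            esttab s1educ m1, stats(cdf1 sstat1 r2a, labels("CD Wald F" "SW S stat." "Adj. R2"))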
            Result:

            Code:
            . esttab s1educ m1, stats(cdf1 sstat1 r2a, labels("CD Wald F" "SW S stat." "Adj. R2"))
            
            --------------------------------------------
                                  (1)             (2)  
                                 educ           lwage  
            --------------------------------------------
            age               -0.0120                  
                              (-0.67)                  
            
            kidslt6             0.725*                  
                               (2.43)                  
            
            kidsge6            -0.222*                  
                              (-2.37)                  
            
            exper              0.0501          0.0422**
                               (1.11)          (3.04)  
            
            expersq          -0.00177       -0.000832*  
                              (-1.30)         (-1.98)  
            
            educ                               0.0964  
                                               (1.18)  
            
            _cons               13.12***       -0.385  
                              (15.26)         (-0.38)  
            --------------------------------------------
            CD Wald F           4.342                  
            SW S stat.          1.854                  
            Adj. R2            0.0233                  
            --------------------------------------------
            t statistics in parentheses
            * p<0.05, ** p<0.01, *** p<0.001
            where the adjusted R-squared is taken from the first-stage regression:

            Code:
            . reg `e(instd)' `e(insts)' if e(sample)
            
                  Source |       SS           df       MS      Number of obs   =       428
            -------------+----------------------------------   F(5, 422)       =      3.04
                   Model |  77.4310057         5  15.4862011   Prob > F        =    0.0105
                Residual |  2152.76526       422  5.10133947   R-squared       =    0.0347
            -------------+----------------------------------   Adj R-squared   =    0.0233
                   Total |  2230.19626       427  5.22294206   Root MSE        =    2.2586
            
            ------------------------------------------------------------------------------
                    educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                     age |   -.012015   .0178791    -0.67   0.502    -.0471582    .0231281
                 kidslt6 |   .7254425   .2979333     2.43   0.015     .1398244    1.311061
                 kidsge6 |  -.2219447    .093675    -2.37   0.018    -.4060725   -.0378169
                   exper |   .0501414   .0452026     1.11   0.268    -.0387088    .1389916
                 expersq |  -.0017657   .0013568    -1.30   0.194    -.0044326    .0009012
                   _cons |   13.12194   .8598407    15.26   0.000     11.43184    14.81204
            ------------------------------------------------------------------------------
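
            For the ivreghdfe model in #1, the same idea might be sketched as follows. This assumes that reghdfe (also from SSC) is installed, that it stores e(r2_a) and e(r2_a_within), and that rerunning the first stage on e(sample) reproduces the ivreghdfe estimation sample. The variable names are those from #1. Treat it as a starting point and check the coefficients against the first-stage output before reporting anything. Run it from a do-file, because of the /// line continuations.

            ```stata
            * Second stage with absorbed fixed effects (variable names from #1)
            ivreghdfe net_revenue_log if_comment_lag overall_rating_lag ///
                (yelp_rating_if_comment_lag = other_bus_rating_if_comment_lag), ///
                absorb(prov_id surveyyear) cl(prov_id)

            * First stage by hand: endogenous regressor on all instruments,
            * same fixed effects, same clustering, same estimation sample
            reghdfe yelp_rating_if_comment_lag other_bus_rating_if_comment_lag ///
                if_comment_lag overall_rating_lag if e(sample), ///
                absorb(prov_id surveyyear) vce(cluster prov_id)

            * Adjusted R-squared, overall and within the absorbed fixed effects
            display e(r2_a)
            display e(r2_a_within)
            ```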


            • #7
              Excellent code as always. Many thanks, Andrew Musau.


              • #8
                Dear Andrew Musau,

                Sorry for bothering you again, but do you know whether there is an option in the -outreg2- command (SSC) similar to the onecell option in -esttab-? I know that the onecell option in -esttab- puts coefficients and their associated standard errors in one row, but I have not found a similar one in -outreg2-. That would let me transfer estimates between files faster.

                Thank you.


                • #9
                  I think the -sideway- option of outreg2 does that.
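
                  A minimal sketch of that suggestion, using Stata's bundled auto data. The file name myresults.doc is a placeholder, and the regression is only there to have something to export.

                  ```stata
                  sysuse auto, clear
                  regress price mpg weight
                  * sideway places each standard error beside its coefficient
                  * rather than underneath it
                  outreg2 using myresults.doc, replace sideway
                  ```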
