
  • Help interpreting -etregress- output

    Hi Statalist,

    I'm trying to use -etregress- to fit an endogenous treatment model, and I have a question about how to interpret the different "panels" of the Stata output it generates. Specifically, I'm confused about how the second panel of the main output table (which appears to show results from modeling selection into the treatment) differs from the first-stage probit results displayed when the -first- option is specified (which appear to show the same thing).

    The following example illustrates this more clearly.

    Code:
    webuse union3, clear
    etregress wage age grade smsa black tenure, treat(union = south black tenure) first
    produces two distinct output tables:

    TABLE (A)
    Code:
    Probit regression                               Number of obs     =      1,210
                                                    LR chi2(3)        =      56.54
                                                    Prob > chi2       =     0.0000
    Log likelihood = -592.15536                     Pseudo R2         =     0.0456
    
    ------------------------------------------------------------------------------
           union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           south |  -.4895032   .0933276    -5.24   0.000    -.6724221   -.3065844
           black |   .4397974   .0972261     4.52   0.000     .2492377    .6303572
          tenure |   .0997638   .0236575     4.22   0.000      .053396    .1461317
           _cons |  -.9679795   .0746464   -12.97   0.000    -1.114284   -.8216753
    ------------------------------------------------------------------------------
    and TABLE (B)
    Code:
    Iteration 0:   log likelihood =  -3140.811  
    Iteration 1:   log likelihood = -3053.6629  
    Iteration 2:   log likelihood = -3051.5847  
    Iteration 3:   log likelihood =  -3051.575  
    Iteration 4:   log likelihood =  -3051.575  
    
    Linear regression with endogenous treatment     Number of obs     =      1,210
    Estimator: maximum likelihood                   Wald chi2(6)      =     681.89
    Log likelihood =  -3051.575                     Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    wage         |
             age |   .1487409   .0193291     7.70   0.000     .1108566    .1866252
           grade |   .4205658   .0293577    14.33   0.000     .3630258    .4781058
            smsa |   .9117044   .1249041     7.30   0.000     .6668969    1.156512
           black |  -.7882471   .1367078    -5.77   0.000     -1.05619   -.5203048
          tenure |   .1524015   .0369596     4.12   0.000     .0799621    .2248409
         1.union |   2.945815   .2749621    10.71   0.000       2.4069    3.484731
           _cons |  -4.351572   .5283952    -8.24   0.000    -5.387208   -3.315936
    -------------+----------------------------------------------------------------
    union        |
           south |  -.5807419   .0851111    -6.82   0.000    -.7475566   -.4139271
           black |   .4557499   .0958042     4.76   0.000     .2679771    .6435226
          tenure |   .0871536   .0232483     3.75   0.000     .0415878    .1327195
           _cons |  -.8855758   .0724506   -12.22   0.000    -1.027576   -.7435753
    -------------+----------------------------------------------------------------
         /athrho |  -.6544347   .0910314    -7.19   0.000     -.832853   -.4760164
        /lnsigma |   .7026769   .0293372    23.95   0.000      .645177    .7601767
    -------------+----------------------------------------------------------------
             rho |  -.5746478    .060971                      -.682005   -.4430476
           sigma |   2.019151   .0592362                      1.906325    2.138654
          lambda |    -1.1603   .1495097                     -1.453334   -.8672668
    ------------------------------------------------------------------------------
    LR test of indep. eqns. (rho = 0):   chi2(1) =    19.84   Prob > chi2 = 0.0000
    My question is: how and why does TABLE (A) differ from the second panel of TABLE (B), the union equation?

    Many thanks for your insights!

  • #2
    Please note that I am no expert, but from what I understand of the manual (http://www.stata.com/manuals13/te.pdf, page 29), a probit is used to estimate the hazard (h), which then augments the OLS regression to get consistent estimates. So is your TABLE (B) the linear outcome model including h?
    Last edited by Amy Dillon; 30 Jan 2017, 13:41. Reason: include reference



    • #3
      Originally posted by Amy Dillon:
      Please note that I am no expert, but from what I understand of the manual (http://www.stata.com/manuals13/te.pdf, page 29), a probit is used to estimate the hazard (h), which then augments the OLS regression to get consistent estimates. So is your TABLE (B) the linear outcome model including h?
      My understanding of the endogenous treatment regression model is that the first-stage probit model is used to calculate the inverse Mills ratio (often represented by λ, and shown as -lambda- in TABLE (B)), which is then used as an additional regressor in the second-stage OLS to provide adjusted estimates of the impact of the treatment on the outcome of interest.

      It's not clear to me, then, how to interpret the estimates in the union panel of TABLE (B): they're similar in magnitude and sign to the probit results in TABLE (A), but they're clearly not the same results generated by the -first- option. And if they're not exactly equal to the TABLE (A) probit results, which are used to calculate the inverse Mills ratio in the first place, why are they different?
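
      If it helps to make the mechanics concrete, my reading of the Methods and formulas section is that the two-step point estimates could be replicated roughly by hand along the following lines. This is only a sketch: h is just my name for the nonselection hazard, and the plain -regress- standard errors are not the corrected ones that -etregress, twostep- reports.

      Code:
      * first step: probit for treatment status, on the -etregress- estimation sample
      webuse union3, clear
      probit union south black tenure if !missing(wage, age, grade, smsa)
      predict double xb, xb

      * nonselection hazard, by treatment status
      generate double h = normalden(xb)/normal(xb) if union == 1
      replace h = -normalden(xb)/(1 - normal(xb)) if union == 0

      * second step: outcome regression augmented with the hazard
      regress wage age grade smsa black tenure union h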



      • #4
        Hi Samarth,

        If you type the following:


        Code:
        . etregress wage age grade smsa black tenure, treat(union = south black tenure) twostep
        Then you get this:

        Code:
        Linear regression with endogenous treatment     Number of obs      =      1210
        Estimator: two-step                             Wald chi2(8)       =    566.56
                                                        Prob > chi2        =    0.0000
        
        ------------------------------------------------------------------------------
                     |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
        wage         |
                 age |   .1543231   .0194903     7.92   0.000     .1161227    .1925234
               grade |   .4225025    .029014    14.56   0.000     .3656362    .4793689
                smsa |   .8628628   .1285907     6.71   0.000     .6108297    1.114896
               black |  -.9206944   .1774617    -5.19   0.000    -1.268513    -.572876
              tenure |   .1003226    .051879     1.93   0.053    -.0013584    .2020037
               union |   4.563859   1.006459     4.53   0.000     2.591236    6.536483
               _cons |  -4.670352   .5401517    -8.65   0.000     -5.72903   -3.611674
        -------------+----------------------------------------------------------------
        union        |
               south |  -.4895032   .0933276    -5.24   0.000    -.6724221   -.3065844
               black |   .4397974   .0972261     4.52   0.000     .2492377    .6303572
              tenure |   .0997638   .0236575     4.22   0.000      .053396    .1461317
               _cons |  -.9679795   .0746464   -12.97   0.000    -1.114284   -.8216753
        -------------+----------------------------------------------------------------
        hazard       |
              lambda |  -2.093313   .5801968    -3.61   0.000    -3.230478   -.9561486
        -------------+----------------------------------------------------------------
                 rho |   -0.89172
               sigma |  2.3475104
        ------------------------------------------------------------------------------
        Here you can see the first-stage probit results appearing directly as the union panel of the output. The only thing I changed was adding the "twostep" option.

        The question for me now is: what does the union panel in your first post, the maximum likelihood output, show? I did not find an answer in the manual: http://www.stata.com/manuals/teetregress.pdf
        The only thing I have understood is that I am using the two-step consistent estimates, while you, Samarth, are using the maximum likelihood estimates.

        But isn't it surprising that the coefficient for union in the wage equation also changes dramatically compared with the maximum likelihood estimate (4.56 with two-step versus 2.95 with maximum likelihood)?
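
        To see the two sets of estimates side by side, something like this should work (an untested sketch using -estimates store- and -estimates table-):

        Code:
        webuse union3, clear
        quietly etregress wage age grade smsa black tenure, treat(union = south black tenure)
        estimates store ml
        quietly etregress wage age grade smsa black tenure, treat(union = south black tenure) twostep
        estimates store ts
        estimates table ml ts, b se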

        It would be very helpful if there is anybody out there who can explain this to us.

        Thanks a lot.

        Knut



        • #5
          Did any of you figure out the answer to this, or does anybody else know?



          • #6
            -etregress- uses maximum likelihood, control function, or two-step estimators to estimate the parameters of a linear regression with endogenous treatment effects. The -first- option shows the probit coefficients for the treatment model used by the two-step estimator. These coefficients are also used as initial values for the maximum likelihood and control function estimators.

            As Samarth discovered, the probit model coefficients from -first- can differ from the final treatment-model coefficients under maximum likelihood estimation. Also, the maximum likelihood estimates of the main-equation coefficients can differ from the two-step estimates. In the Methods and formulas section of the manual entry, http://www.stata.com/manuals/teetregress.pdf, we show how the two-step estimates are constructed and the form of the likelihood that the maximum likelihood estimator maximizes.
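
            To make the first point concrete, the probit reported by -first- for the model above is simply a probit of the treatment indicator on the treatment-equation covariates, fit on the -etregress- estimation sample. A sketch (the -if- condition mimics that sample restriction):

            Code:
            . webuse union3, clear

            . probit union south black tenure if !missing(wage, age, grade, smsa)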

            The two-step estimator obtains probit estimates in a first step and then uses these in a second step, so the probit point estimates are exactly what you see in the -etregress- selection equation of the two-step output. The likelihood-based estimator has a different criterion: it incorporates the correlation and variance components between the linear and selection equations. These components render the selection coefficients different from the first-step probit, which does not take the correlation into account.
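
            For example, after maximum likelihood estimation you can display the estimated correlation that enters the joint criterion. A quick sketch, assuming e(rho) and e(sigma) are stored as listed in the Stored results section of the manual entry:

            Code:
            . quietly etregress wage age grade smsa black tenure, treat(union = south black tenure)

            . display "rho = " e(rho) "   sigma = " e(sigma)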

            We can see this phenomenon in other commands as well. -heckman- allows maximum likelihood estimation and two-step estimation. Let me show you an example where the two estimates differ. We load the wage data from -heckman- and then use the maximum likelihood estimator.

            Code:
            . webuse womenwk
            
            . heckman wage educ age, select(married children educ age)
            
            
            Iteration 0:   log likelihood = -5178.7009  
            Iteration 1:   log likelihood = -5178.3049  
            Iteration 2:   log likelihood = -5178.3045  
            
            Heckman selection model                         Number of obs     =      2,000
            (regression model with sample selection)              Selected    =      1,343
                                                                  Nonselected =        657
            
                                                            Wald chi2(2)      =     508.44
            Log likelihood = -5178.304                      Prob > chi2       =     0.0000
            
            ------------------------------------------------------------------------------
                    wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
            wage         |
               education |   .9899537   .0532565    18.59   0.000     .8855729    1.094334
                     age |   .2131294   .0206031    10.34   0.000     .1727481    .2535108
                   _cons |   .4857752   1.077037     0.45   0.652    -1.625179     2.59673
            -------------+----------------------------------------------------------------
            select       |
                 married |   .4451721   .0673954     6.61   0.000     .3130794    .5772647
                children |   .4387068   .0277828    15.79   0.000     .3842534    .4931601
               education |   .0557318   .0107349     5.19   0.000     .0346917    .0767718
                     age |   .0365098   .0041533     8.79   0.000     .0283694    .0446502
                   _cons |  -2.491015   .1893402   -13.16   0.000    -2.862115   -2.119915
            -------------+----------------------------------------------------------------
                 /athrho |   .8742086   .1014225     8.62   0.000     .6754241    1.072993
                /lnsigma |   1.792559    .027598    64.95   0.000     1.738468     1.84665
            -------------+----------------------------------------------------------------
                     rho |   .7035061   .0512264                      .5885365    .7905862
                   sigma |   6.004797   .1657202                       5.68862    6.338548
                  lambda |   4.224412   .3992265                      3.441942    5.006881
            ------------------------------------------------------------------------------
            LR test of indep. eqns. (rho = 0):   chi2(1) =    61.20   Prob > chi2 = 0.0000
            We will see different estimates when we use the two-step estimator. We specify -first- for clarity. We will see that the first-step probit estimates match the selection-model coefficients for the two-step estimator and differ from the selection-model coefficients for maximum likelihood. The largest difference that we see is in the constant of the main equation.

            Code:
            . heckman wage educ age, select(married children educ age) twostep first
            
            
            Iteration 0:   log likelihood = -1266.2225  
            Iteration 1:   log likelihood = -1040.0608  
            Iteration 2:   log likelihood = -1027.2398  
            Iteration 3:   log likelihood = -1027.0616  
            Iteration 4:   log likelihood = -1027.0616  
            
            Probit regression                               Number of obs     =      2,000
                                                            LR chi2(4)        =     478.32
                                                            Prob > chi2       =     0.0000
            Log likelihood = -1027.0616                     Pseudo R2         =     0.1889
            
            ------------------------------------------------------------------------------
                  select |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                 married |   .4308575    .074208     5.81   0.000     .2854125    .5763025
                children |   .4473249   .0287417    15.56   0.000     .3909922    .5036576
               education |   .0583645   .0109742     5.32   0.000     .0368555    .0798735
                     age |   .0347211   .0042293     8.21   0.000     .0264318    .0430105
                   _cons |  -2.467365   .1925635   -12.81   0.000    -2.844782   -2.089948
            ------------------------------------------------------------------------------
            
            Heckman selection model -- two-step estimates   Number of obs     =      2,000
            (regression model with sample selection)              Selected    =      1,343
                                                                  Nonselected =        657
            
                                                            Wald chi2(2)      =     442.54
                                                            Prob > chi2       =     0.0000
            
            ------------------------------------------------------------------------------
                    wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
            wage         |
               education |   .9825259   .0538821    18.23   0.000     .8769189    1.088133
                     age |   .2118695   .0220511     9.61   0.000     .1686502    .2550888
                   _cons |   .7340391   1.248331     0.59   0.557    -1.712645    3.180723
            -------------+----------------------------------------------------------------
            select       |
                 married |   .4308575    .074208     5.81   0.000     .2854125    .5763025
                children |   .4473249   .0287417    15.56   0.000     .3909922    .5036576
               education |   .0583645   .0109742     5.32   0.000     .0368555    .0798735
                     age |   .0347211   .0042293     8.21   0.000     .0264318    .0430105
                   _cons |  -2.467365   .1925635   -12.81   0.000    -2.844782   -2.089948
            -------------+----------------------------------------------------------------
            /mills       |
                  lambda |   4.001615   .6065388     6.60   0.000     2.812821     5.19041
            -------------+----------------------------------------------------------------
                     rho |    0.67284
                   sigma |  5.9473529
            ------------------------------------------------------------------------------



            • #7
              Amazing, thank you so much for the clarification! I'm very inexperienced with statistics; how would I decide which of the models to use? The two-step procedure gives me great results (in relation to my hypotheses), but I have no idea whether I'm allowed to use it! Also, I had to use the vce(cluster country) option with maximum likelihood, but with -twostep- that option is not allowed. Why is that, and does two-step still give me reliable results? I'm using panel data with companies as the ID, but my instrument varies only by country and year, which is why I switched from clustering on ID to clustering on GEOG in ML in the first place.
