Heckman procedure - how to assess the presence of a selection effect (endogeneity)

Riccardo Valboni

Join Date: Jun 2014

Posts: 123
#1

21 May 2016, 09:00

I ran a two-stage Heckman procedure and I am trying to figure out whether a selection effect is taking place. In management research, this is typically done by taking the inverse Mills' ratio from the selection equation and adding it to the performance equation. If the inverse Mills' ratio is significant, then a selection effect is taking place.

The Heckman procedure in Stata seems to provide various estimates of whether a selection effect is occurring, notably athrho and lnsigma in the output at the bottom of the selection model (see http://www.stata.com/manuals13/rheckman.pdf , p. 8). As an alternative, one can generate the inverse Mills' ratio as a variable, plug it manually into the performance equation, and check whether its effect is significantly different from zero. This is what management scholars tend to do.

My question arises from the observation that the effect sizes and corresponding p-values of these three indicators (athrho, lnsigma, and the beta of the inverse Mills' ratio) do not match. In particular, I was unable to connect the values of athrho and lnsigma to the effect size and p-value of the inverse Mills' ratio in the performance equation. Can someone explain to me how the three are related, and which parameter one should look at to tell whether a selection effect is occurring?

To clarify what I have been doing so far, the code I used is equivalent to this:

Code:

heckman var1 var2 var3 var4 var5 c.var2#c.var3, ///
    select(var6 = var4 var5 var7) vce(cluster var8) mills(mills_var)

Separately, I checked the effect size and significance of the Mills' ratio in the performance equation by doing this:

Code:

reg var1 var2 var3 var4 var5 c.var2#c.var3 mills_var, vce(cluster var8)

This gives different results, as described above.

Thank you in advance for the help!

Last edited by Riccardo Valboni; 21 May 2016, 09:08.

Joao Santos Silva

Join Date: Apr 2014

Posts: 3028
#2

21 May 2016, 13:25

Dear Riccardo,

This is not one of my favorite areas but since you are not getting much attention here is my contribution.

When you estimate the model by maximum likelihood, the parameters athrho and lnsigma are transformations of the parameters of the model that are used to facilitate the estimation. To test the hypothesis of no selection you can test the significance of rho (or athrho) or of lambda (which is rho*sigma).
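For reference, the transformations mentioned above can be written out as follows. This is a sketch based on the parameterization documented in the Stata manual entry for heckman, where \sigma is the standard deviation of the error in the performance (outcome) equation and \rho is its correlation with the error in the selection equation; the symbols are notation chosen here, not part of the original post.

```latex
\begin{aligned}
\text{athrho}  &= \operatorname{atanh}(\rho) = \tfrac{1}{2}\ln\!\left(\frac{1+\rho}{1-\rho}\right),
  &\qquad \rho   &= \tanh(\text{athrho}),\\
\text{lnsigma} &= \ln(\sigma),
  &\qquad \sigma &= \exp(\text{lnsigma}),\\
\lambda        &= \rho\,\sigma = \tanh(\text{athrho})\,\exp(\text{lnsigma}).
\end{aligned}
```

Because \operatorname{atanh}(0) = 0 and \sigma > 0, the hypotheses athrho = 0, \rho = 0, and \lambda = 0 all describe the same restriction (no selection). Note that lnsigma on its own says nothing about selection, so its p-value is not a test of a selection effect.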

The two-step procedure you describe is not a maximum likelihood estimator and that is why you get different results. In this case, you can test for no selection by testing the significance of the parameter associated with the ratio, which is an estimate of lambda. Notice that the two-step estimator can be implemented in Stata using the option twostep.

All the test statistics and p-values will be different but all are asymptotically valid.
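To make the comparison concrete, here is a minimal sketch that runs both estimators on the specification from post #1. The variable names are the placeholders used there. One assumption to check against your Stata version: as far as I know, twostep does not accept vce(cluster), so the clustering option is dropped in the second call, and the standard errors are therefore not directly comparable.

```stata
* Maximum likelihood (the default): the output reports athrho and lnsigma,
* followed by rho, sigma, and lambda, plus a test of independent equations
* (rho = 0) at the bottom of the table.
heckman var1 var2 var3 var4 var5 c.var2#c.var3, ///
    select(var6 = var4 var5 var7) vce(cluster var8)
estimates store heck_ml

* Heckman's two-step estimator: the coefficient on the inverse Mills' ratio
* (labelled lambda in the output) is reported with its own z and p-value.
* Assumption: vce(cluster) is omitted because twostep does not allow it.
heckman var1 var2 var3 var4 var5 c.var2#c.var3, ///
    select(var6 = var4 var5 var7) twostep
estimates store heck_2s
```

Under ML, look at the test of rho = 0 (or the z-test on athrho). Under the two-step estimator, look at the line for lambda.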

Now, all of this is valid assuming that the conditions required for the validity of the estimator are met. In particular this estimator depends critically on the assumptions of normality and homoskedasticity, which personally I generally do not find credible. So, use the estimator with due care and skepticism...

Best wishes,

Joao
Riccardo Valboni

Join Date: Jun 2014

Posts: 123
#3

24 May 2016, 11:08

Dear Joao,

Many thanks for your reply. Your comment helped me understand that Stata's default Heckman estimation is not carried out through a two-step process. Now I see the difference in the estimates of rho and lambda.

Best wishes,

Riccardo
Malte Hessenius

Join Date: Jul 2018

Posts: 10
#4

15 Sep 2018, 07:23

Riccardo Valboni, I am not sure whether you're still active in this forum, but I am struggling with a similar problem right now.

From my understanding, the presence of a selection effect is tested by the significance of the inverse Mills ratio, which Stata reports when the twostep option is specified. In the maximum likelihood case, the inverse Mills ratio does not appear explicitly in the output. According to Joao Santos Silva, lambda (rho*sigma) does the same job. However, when I estimate the model with the twostep option and with maximum likelihood separately, I find a selection effect in the former but none in the latter.

Does anyone have any advice on that? I can't really wrap my mind around it...
Jala Youssef

Join Date: Mar 2021

Posts: 43
#5

15 Jul 2021, 12:28

Dear Malte Hessenius,

I am facing a similar issue. Were you able to find out why, with maximum likelihood estimation, the inverse Mills ratio is not explicitly shown in the output (and why the significance of lambda is not reported either)?

Many thanks in advance for your help.
Jala Youssef

Join Date: Mar 2021

Posts: 43
#6

18 Jul 2021, 03:18

Dear Joao Santos Silva,

Following up on your reply above, I would be grateful if you could clarify why, in the case of maximum likelihood estimation, the significance of the lambda coefficient is not reported in the output (the inverse Mills ratio does not appear explicitly as a regressor, as it does with the "twostep" estimation option).

Many thanks in advance for your help.
Joao Santos Silva

Join Date: Apr 2014

Posts: 3028
#7

19 Jul 2021, 03:56

Dear Jala Youssef,

The lambda is a product of two parameters. In the two-step procedure, that is all you identify and that is what is reported. When we use ML, both parameters are identified and each of them is reported separately. You can still get lambda as the product (and compute the standard errors with the delta method).
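As a sketch of the delta-method calculation described above, nlcom can be run after the ML fit to obtain lambda together with a standard error, z-statistic, and p-value. The parameter-reference syntax is an assumption that depends on the Stata version: _b[/athrho] should work in recent releases (15 and later, as far as I know), while older releases refer to the same parameters as [athrho]_b[_cons] and [lnsigma]_b[_cons].

```stata
* ML fit, using the placeholder variable names from post #1
heckman var1 var2 var3 var4 var5 c.var2#c.var3, ///
    select(var6 = var4 var5 var7) vce(cluster var8)

* lambda = rho * sigma = tanh(athrho) * exp(lnsigma);
* nlcom computes its standard error by the delta method.
* Assumption: Stata 15 or later parameter names.
nlcom (lambda: tanh(_b[/athrho]) * exp(_b[/lnsigma]))

* Older releases (assumption): same expression, older reference syntax.
* nlcom (lambda: tanh([athrho]_b[_cons]) * exp([lnsigma]_b[_cons]))
```

The resulting p-value need not agree with the two-step p-value for lambda or with the ML test of rho = 0, since each is a different (asymptotically valid) test of the same hypothesis.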

Best wishes,

Joao
Jala Youssef

Join Date: Mar 2021

Posts: 43
#8

20 Jul 2021, 02:58

Dear Joao Santos Silva,

Many thanks for the very useful reply, much appreciated.

Jala
FELICITY FORD

Join Date: Jan 2021

Posts: 6
#9

23 Jan 2023, 16:05

It is far better to use the package gtsheckman.
It describes the first stage and the second stage of the Heckman model.