heckman first step vs probit

Brian Holtemeyer

Join Date: Jun 2018

Posts: 36
#1

18 Jan 2023, 09:40

While running -heckman- with the -twostep- option, I got the "convergence not achieved" message. Curiously, when I ran the first step "manually", I didn't hit the error. A bit of investigation revealed the likely culprit to be two sets of "dummy" predictors: there was only 1 observation for one combination of the 2 variables. But this made me realize that I don't fully understand what the -heckman- command is doing in the first step. So here's my main question: under what conditions does -heckman- not converge, while -probit- does?

Here's some data you can test with. I run the estimation both with -heckman- and "manually". The first step involves 2 sets of dummy predictors (i.x1 and i.x2). The first time through the loop you can see that the betas match, as I expected. The second time through the loop I create the situation I described above. This causes -heckman- (but not -probit-) to complain.

Code:

* make up some data
clear
set obs 1000
g x1 = int(runiform(1,5))
g x2 = int(runiform(1,5))
g d = (runiform()>0.5)
g y = runiform()

qui forv z = 1/2 {
    noi di _newline(2) "z=`z'"
    preserve
    est clear
    if `z'==2 replace x1 = 0 if _n==1
    noi ta x1 x2, m /* when `z'==2, notice the "single" observation */
    probit d i.x1 i.x2
    predict probitxb, xb
    ge IMR = normalden(probitxb) / normal(probitxb)
    eststo use_command: heckman y i.x1 , sel(d= i.x1 i.x2 ) twostep first
    eststo do_manually: reg y i.x1 IMR if d==1
    noi estout , cells( b(fmt(5)) se ) starlevels(* 0.10 *^ 0.05 *^* 0.01 )
    restore
}

Two minor follow-up questions:
1. What other adjustments does -heckman- make to the standard errors that I haven't made here (clearly I'm missing something)?
2. Is it possible to display only the "main" estimates from the second stage of -heckman- (sometimes I don't want to see the estimates for the first stage)? Probably an -estout- option?
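
For reference, the "manual" procedure in the code above can be written out as follows. This is only a sketch in notation that does not appear in the thread: z_i stands for the selection covariates (i.x1, i.x2 and a constant), x_i for the outcome covariates (i.x1 and a constant), and \hat\lambda_i is the variable called IMR in the code.

```latex
\begin{align*}
\text{Selection (probit):}\quad
  & \Pr(d_i = 1 \mid z_i) = \Phi(z_i\gamma) \\
\text{Inverse Mills ratio:}\quad
  & \hat\lambda_i = \frac{\phi(z_i\hat\gamma)}{\Phi(z_i\hat\gamma)} \\
\text{Second step (OLS on } d_i = 1\text{):}\quad
  & y_i = x_i\beta + \beta_\lambda\,\hat\lambda_i + e_i
\end{align*}
```

Plain OLS standard errors in the last line treat \hat\lambda_i as if it were observed data and treat e_i as homoskedastic. Neither holds in the selection model, which is one reason the manual standard errors need not match those reported by -heckman, twostep-.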
Brian Holtemeyer

Join Date: Jun 2018

Posts: 36
#2

20 Jan 2023, 08:30

bump!
FernandoRios

Join Date: Apr 2014

Posts: 2462
#3

20 Jan 2023, 09:03

Hi Brian,
This is a very interesting puzzle!
If you look into the two regressions, probit complains as well, but it does so "quietly", fixing the problem before you even see what's going on.
So what's happening?
1. The convergence problem has to do with perfect prediction.
Whenever x1==0, the model can predict failure perfectly!
You will see this if you run the probit first. Probit fixes it by dropping the offending observation (you now have 999 obs instead of 1000).
2. But heckman fails!
Probit would drop the observation, but heckman doesn't, or can't, because the outcome y is not missing when x1==0. So it tries to keep all 1000 observations in the sample for its internal probit procedure.
This causes non-convergence in the first step.
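
One possible workaround, sketched here under the assumption that -probit- does drop the perfectly predicted observation as described above, is to let -probit- define the sample and pass that sample to -heckman-. The variable name probit_sample is made up for illustration, and the data are assumed to be in the z==2 state from post #1.

```stata
* Sketch only: assumes d, y, x1, x2 exist as in post #1 (z==2 case).
* Run probit noisily so any perfect-prediction note is visible.
probit d i.x1 i.x2

* e(sample) marks the observations probit actually used.
* probit_sample is a hypothetical variable name.
gen byte probit_sample = e(sample)
count if !probit_sample

* Give heckman the same sample, so its internal probit no longer
* sees the perfectly predicted observation.
heckman y i.x1 if probit_sample, select(d = i.x1 i.x2) twostep
```

Whether dropping that observation is appropriate is a modelling decision, since it changes the estimation sample for the outcome equation as well.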

Unfortunately, you would only know this if you looked at the data carefully, which is why (I would guess) heckman twostep is having problems.

Regarding your minor questions:
The SEs are corrected using a "longish" formula. You can find the exact formula in the manual (look at the two-step part of Methods and formulas in the heckman entry).
For esttab/estout, you can use "drop(eqname: )".

HTH
Brian Holtemeyer

Join Date: Jun 2018

Posts: 36
#4

20 Jan 2023, 09:36

It seems like someone made a judgement: when there's perfect prediction, -probit- should still display estimates while -heckman- should not. I'm not saying that judgement is wrong, but it's not clear to me why that's the case. Are the heckman estimates more flawed than the probit ones in cases where perfect predictors exist? Can anyone help me understand?

Thanks for the "drop(eqname: )" tip. I had been forgetting the colon.
FernandoRios

Join Date: Apr 2014

Posts: 2462
#5

20 Jan 2023, 11:05

I think that is a question for technical services.
And who knows, perhaps a question for the original heckman programmer.
Brian Holtemeyer

Join Date: Jun 2018

Posts: 36
#6

20 Jan 2023, 11:41

Thanks Fernando. I'll ask them.