Hi Everyone!
I am running a Heckman selection model on PSID data to generate predicted log-wage margins by age and college status (college vs. non-college) for a sample of women:
Code:
heckman lnlabinc i.year##coll i.age##coll, select(morg i.year##coll i.age##coll) twostep //ML doesn't converge
where lnlabinc is the log wage, coll is a dummy for college graduates, and year and age enter as fixed effects. The selection equation additionally includes "morg", which takes the value 1 if the household has a mortgage. I then run the following command to get the margins:
Code:
margins coll#age, asbalanced nose
I then plot them and get the graph below.

I then estimate the model by hand on the same sample in two steps. First, a probit regression (emp is a dummy that takes the value 1 if the person is employed):
Code:
probit emp i.year##coll i.age##coll i.morg
predict xb, xb
predict phat, pr
gen imr = normalden(xb)/phat
label variable imr "inverse Mills ratio"
Then a wage regression that includes the inverse Mills ratio:
Code:
reg lnlabinc i.year##coll i.age##coll imr
Then the margins:
Code:
margins coll#age, asbalanced nose
I plot these margins, and the results are not at all comparable with what I get from the two-step Heckman. The plot from the probit plus linear regression makes more sense, as it is close to what I get without controlling for imr. I do not understand why the results are so different. I would highly appreciate any help.
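A possible source of the gap, offered as a sketch under stated assumptions rather than a diagnosis: when select() is given no dependent variable, heckman treats observations with nonmissing lnlabinc as selected, so the manual probit matches it only if emp equals 1 exactly when lnlabinc is observed. In addition, margins after heckman uses the default linear prediction of the wage equation, which leaves out the selection term, whereas margins after reg averages over the observed values of imr. The variable names sel, xb_sel and imr_sel below are hypothetical, introduced only for illustration.

```stata
* Assumption: selection is defined the way heckman defines it by default,
* i.e. an observation is selected when lnlabinc is not missing.
gen byte sel = !missing(lnlabinc)

* Step 1: probit on the same selection rule and the same covariates
probit sel i.year##coll i.age##coll i.morg
predict double xb_sel, xb
gen double imr_sel = normalden(xb_sel)/normal(xb_sel)

* Step 2: wage regression with the inverse Mills ratio as a regressor
regress lnlabinc i.year##coll i.age##coll imr_sel

* Margins with the selection term switched off, so that they are on the
* same footing as the linear prediction used after heckman
margins coll#age, asbalanced nose at(imr_sel = 0)
```

If these margins move close to the heckman ones, the difference comes from the selection rule or from the averaged imr term; if they do not, something else is at work.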

Thank you very much.