xtheckman in Panel Data error

Facundo Duran

Join Date: Sep 2023
Posts: 28


15 Sep 2023, 12:18

Hello! I am trying to use the xtheckman command on panel data. The panel has information on 10 individuals over 26 years, from 1996 to 2021. The data are monthly, so I have 26*12 = 312 observations per individual.
Here is an excerpt of my data:

Code:

tiempo id_trabajador rem_tot sexo tenure edad desempleo
"1996-01-01" 5929 673.13074 1  1 34 .171
"1996-02-01" 5929 700.77625 1  2 34 .171
"1996-03-01" 5929 704.61829 1  3 34 .171
"1996-04-01" 5929 738.87048 1  4 34 .171
"1996-05-01" 5929 668.61676 1  5 34 .171
"1996-06-01" 5929 1130.1899 1  6 34 .171
"1996-07-01" 5929 693.80298 1  7 34 .173
"1996-08-01" 5929 532.82471 1  8 34 .173
"1996-09-01" 5929 442.36554 1  9 34 .173
"1996-10-01" 5929 475.64505 1 10 34 .173
"1996-11-01" 5929 535.51196 1 11 34 .173
"1996-12-01" 5929 632.32269 1 12 34 .173
"1997-01-01" 5929         . 1  . 35 .148
"1997-02-01" 5929         . 1  . 35 .148
"1997-03-01" 5929         . 1  . 35 .148
"1997-04-01" 5929 258.59531 1  1 35 .148
"1997-05-01" 5929 320.48413 1  2 35 .148
"1997-06-01" 5929 399.21515 1  3 35 .148
"1997-07-01" 5929         . 1  . 35 .148
"1997-08-01" 5929         . 1  . 35 .148
"1997-09-01" 5929         . 1  . 35 .148
"1997-10-01" 5929         . 1  . 35 .148
"1997-11-01" 5929         . 1  . 35 .148
"1997-12-01" 5929         . 1  . 35 .148
"1998-01-01" 5929         . 1  . 36 .133
"1998-02-01" 5929         . 1  . 36 .133
"1998-03-01" 5929         . 1  . 36 .133
"1998-04-01" 5929         . 1  . 36 .133
"1998-05-01" 5929         . 1  . 36 .133
"1998-06-01" 5929         . 1  . 36 .133
"1998-07-01" 5929         . 1  . 36 .125
"1998-08-01" 5929         . 1  . 36 .125
"1998-09-01" 5929         . 1  . 36 .125
"1998-10-01" 5929         . 1  . 36 .125
"1998-11-01" 5929         . 1  . 36 .125
"1998-12-01" 5929         . 1  . 36 .125
"1999-01-01" 5929         . 1  . 37 .146
"1999-02-01" 5929         . 1  . 37 .146
"1999-03-01" 5929         . 1  . 37 .146
"1999-04-01" 5929         . 1  . 37 .146
"1999-05-01" 5929         . 1  . 37 .146
"1999-06-01" 5929         . 1  . 37 .146
"1999-07-01" 5929         . 1  . 37 .139
"1999-08-01" 5929         . 1  . 37 .139
"1999-09-01" 5929         . 1  . 37 .139
"1999-10-01" 5929         . 1  . 37 .139
"1999-11-01" 5929         . 1  . 37 .139
"1999-12-01" 5929         . 1  . 37 .139
"2000-01-01" 5929         . 1  . 38 .154
"2000-02-01" 5929         . 1  . 38 .154
"2000-03-01" 5929         . 1  . 38 .154
"2000-04-01" 5929         . 1  . 38 .154
"2000-05-01" 5929         . 1  . 38 .154
"2000-06-01" 5929         . 1  . 38 .154
"2000-07-01" 5929         . 1  . 38 .148
"2000-08-01" 5929         . 1  . 38 .148
"2000-09-01" 5929         . 1  . 38 .148
"2000-10-01" 5929         . 1  . 38 .148
"2000-11-01" 5929         . 1  . 38 .148
"2000-12-01" 5929         . 1  . 38 .148
"2001-01-01" 5929         . 1  . 39 .164
"2001-02-01" 5929         . 1  . 39 .164
"2001-03-01" 5929         . 1  . 39 .164
"2001-04-01" 5929         . 1  . 39 .164
"2001-05-01" 5929         . 1  . 39 .164
"2001-06-01" 5929         . 1  . 39 .164
"2001-07-01" 5929         . 1  . 39 .184
"2001-08-01" 5929         . 1  . 39 .184
"2001-09-01" 5929         . 1  . 39 .184
"2001-10-01" 5929         . 1  . 39 .184
"2001-11-01" 5929         . 1  . 39 .184
"2001-12-01" 5929         . 1  . 39 .184
"2002-01-01" 5929         . 1  . 40 .215
"2002-02-01" 5929         . 1  . 40 .215
"2002-03-01" 5929         . 1  . 40 .215
"2002-04-01" 5929         . 1  . 40 .215
"2002-05-01" 5929         . 1  . 40 .215
"2002-06-01" 5929         . 1  . 40 .215
"2002-07-01" 5929         . 1  . 40 .179
"2002-08-01" 5929         . 1  . 40 .179
"2002-09-01" 5929         . 1  . 40 .179
"2002-10-01" 5929         . 1  . 40 .179
"2002-11-01" 5929         . 1  . 40 .179
"2002-12-01" 5929         . 1  . 40 .179
"2003-01-01" 5929         . 1  . 41 .161
"2003-02-01" 5929         . 1  . 41 .161
"2003-03-01" 5929         . 1  . 41 .161
"2003-04-01" 5929         . 1  . 41 .161
"2003-05-01" 5929         . 1  . 41 .161
"2003-06-01" 5929         . 1  . 41 .161
"2003-07-01" 5929         . 1  . 41 .144
"2003-08-01" 5929         . 1  . 41 .144
"2003-09-01" 5929         . 1  . 41 .144
"2003-10-01" 5929         . 1  . 41 .144
"2003-11-01" 5929         . 1  . 41 .144
"2003-12-01" 5929         . 1  . 41 .144
"2004-01-01" 5929         . 1  . 42 .143
"2004-02-01" 5929         . 1  . 42 .143
"2004-03-01" 5929         . 1  . 42 .143
"2004-04-01" 5929         . 1  . 42 .147
end

I use the following command to estimate the Heckman model:

Code:

 xtheckman rem_tot c.edad##c.edad tenure, select(working = c.edad##c.edad desempleo)

But when I run the command I get the following error

Code:

 initial values not feasible
r(1400);

end of do-file

I appreciate any help since this work is for my doctoral thesis and I am short on time.
Thank you so much


FernandoRios

Join Date: Apr 2014

Posts: 2470
#2

15 Sep 2023, 12:41

If your N = 10, I would say panel Heckman is not the way to go. You will need a substantially larger sample for xtheckman to work.
F
Facundo Duran

Join Date: Sep 2023

Posts: 28
#3

18 Sep 2023, 06:08

Originally posted by FernandoRios

If your N = 10, I would say panel Heckman is not the way to go. You will need a substantially larger sample for xtheckman to work.
F

Thank you very much. Looking at it more carefully, I would like to use fixed effects with the xtheckmanfe command, which I think you wrote. Would I also need a larger N in that case? What I showed is actually just a sample; my real dataset has an N greater than 6000. Could it work if I use a larger N?

Last edited by Facundo Duran; 18 Sep 2023, 06:15.
FernandoRios

Join Date: Apr 2014

Posts: 2470
#4

18 Sep 2023, 06:25

I worked on a different approach, xtheckmanfe.
It applies a kind of correlated random effects model to the analysis.
Now, two other points:
xtheckman (the official command) is a full MLE program. It can be very hard to get it to converge because of the model's complexity.
xtheckmanfe is an implementation of Wooldridge's work. Look at the help file for the references. It uses a two-step approach, and each year is treated separately for the probit estimation.
Whether or not it fits your needs may depend on the assumptions, data structure, etc.
HTH
Facundo Duran

Join Date: Sep 2023

Posts: 28
#5

18 Sep 2023, 10:35

Originally posted by FernandoRios

I worked on a different approach, xtheckmanfe.
It applies a kind of correlated random effects model to the analysis.
Now, two other points:
xtheckman (the official command) is a full MLE program. It can be very hard to get it to converge because of the model's complexity.
xtheckmanfe is an implementation of Wooldridge's work. Look at the help file for the references. It uses a two-step approach, and each year is treated separately for the probit estimation.
Whether or not it fits your needs may depend on the assumptions, data structure, etc.
HTH

Thank you very much for responding, and sorry for the inconvenience. I expanded the number of individuals to 600 but I still cannot estimate the model.
The model treats income as a function of age, age^2 and experience. The selection equation is a dummy for working, as a function of age, age^2 and unemployment (which plays the role of the market variable in the example).
When I try to execute the code

Code:

xtset id_trabajador fecha
xtheckmanfe rem_tot c.edad##c.edad tenure, select(working = c.edad##c.edad desempleo)

I don't get any results

Code:

. xtset id_trabajador fecha
       panel variable:  id_trabajador (strongly balanced)
        time variable:  fecha, 01jan1996 to 01dec2021, but with gaps
                delta:  1 day

. xtheckmanfe rem_tot c.edad##c.edad tenure, select(working = c.edad##c.edad desempleo)
r(2000);

end of do-file

r(2000);
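
A side note on the log above: it reports "delta: 1 day" together with "but with gaps", which suggests fecha is stored as a daily date, so xtset sees consecutive months as 28 to 31 days apart. Below is a minimal sketch of building a monthly date before xtset. It assumes the date arrives as a string like the tiempo column in the first post; fecha_d and mdate are hypothetical variable names, and the three rows are toy data. Whether this is what triggers r(2000) here is not established in the thread.

```stata
* Sketch only: toy rows mimicking the tiempo column from the first post.
clear
input str10 tiempo id_trabajador
"1996-01-01" 5929
"1996-02-01" 5929
"1996-03-01" 5929
end

gen fecha_d = date(tiempo, "YMD")   // daily date: days since 01jan1960
gen mdate   = mofd(fecha_d)         // monthly date: months since 1960m1
format fecha_d %td
format mdate %tm

* With a monthly time variable, consecutive months are one period apart.
xtset id_trabajador mdate
```

After this, time-series operators and any by-period logic treat each month as one period.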
FernandoRios

Join Date: Apr 2014

Posts: 2470
#6

19 Sep 2023, 02:39

Start by estimating a probit model for selection in each year.
See if that works.
Then you can do the second step.
Facundo Duran

Join Date: Sep 2023

Posts: 28
#7

04 Oct 2023, 09:03

Originally posted by FernandoRios

Start by estimating a probit model for selection in each year.
See if that works.
Then you can do the second step.

Thank you very much for your reply!
I am trying to do it manually as you suggested, but I have some doubts about how to calculate the inverse Mills ratio. Here is the code I have written so far.

Code:

if year==1996 {
xtprobit working edad_total edad_total2 desempleo
predict working_index, xb
gen imr = . // Create the IMR variable, initially missing
replace imr = (normalden(working_index)) / (normal(working_index)) if year ==1996
}
if year==1996 xtreg rem_tot edad_total edad_total2 tenure imr, fe

where "working edad_total edad_total2 desempleo" is my selection equation.

I have been reading that the IMR can be calculated as follows:

H(z) = f(z) / (1 − F(z)), where f() is the density and F() is the cumulative distribution function, evaluated at each point of the sample, and z is the predicted probit index.
So
IMR = 1/H(z)

Is this calculation correct when applied to panel data?

If it is correct, should I then run this code for every year up to 2021, which is the last year in my sample?

To be able to use age and age squared without having them dropped because of the fixed effects (age is the same in every month of a given year), I expressed age in decimals: e.g. in January 1996 the age is 30, in February it is 30.1, and so on.

So I would like to know whether the inverse Mills ratio is calculated correctly and whether the steps I followed are correct.
Thank you very much
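
For reference on the formula question in this post: in the standard Heckman-type correction for the selected observations, the inverse Mills ratio is the standard normal density divided by the standard normal CDF, both evaluated at the probit index z. Taking f and F in the post to be the standard normal density and CDF (an assumption, consistent with probit), this equals the hazard H evaluated at -z, which is generally not the same as 1/H(z). It is the quantity that normalden(z)/normal(z) in the code above computes.

```latex
\[
\lambda(z) = \frac{\phi(z)}{\Phi(z)}, \qquad
H(z) = \frac{\phi(z)}{1 - \Phi(z)}, \qquad
\lambda(z) = \frac{\phi(-z)}{1 - \Phi(-z)} = H(-z) \neq \frac{1}{H(z)} \ \text{in general}
\]
```

The middle step uses the symmetry of the normal distribution: \phi(-z) = \phi(z) and 1 - \Phi(-z) = \Phi(z).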
FernandoRios

Join Date: Apr 2014

Posts: 2470
#8

04 Oct 2023, 11:39

OK, I see the problem you are having.
When I say manually estimate the probit model, I mean the following:

Code:

webuse wagework
xtset personid year
foreach i in age tenure market {
    bysort personid: egen m_`i' = mean(`i')
}
probit working age market m_* if year==2013
probit working age market m_* if year==2014
probit working age market m_* if year==2015

If any of these probits fails, it will also fail in your overall model.
F
Facundo Duran

Join Date: Sep 2023

Posts: 28
#9

04 Oct 2023, 12:23

Originally posted by FernandoRios

OK, I see the problem you are having.
When I say manually estimate the probit model, I mean the following:

Code:

webuse wagework
xtset personid year
foreach i in age tenure market {
    bysort personid: egen m_`i' = mean(`i')
}
probit working age market m_* if year==2013
probit working age market m_* if year==2014
probit working age market m_* if year==2015

If any of these probits fails, it will also fail in your overall model.
F

Ah, now I understand.
I have another doubt. Because my data are monthly, if I take one year I have 12 periods of information for each individual, so that is also a panel. In that case, shouldn't I use xtprobit instead of probit?
FernandoRios

Join Date: Apr 2014

Posts: 2470
#10

04 Oct 2023, 13:09

Nope, if you have monthly data, your probit has to be done by month.
So instead of estimating an xtprobit, the command estimates a probit for each period.
It then collects the IMRs and uses them to correct for selection.
It may be a good idea to try to replicate this using a toy dataset like wagework, following the strategy described in the references in the help file (Wooldridge's earlier paper is the most relevant).
F
Facundo Duran

Join Date: Sep 2023

Posts: 28
#11

04 Oct 2023, 15:56

Originally posted by FernandoRios

Nope, if you have monthly data, your probit has to be done by month.
So instead of estimating an xtprobit, the command estimates a probit for each period.
It then collects the IMRs and uses them to correct for selection.
It may be a good idea to try to replicate this using a toy dataset like wagework, following the strategy described in the references in the help file (Wooldridge's earlier paper is the most relevant).
F

Thank you very much.
Do you mean this paper? Wooldridge, Jeffrey M. 1995. "Selection corrections for panel data models under conditional mean independence assumptions."

As you suggested, I am trying to replicate it with the wagework toy dataset. I am running the xtheckmanfe example and looking at the IMR values.

Code:

xtset personid year
xtheckmanfe wage age tenure, select(working = age market)

Then I am trying to calculate the IMR manually to compare the two:

Code:

probit working age if year == 2013
predict probit_resid_2013, xb
probit working age if year == 2014
predict probit_resid_2014, xb
probit working age if year == 2015
predict probit_resid_2015, xb
probit working age if year == 2016
predict probit_resid_2016, xb
gen imr = . // Create the IMR variable, initially missing
replace imr = (normalden(probit_resid_2013)/normal(probit_resid_2013)) if year ==2013
replace imr = (normalden(probit_resid_2014)/normal(probit_resid_2014)) if year ==2014
replace imr = (normalden(probit_resid_2015)/normal(probit_resid_2015)) if year ==2015
replace imr = (normalden(probit_resid_2016)/normal(probit_resid_2016)) if year ==2016

but I am getting different values.
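
One possible source of the mismatch is that the probits above include only age, while the select() equation also contains market and the specification shown in #8 adds the person-level means m_*. The following is a minimal sketch of the per-year probit and IMR computation using the #8 specification on wagework. It is an illustration under that assumption, not a reproduction of xtheckmanfe's internals; the ado file remains the reference for what the command actually computes. xb_y is a hypothetical scratch variable name.

```stata
* Sketch only: per-year probits with person-level means, as in #8.
webuse wagework, clear
xtset personid year

* Person-level means of the covariates
foreach v in age tenure market {
    bysort personid: egen m_`v' = mean(`v')
}

gen double imr = .
levelsof year, local(years)
foreach y of local years {
    * One probit per period; Stata omits any collinear regressor itself
    quietly probit working age market m_* if year == `y'
    quietly predict double xb_y if year == `y', xb
    * Inverse Mills ratio: normal density over normal CDF at the index
    quietly replace imr = normalden(xb_y) / normal(xb_y) if year == `y'
    drop xb_y
}

summarize imr
```

Comparing this imr with the one from the age-only probits shows how much the selection specification alone changes the values.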

Facundo Duran

Join Date: Sep 2023
Posts: 28

#12

10 Oct 2023, 16:03

Originally posted by FernandoRios

OK, I see the problem you are having.
When I say manually estimate the probit model, I mean the following:

Code:

webuse wagework
xtset personid year
foreach i in age tenure market {
bysort personid:egen m_`i'=mean(`i')
}
probit working age market m_* if year==2013
probit working age market m_* if year==2014
probit working age market m_* if year==2015

If any of these probits fails, it will also fail in your overall model.
F

Sorry to keep asking. I ran the probits as you suggested and none of them shows any problem. However, when I run xtheckman it still does not converge.

This is the code:

Code:

xtset id_trabajador fecha, monthly
foreach i in edad tenure desempleo {
bysort id_trabajador:egen m_`i'=mean(`i')
}
probit working edad desempleo m_* if mes==1
probit working edad desempleo m_* if mes==2
probit working edad desempleo m_* if mes==3
probit working edad desempleo m_* if mes==4
probit working edad desempleo m_* if mes==5
probit working edad desempleo m_* if mes==6
probit working edad desempleo m_* if mes==7
probit working edad desempleo m_* if mes==8
probit working edad desempleo m_* if mes==9
probit working edad desempleo m_* if mes==10
probit working edad desempleo m_* if mes==11
probit working edad desempleo m_* if mes==12

These are the results of the probits:

[Image: probit.png]

greetings and thank you very much!


FernandoRios

Join Date: Apr 2014

Posts: 2470
#13

10 Oct 2023, 18:12

I think I kept misreading your post.
The procedure I suggested is what xtheckmanfe does, not what xtheckman does.
The latter relies on full-information ML, which is far more difficult to estimate.
The former is what I implemented.
They are different strategies.
F
Facundo Duran

Join Date: Sep 2023

Posts: 28
#14

11 Oct 2023, 05:44

Originally posted by FernandoRios

I think I kept misreading your post.
The procedure I suggested is what xtheckmanfe does, not what xtheckman does.
The latter relies on full-information ML, which is far more difficult to estimate.
The former is what I implemented.
They are different strategies.
F

No, it's okay, what I want to do is xtheckmanfe
FernandoRios

Join Date: Apr 2014

Posts: 2470
#15

11 Oct 2023, 06:49

In that case there is something else going on with your data.
I can't say much at this point,
but then open the ado file and try to replicate it yourself.
Only that way can you figure out what is going on.
It may be something that is specific to your data.