Selection in a Multinomial Logit

Alina Faruk

Join Date: Oct 2018

Posts: 96
#1

03 Jun 2019, 07:13

Hi,

Does anyone know of a way to handle self-selection in a multinomial logit (MNL)? As far as I understand, we can't use Heckman with MNL.

If there is no such way, how about combining Heckman with mprobit? I have not seen mprobit used much in the literature. Other than the convergence issue, is there any argument against using it instead of MNL?

Thanks in advance.
Tags: mlogit, mprobit, multinomial logit, selection, self selection
FernandoRios

Join Date: Apr 2014

Posts: 2479
#2

03 Jun 2019, 07:42

Hi Alina,
I don't think I have seen any work with mlogit combined with selection. Technically speaking, it is possible using a kind of copula approach, but I think it can be quite complicated to program.
Combining Heckman with mprobit is slightly easier because both models rely on the same normality assumption for the errors. I think you could do that using the user-written command cmp or Stata's official command gsem.

HTH
Fernando
Alina Faruk

Join Date: Oct 2018

Posts: 96
#3

03 Jun 2019, 07:59

Thanks a lot for your reply.

If I do proceed with mprobit, is there any theoretical pitfall I should be aware of? There isn't much work in the literature using mprobit instead of MNL, so I am wondering whether it would be econometrically sound.
FernandoRios

Join Date: Apr 2014

Posts: 2479
#4

03 Jun 2019, 08:08

I do not think there is any. At least when I get asked that, my opinion is that unless you know the underlying distribution of the errors that affect the latent variables, mprobit and mlogit will provide similar results and have similar problems.
But perhaps there is someone else in the forum who may know better.
Alina Faruk

Join Date: Oct 2018

Posts: 96
#5

03 Jun 2019, 08:14

Thanks once again for sharing your thoughts on this with me.
Alina Faruk

Join Date: Oct 2018

Posts: 96
#6

03 Jun 2019, 09:52

Hi, sorry for disturbing you again but would you please tell me if these commands are okay? This is the first time I'm using cmp. My lfcategory variable has four choices, and inlf is a dummy for participation in the labour force.

cmp (lfcategory= edu age) (inlf= married kidsunder6 edu age), ///
ind(inlf*$cmp_asmprobit $cmp_probit) qui

Last edited by Alina Faruk; 03 Jun 2019, 09:59.
FernandoRios

Join Date: Apr 2014

Posts: 2479
#7

03 Jun 2019, 10:00

I think the only change should be replacing $cmp_asmprobit with $cmp_mprobit.
That being said, I have never used cmp in a framework like the one you are proposing. Perhaps it would be wise to run a couple of tests using simulated data, to be sure it works the way you want it to work.
Fernando
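
A test on simulated data, as suggested above, could be set up roughly as follows. This is only a sketch: the sample size, coefficients, and error correlations are made-up values, the variable names are borrowed from the thread, and the cmp call is simply the command from post #6 with $cmp_mprobit substituted. It assumes the user-written cmp package (and its dependency ghk2) is installed, and it has not been checked against a real run.

```stata
* Sketch: simulate selection into the labour force plus a 3-way choice,
* then fit the joint model with cmp. All numeric values are hypothetical.
clear
set seed 12345
set obs 5000

* Covariates (names from the thread; distributions are assumptions)
generate edu        = rnormal()
generate age        = rnormal()
generate married    = runiform() < 0.5
generate kidsunder6 = runiform() < 0.3

* Correlated errors: v (selection), e2 and e3 (utilities of alternatives 2, 3)
matrix C = (1, .5, .5 \ .5, 1, .5 \ .5, .5, 1)
drawnorm v e2 e3, corr(C)

* Selection equation: in the labour force or not
generate inlf = (0.5 + 0.8*married - 0.7*kidsunder6 + 0.3*edu + 0.2*age + v) > 0

* Latent utilities relative to base alternative 1 (normalised to zero)
generate u2 =  0.5*edu - 0.3*age + e2
generate u3 = -0.4*edu + 0.6*age + e3

* Observed choice is the alternative with the highest utility
generate lfcategory = 1
replace  lfcategory = 2 if u2 > 0 & u2 >= u3
replace  lfcategory = 3 if u3 > 0 & u3 >  u2

* The choice is only observed for those in the labour force
replace  lfcategory = . if inlf == 0

* Define the $cmp_* globals, then fit the model from post #6
cmp setup
cmp (lfcategory = edu age) (inlf = married kidsunder6 edu age), ///
    ind(inlf*$cmp_mprobit $cmp_probit) qui
```

If the setup behaves as intended, the estimated coefficients should move in the direction of the values used to generate the data; note that multinomial probit coefficients are identified only up to scale, so signs and relative magnitudes are what to compare, not exact values.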
Alina Faruk

Join Date: Oct 2018

Posts: 96
#8

03 Jun 2019, 10:27

Thanks Fernando.

I utilized the example from slide 39 here: http://fmwww.bc.edu/EC-C/S2016/8823/...n14.slides.pdf
Alina Faruk

Join Date: Oct 2018

Posts: 96
#9

04 Jun 2019, 08:33

Dear Fernando,

The cmp is taking way too long. I have around 500,000 observations.

Can I instead estimate this in the following way:

Code:

probit inlf married kidsunder6 edu age
predict xb if e(sample), xb
generate mills=normalden(-xb)/(1-normal(-xb))
mprobit lfcategory edu age mills
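
For reference, the generated mills variable is the usual inverse Mills ratio evaluated at the probit index. Writing the fitted index as $x\hat\beta$ (notation assumed here, not taken from the thread), the symmetry of the standard normal density $\phi$ and distribution function $\Phi$ gives:

```latex
\lambda(x\hat\beta)
  = \frac{\phi(-x\hat\beta)}{1 - \Phi(-x\hat\beta)}
  = \frac{\phi(x\hat\beta)}{\Phi(x\hat\beta)},
\qquad \text{since } \phi(-z) = \phi(z) \text{ and } 1 - \Phi(-z) = \Phi(z).
```

So normalden(-xb)/(1-normal(-xb)) and normalden(xb)/normal(xb) are the same quantity.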
FernandoRios

Join Date: Apr 2014

Posts: 2479
#10

04 Jun 2019, 09:33

Hi Alina,
Just two comments on your current problem:
1. How many categories do you have in lfcategory? Can you group them so you have fewer categories?
2. Your proposed strategy may be helpful for exploratory analysis, and to see if you do have a selection problem, but I don't think it's generally accepted. Even for a simple Heckman-probit model, I have never read anything about a two-step approach like the one you describe.
Also, what is your research question?
Fernando
Alina Faruk

Join Date: Oct 2018

Posts: 96
#11

04 Jun 2019, 09:45

1. I have managed to reduce it to three from four, but any further aggregation would be unwise.

2. Thank you for clearing that up. I was also dubious about that approach. I might then let cmp do its work.

My research question is about the socioeconomic determinants of the choice among different labor force categories.