
  • Coefficients of multinomial logit vs separate logistic regressions

    Dear Statalist users,

    I have a question that I have not been able to resolve: what is the relationship between the coefficients obtained from a multinomial logit and those from a set of independent logistic regressions?

    To be more concrete, I attach a simple example below. Is there a relationship between the coefficients from estimations #1, #2, #3, and #4?


    Code:
    . use https://stats.idre.ucla.edu/stat/data/hsbdemo, clear
    (highschool and beyond (200 cases))
    
    . * Prog variable has three categories
    . tab prog
    
        type of |
        program |      Freq.     Percent        Cum.
    ------------+-----------------------------------
        general |         45       22.50       22.50
       academic |        105       52.50       75.00
       vocation |         50       25.00      100.00
    ------------+-----------------------------------
          Total |        200      100.00
    
    .
    . * Generate separate dummies for each category of the program variable
    . gen general=0
    
    . replace general=1 if prog==1
    (45 real changes made)
    
    .
    . gen academic=0
    
    . replace academic=1 if prog==2
    (105 real changes made)
    
    .
    . gen vocational=0
    
    . replace vocational=1 if prog==3
    (50 real changes made)
    
    .
    . * #1 Multinomial specification  (vocational is the reference category)
    . mlogit prog write, base(3)
    
    Iteration 0:   log likelihood = -204.09667  
    Iteration 1:   log likelihood = -186.05186  
    Iteration 2:   log likelihood = -185.51265  
    Iteration 3:   log likelihood = -185.51084  
    Iteration 4:   log likelihood = -185.51084  
    
    Multinomial logistic regression                 Number of obs     =        200
                                                    LR chi2(2)        =      37.17
                                                    Prob > chi2       =     0.0000
    Log likelihood = -185.51084                     Pseudo R2         =     0.0911
    
    ------------------------------------------------------------------------------
            prog |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    general      |
           write |    .051801   .0225143     2.30   0.021     .0076739    .0959282
           _cons |  -2.646504   1.127438    -2.35   0.019    -4.856241   -.4367673
    -------------+----------------------------------------------------------------
    academic     |
           write |   .1178089   .0216189     5.45   0.000     .0754367    .1601812
           _cons |  -5.358994   1.115266    -4.81   0.000    -7.544875   -3.173113
    -------------+----------------------------------------------------------------
    vocation     |  (base outcome)
    ------------------------------------------------------------------------------
    
    .
    . * #2 Logit specifications
    . logit general write
    
    Iteration 0:   log likelihood = -106.63277  
    Iteration 1:   log likelihood = -105.96906  
    Iteration 2:   log likelihood = -105.96688  
    Iteration 3:   log likelihood = -105.96688  
    
    Logistic regression                             Number of obs     =        200
                                                    LR chi2(1)        =       1.33
                                                    Prob > chi2       =     0.2485
    Log likelihood = -105.96688                     Pseudo R2         =     0.0062
    
    ------------------------------------------------------------------------------
         general |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           write |  -.0204257   .0176417    -1.16   0.247    -.0550028    .0141514
           _cons |  -.1689862   .9291947    -0.18   0.856    -1.990174    1.652202
    ------------------------------------------------------------------------------
    
    .
    . * #3 Logit specifications
    . logit academic write
    
    Iteration 0:   log likelihood = -138.37933  
    Iteration 1:   log likelihood = -122.55844  
    Iteration 2:   log likelihood = -122.55784  
    Iteration 3:   log likelihood = -122.55784  
    
    Logistic regression                             Number of obs     =        200
                                                    LR chi2(1)        =      31.64
                                                    Prob > chi2       =     0.0000
    Log likelihood = -122.55784                     Pseudo R2         =     0.1143
    
    ------------------------------------------------------------------------------
        academic |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           write |   .0918986   .0178276     5.15   0.000     .0569572      .12684
           _cons |  -4.754081   .9582002    -4.96   0.000    -6.632119   -2.876043
    ------------------------------------------------------------------------------
    
    .
    . * #4 Logit specifications
    . logit vocational write
    
    Iteration 0:   log likelihood = -112.46703  
    Iteration 1:   log likelihood = -99.439698  
    Iteration 2:   log likelihood = -98.987386  
    Iteration 3:   log likelihood = -98.986268  
    Iteration 4:   log likelihood = -98.986268  
    
    Logistic regression                             Number of obs     =        200
                                                    LR chi2(1)        =      26.96
                                                    Prob > chi2       =     0.0000
    Log likelihood = -98.986268                     Pseudo R2         =     0.1199
    
    ------------------------------------------------------------------------------
      vocational |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           write |  -.0927851   .0191087    -4.86   0.000    -.1302375   -.0553327
           _cons |   3.624981   .9589797     3.78   0.000     1.745415    5.504547
    ------------------------------------------------------------------------------

  • #2
    The logistic regressions should be 1 versus base and 2 versus base. So the code should be more like

    Code:
    use https://stats.idre.ucla.edu/stat/data/hsbdemo, clear
    
    gen general = 0 if prog==3
    replace general = 1 if prog == 1
    
    gen academic = 0 if prog==3
    replace academic = 1 if prog == 2
    
    mlogit prog write, base(3)
    logit general write
    logit academic write
    As a sidelight, I've never quite understood why the logistic regressions don't exactly match the mlogit results. But they are very, very close. I assume the differences reflect the fact that mlogit estimates all of the equations simultaneously.
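
    To see just how close they are, you can store each set of estimates and list them side by side. A minimal sketch, assuming the commands above have already been run:

    Code:
    quietly mlogit prog write, base(3)
    estimates store m_mlogit
    quietly logit general write
    estimates store m_general
    quietly logit academic write
    estimates store m_academic
    estimates table m_mlogit m_general m_academic, b(%9.4f) se
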
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 19.5 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam



    • #3
      In binary logit, \(p_{i} = Pr[y_{i}=1|x_{i}]\) is the probability of subject \(i\) making a positive response. This is given by

      $$p_{i}= \frac{1}{1+e^{-\beta^{\prime} x_{i}}} = \frac{e^{\beta^{\prime}x_{i}}}{1+e^{\beta^{\prime} x_{i}}}$$


      In the case of \(m\) choices \((j=1, \cdots, m)\), multinomial logit generalizes this to outcome-specific coefficient vectors \(\beta_{j}\):

      $$p_{ij}= \frac{e^{\beta_{j}^{\prime}x_{i}}}{\sum_{k=1}^{m} e^{\beta_{k}^{\prime}x_{i}}}$$


      So a probability in multinomial logit is a function of all the alternatives, and identification requires normalizing one outcome's coefficients to zero (the base outcome). It is evident that with only two alternatives the multinomial logit model collapses to binary logit, and that is why you can estimate a binary logit model with mlogit.
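
      To make the collapse explicit: with \(m=2\) and the base outcome's coefficients normalized to zero (\(\beta_{1}=0\)), the formula above reduces to

      $$p_{i2}= \frac{e^{\beta_{2}^{\prime}x_{i}}}{e^{0}+e^{\beta_{2}^{\prime}x_{i}}} = \frac{e^{\beta_{2}^{\prime}x_{i}}}{1+e^{\beta_{2}^{\prime}x_{i}}}$$

      which is exactly the binary logit probability.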


      Code:
      use https://stats.idre.ucla.edu/stat/data/hsbdemo, clear
      gen general = prog==1
      logit general write
      mlogit general write
      Result:

      Code:
      . logit general write
      
      Iteration 0:   log likelihood = -106.63277  
      Iteration 1:   log likelihood = -105.96906  
      Iteration 2:   log likelihood = -105.96688  
      Iteration 3:   log likelihood = -105.96688  
      
      Logistic regression                             Number of obs     =        200
                                                      LR chi2(1)        =       1.33
                                                      Prob > chi2       =     0.2485
      Log likelihood = -105.96688                     Pseudo R2         =     0.0062
      
      ------------------------------------------------------------------------------
           general |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
             write |  -.0204257   .0176417    -1.16   0.247    -.0550028    .0141514
             _cons |  -.1689862   .9291947    -0.18   0.856    -1.990174    1.652202
      ------------------------------------------------------------------------------
      
      . 
      . mlogit general write
      
      Iteration 0:   log likelihood = -106.63277  
      Iteration 1:   log likelihood = -105.96906  
      Iteration 2:   log likelihood = -105.96688  
      Iteration 3:   log likelihood = -105.96688  
      
      Multinomial logistic regression                 Number of obs     =        200
                                                      LR chi2(1)        =       1.33
                                                      Prob > chi2       =     0.2485
      Log likelihood = -105.96688                     Pseudo R2         =     0.0062
      
      ------------------------------------------------------------------------------
           general |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      0            |  (base outcome)
      -------------+----------------------------------------------------------------
      1            |
             write |  -.0204257   .0176417    -1.16   0.247    -.0550028    .0141514
             _cons |  -.1689862   .9291947    -0.18   0.856    -1.990174    1.652202
      ------------------------------------------------------------------------------
      
      .



      • #4
        Richard, many thanks for your reply. Do you think the differences might be due to the different numbers of iterations?



        • #5
          Andrew, many thanks for your reply. One slight difference between the specification in my question and the one in your example: in the mlogit estimation, I want the "prog" variable, not "general", to be the dependent variable.

          I am looking for a way to obtain the same coefficient on the write variable as in the specification below

          Code:
          mlogit prog write, base(3)
          by estimating independent logit regressions for each category like

          Code:
          logit general write
          logit academic write



          • #6
            Richard Williams already showed you this in #2. The coefficients do not match exactly because, in the multinomial logit, they are estimated jointly. See the discussion under "Estimating the coefficients" at the link below.

            https://en.wikipedia.org/wiki/Multinomial_logistic_regression



            • #7
              Andrew Musau, do you think I can use the coefficients obtained from separate logit regressions as if I had modeled my dependent variable with multinomial logit?



              • #8
                Why does it matter that they are slightly different? If you only care about comparing one category against the rest, use logit. Otherwise, mlogit works. Since you are able to run mlogit, there is really no reason, on the face of it, to approximate it with a set of logistic regressions. But you shouldn't present the coefficients as if they came from one model when they did not, as that would be misleading.



                • #9
                  I agree with Leonardo. If you are modeling mutually exclusive choices, then stick to mlogit. The only way you can justify using logit is if you model each choice as binary (=1 if a respondent chose that particular option and 0 otherwise), setting all the other categories to 0. Is there any reason why you want to use logit over mlogit?



                  • #10
                    Leonardo Guizzetti and Andrew Musau, the reason I am trying to use logit instead of mlogit is that my dependent variable has approximately 300 categories, and Stata cannot handle estimating mlogit with that many outcomes.



                    • #11
                      Wow, 300 levels. That sounds like a continuous variable to me. If you offered someone a menu with 300 mutually exclusive items, they would most probably consider fewer than 10 of them due to bounded rationality. Neoclassical utility maximization goes out of the window here! In any case, if you want to use mlogit with that many categories, one way is to keep the most popular choices and merge all the others into a single "other" category. Start by tabulating the frequencies:

                      Code:
                      tab choicevar
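
                      A minimal sketch of the pooling step (the variable name, the 9999 code, and the size cutoff are all illustrative):

                      Code:
                      * count observations per category, then pool the small ones into code 9999
                      bysort choicevar: gen long catsize = _N
                      gen pooledchoice = choicevar
                      replace pooledchoice = 9999 if catsize < 20
                      tab pooledchoice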



                      • #12
                        300 is indeed a lot of outcome categories. I would consider whether you really need them all or whether some can be collapsed, and whether you have enough data in each category to produce reliable estimates. If you are still over the limit for multinomial logistic regression, your only recourse may be to estimate a series of logistic regression models, one per outcome against a common base category as in #2 (see the sketch below). However, it will still be challenging to present those results coherently.
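
                        A hedged sketch of that series-of-logits loop; occ3 (the outcome) and write (the regressor) are placeholder names for your own variables:

                        Code:
                        * one binary logit per outcome level, each against a fixed base category
                        local base 1
                        levelsof occ3, local(cats)
                        foreach c of local cats {
                            if `c' == `base' continue
                            capture drop y`c'
                            gen byte y`c' = 1 if occ3 == `c'
                            replace y`c' = 0 if occ3 == `base'
                            logit y`c' write
                        }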



                        • #13
                          Andrew Musau You are absolutely right; I should be more specific. The dependent variable is occupation (at the 3-digit level), and I am trying to find the determinants of selection into occupations. I could aggregate the occupations to the 1- or 2-digit level, but I am trying to keep them as detailed as possible.



                          • #14
                            Have you tried using Stata MP?



                            • #15
                              RE: the question that Elif Cengen asked in post #1 (the relationship between -mlogit- of A vs B vs C and -logit- of A vs {B or C}), you will find an answer in the following paper by Cramer and Ridder:

                              Cramer, J. S., & Ridder, G. (1991). Pooling states in the multinomial logit model. Journal of Econometrics, 47(2-3), 267-272. https://doi.org/10.1016/0304-4076(91)90102-J

                              The authors also propose a test that you can apply to evaluate whether it's OK to pool B and C into the same category. You can download a community-contributed command, -crtest-, to execute the test.
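
                              From within Stata, the built-in search command can help you locate and install it (a pointer only, since -crtest-'s syntax is not shown here):

                              Code:
                              search crtest, all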

