Interpreting results of Multinomial Logistic Regression- Panel data

Siege Taker

Join Date: Apr 2020

Posts: 22
#1

12 Sep 2020, 19:31

Hi All,

It's Siege here, and I really need some help.

I am running a multinomial logistic regression model (mlogit) on unbalanced panel data. First, I want to determine the impact of the explanatory variables (7 of them) at each of the 4 distress outcome levels: NST, ST, SST and SSTDelisted. NST is the base outcome, and all explanatory variables are continuous except CEO_DUAL, which is binary. I am not sure whether I am interpreting the mlogit output in the attached snapshot correctly.

For instance, judging from the p-values, CEO_DUAL is not significant for the ST outcome but is significant for the SST and SSTDelisted outcomes. Is this interpretation correct?

Second, judging from the coefficients, although DEBTTA is significant at ST, SST and SSTDelisted, the significance is strongest at ST (4.12), followed by SST (3.5) and then SSTDelisted (2.2). Is this interpretation correct?

Third, when can I conclude that an explanatory variable does not have an impact at an outcome level? Is there anything else I need to know when interpreting these results?

Thank you

Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

14 Sep 2020, 12:39

Having posted 13 times, you should know to follow the FAQ on asking questions – provide Stata code in code delimiters, readable Stata output, and sample data using dataex. You can simplify the problem to the minimum necessary to identify or reveal your difficulty.

There is a lot of debate over the use of significance tests, but if you're using them, then yes, that's how you would interpret them. You need to distinguish between statistical significance and practical significance. The z statistic may give you an indicator of relative statistical significance, although differences once you have a p-value less than .001 are probably not that interesting. The parameter itself does give you an indicator of relative substantive influence. In general, I would strongly recommend using margins after this to look at differences in predicted values for different values of the right-hand-side variables.
2 likes
Comment
Tom Scott

Join Date: Apr 2019

Posts: 266
#3

14 Sep 2020, 22:33

Siege, two additional things: 1) the coefficients only give you the change in the log odds of one outcome compared to your base outcome, so be careful when making comparisons; 2) you cannot conclude that a variable does or does not have an impact on an outcome, only that there is a correlation (or not). Impact implies causality.
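
A small numeric sketch of point 1, written in Python rather than Stata so it can be run anywhere. The coefficients are invented for illustration (they are not from the output in this thread), and the model is assumed to have a single regressor x with all intercepts set to zero. It suggests why caution is needed: a positive coefficient for ST only pins down the log odds relative to NST, and the predicted probability of ST can still fall when another outcome's coefficient is larger.

```python
import math

def mlogit_probs(coefs, x):
    """Predicted probabilities for a multinomial logit with one regressor.

    coefs maps outcome name -> slope on x; the base outcome has slope 0.
    Intercepts are assumed to be 0 to keep the example short.
    """
    scores = {k: math.exp(b * x) for k, b in coefs.items()}
    total = sum(scores.values())
    return {k: s / total for k, s in scores.items()}

# Hypothetical slopes; NST is the base outcome (slope fixed at 0).
coefs = {"NST": 0.0, "ST": 0.5, "SST": 2.0}

p0 = mlogit_probs(coefs, x=0.0)
p1 = mlogit_probs(coefs, x=1.0)

# The log odds of ST versus NST change by exactly the ST coefficient...
log_odds_change = math.log(p1["ST"] / p1["NST"]) - math.log(p0["ST"] / p0["NST"])

# ...while the probability of ST itself is free to move the other way.
print(f"change in log odds, ST vs NST: {log_odds_change:.3f}")
print(f"P(ST) at x=0: {p0['ST']:.3f}, at x=1: {p1['ST']:.3f}")
```

With these made-up numbers the probability of ST goes down even though its coefficient is positive, which is one reason to look at predicted probabilities (for example with margins) rather than reading the direction of an effect off the coefficients alone.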
1 like
Comment
Siege Taker

Join Date: Apr 2020

Posts: 22
#4

07 Oct 2020, 02:05

Thanks Tom and Phil

Does anyone have any insight into why an IV would be significant (by p-value) at one outcome level and not at another? For instance, CEO_DUAL is not significant for the ST outcome but is significant for the SST and SSTDelisted outcomes. Is this interpretation correct, or is there something I am missing?
Comment
Siege Taker

Join Date: Apr 2020

Posts: 22
#5

07 Oct 2020, 02:06

.
Comment
Tom Scott

Join Date: Apr 2019

Posts: 266
#6

07 Oct 2020, 23:15

Siege Taker, that is because the coefficient for CEO_DUAL describes how a change in that variable relates to a change in the log odds of being in a given outcome compared to being in your reference outcome (NST).

For example, imagine that left-handed children are more likely to play baseball than soccer, but not more likely to play football than soccer. With soccer as the reference category, the variable for left-handedness would be significant in the baseball outcome group but not in the football outcome group. If you changed the reference group to football, you could see whether left-handed children are more likely to play baseball than football.
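
The effect of switching the reference category can be sketched with simple arithmetic, here in Python. The coefficients for left-handedness are hypothetical, and the function name rebase is made up for this illustration (it is not a Stata command). The coefficient for baseball versus football is just the difference between the two coefficients estimated against soccer.

```python
def rebase(coefs, new_base):
    """Re-express multinomial logit coefficients against a new base outcome.

    coefs maps outcome -> coefficient relative to the current base
    (whose own coefficient is 0). Subtracting the new base's coefficient
    from every entry gives the coefficients relative to new_base.
    """
    shift = coefs[new_base]
    return {k: b - shift for k, b in coefs.items()}

# Hypothetical log-odds coefficients on left-handedness, soccer as base.
vs_soccer = {"soccer": 0.0, "baseball": 0.8, "football": 0.1}

vs_football = rebase(vs_soccer, "football")
print(vs_football)
```

Only the point estimates carry over this simply. The standard error of a difference also depends on the covariance between the two original estimates, so significance against the new base has to be tested (or the model refitted with a different base outcome) rather than read off the original table.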
Comment
Siege Taker

Join Date: Apr 2020

Posts: 22
#7

08 Oct 2020, 05:35

Tom Scott, thanks. I understand your soccer analogy perfectly and have now read some MLR literature to understand the model. I noticed that these interpretations make more sense for research in social science, science or health, whereas my research is in finance. The variable CEO_DUAL is binary: 1 for companies where the CEO and Chairman are different people, 0 for companies where they are the same person. The only way I can interpret the CEO_DUAL result is this:
"There is no difference in CEO_DUAL practice between ST companies and NST companies (the reference group). However, there is a difference in CEO_DUAL practice between SST companies and NST companies; indeed, SST companies are 5.4 times less likely to practice CEO_DUAL compared to NST companies."
Is this how you would interpret it?
Comment
Tom Scott

Join Date: Apr 2019

Posts: 266
#8

09 Oct 2020, 09:11

Siege Taker you are testing the hypothesis that companies where the CEO and Chairman are different (CEO_DUAL = 1) are more likely to be in one category than another (your reference category). Failing to find a significant relationship does not prove your null hypothesis. So it is incorrect to say that you found no difference. I would take a second to read about hypothesis testing and significance tests to understand what you can say when you find a significant or insignificant effect. As a note, many statisticians recommend abandoning significance testing or at least using p-values more sparingly.

I would say something like "companies where the CEO and Chairman are distinct persons are less likely to be SST companies than NST companies (p < .05). Companies where the CEO and Chairman are distinct, compared to companies where they are not distinct, were no more or less likely to be ST companies than NST companies." Since you did not transform your coefficients from log odds to odds ratios, your interpretation of 5.4 times is incorrect. See the rrr option in mlogit to transform the log odds to relative risk ratios, which are much easier to interpret: https://www.stata.com/manuals13/rmlogit.pdf
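
For what the rrr option does to the reported numbers, here is a minimal sketch in Python. The coefficient and standard error are hypothetical, not taken from the output in this thread, and the helper name to_rrr is my own. A relative risk ratio is the exponentiated coefficient, and its confidence limits are the exponentiated limits of the coefficient's interval.

```python
import math

def to_rrr(coef, se, z=1.96):
    """Convert a log-odds coefficient to a relative risk ratio with a CI.

    z=1.96 gives an approximate 95% interval under a normal approximation.
    """
    rrr = math.exp(coef)
    lower = math.exp(coef - z * se)
    upper = math.exp(coef + z * se)
    return rrr, lower, upper

# Hypothetical coefficient and standard error for CEO_DUAL in the SST equation.
rrr, lower, upper = to_rrr(coef=-0.5, se=0.2)
print(f"RRR = {rrr:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

A coefficient of 0 corresponds to an RRR of 1, and an RRR below 1 means the relative risk of that outcome versus the base outcome is lower when the variable increases by one unit.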
Comment
Siege Taker

Join Date: Apr 2020

Posts: 22
#9

11 Oct 2020, 03:43

Tom Scott Thanks for the correction, the results are in coef not OR. Interpretations I see elsewhere just ignore the variables that are significant.
The hypothesis is that "There is no significant difference between factors that influence different distress states". The distress states (DV) are NST, ST, SST and SSTDelisted and factors = IV.
Since CEO_DUAL is not significant for ST vs NST, can it be concluded that it is not among the factors influencing distress at that level? Will the "test" command add value in testing the hypothesis?
Thanks
Comment
Tiaga Falcao

Join Date: Oct 2020

Posts: 16
#10

17 Oct 2020, 14:03

Siege Taker, how did you use mlogit to work with "unbalanced panel data"? As far as I know, mlogit is not suitable for models with repeated or correlated observations, and its output is most likely biased for panel data.

Last edited by Tiaga Falcao; 17 Oct 2020, 14:07.
Comment
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2158
#11

17 Oct 2020, 20:09

Any method that can be used for cross-sectional data can be used for panel data. In fact, doing so means one need not assume anything about the correlation across time, provided one clusters the standard errors using the cross-sectional identifier.
1 like
Comment
Tiaga Falcao

Join Date: Oct 2020

Posts: 16
#12

17 Oct 2020, 22:04

Originally posted by Jeff Wooldridge

Any method that can be used for cross-sectional data can be used for panel data. In fact, doing so means one need not assume anything about the correlation across time, provided one clusters the standard errors using the cross-sectional identifier.

Thank you, Jeff Wooldridge!

Can we use Stata's mlogit for panel data estimation, then?

If yes, in the case of repeated measurements on the same experimental unit (correlated observations), how does mlogit take care of heterogeneity without "Panel ID" and "Time" inputs (as with logit versus xtlogit)?
Comment
Tom Scott

Join Date: Apr 2019

Posts: 266
#13

17 Oct 2020, 22:30

Tiaga Falcao see: https://www.stata.com/stata-news/news29-2/xtmlogit/
Comment
Tiaga Falcao

Join Date: Oct 2020

Posts: 16
#14

17 Oct 2020, 23:06

Originally posted by Tom Scott

Tiaga Falcao see: https://www.stata.com/stata-news/news29-2/xtmlogit/

Thank you, Tom Scott. I am aware of this Structural Equation Modeling (gsem) workaround as a replacement for xtmlogit. There are two issues with the gsem method: 1) it can't incorporate Granger causality, and 2) it doesn't produce goodness-of-fit measures.

Could you please look at my post here: https://www.statalist.org/forums/for...al-logit-model

Last edited by Tiaga Falcao; 17 Oct 2020, 23:10.
Comment
Tom Scott

Join Date: Apr 2019

Posts: 266
#15

18 Oct 2020, 07:07

Tiaga Falcao shouldn't you be able to calculate goodness-of-fit measures and compare alternative models using 'estat gof' after sem - https://www.stata.com/manuals/semest...f#semestatgof? SEM also allows for cross-lagged models. This article might be useful - https://journals.sagepub.com/doi/ful...94428119847278
Comment
