Interpretation of Multinomial Logit Model

Affan Hameed

Join Date: Oct 2015
Posts: 9

07 Nov 2015, 14:50

Dear Folks,

I am running a multinomial logit model for my research. I am creating categorical variables (dummies) for industries and for advisors.

First of all, how do we calculate the probabilities? Most textbooks calculate them by hand, but newer versions of Stata seem to give the probabilities straight away for us to interpret. If that's the case, how do we interpret them? I used this command for the marginal effects:
margins, dydx(*) atmeans predict(pr outcome(2))

My dependent variable is the choice of performance measure: (Outcome 1) ROA exclusively, (Outcome 2) ROE exclusively, (Outcome 3) ROE and ROA jointly, and (Outcome 4) neither ROE nor ROA.

The X variables consist of the control variables: market capitalization, volatility in ROA, volatility in ROE, industry dummies, and some advisor dummies.

margins, dydx(*) atmeans predict(pr outcome(2))

Variable                            dy/dx      Std. Err.  z       P>|z|   [95% Conf. Interval]
Return on asset volatility (ROAV)   -0.04423   0.026128   -1.69   0.09    -0.09544   0.006978
Return on equity volatility (ROEV)  -0.32427   0.19455    -1.67   0.096   -0.70558   0.057037
Board committee (lb)                0.097636   0.071001   1.38    0.169   -0.04152   0.236794
Nomination committee % (lnc)        -0.00397   0.001804   -2.2    0.028   -0.00751   -0.00043
Leverage (lev)                      -0.00052   0.000323   -1.6    0.109   -0.00115   0.000116
Price to book (ptb)                 0.00661    0.003876   1.71    0.088   -0.00099   0.014206
Market capitalization (lmc)         -0.04825   0.011329   -4.26   0.000   -0.07046   -0.02605
Bain (dummy advisor)                0.095134   0.070328   1.35    0.176   -0.04271   0.232973
McKinsey (dummy advisor)            0.250295   0.058871   4.25    0.000   0.13491    0.36568
BG (dummy advisor)                  0.121386   0.074217   1.64    0.102   -0.02408   0.266848
Towers (dummy advisor)              0.118271   0.059905   1.97    0.048   0.000861   0.235682
Mercer (dummy advisor)              0.591537   0.135875   4.35    0.000   0.325226   0.857848
PwC (dummy advisor)                 0.119648   0.064528   1.85    0.064   -0.00683   0.246121
Food service industry (dummy)       0.046763   0.03126    1.5     0.135   -0.01451   0.108032
Customer service industry (dummy)   0.047415   0.017949   2.64    0.008   0.012236   0.082594
Car's industry (dummy)              -0.20241   0.046227   -4.38   0.000   -0.29301   -0.11181
General retailers (dummy)           0.117296   0.02383    4.92    0.000   0.070591   0.164001
Aerospace industry (dummy)          0.084182   0.018928   4.45    0.000   0.047084   0.12128
Mining industry (dummy)             -0.16032   0.027304   -5.87   0.000   -0.21383   -0.1068
Agriculture industry (dummy)        -0.14285   0.022954   -6.22   0.000   -0.18784   -0.09786
Food court industry (dummy)         -0.13902   0.021214   -6.55   0.000   -0.1806    -0.09744

global ylist /* dependent variable name not shown */
global xlist roev roav lb lnc lev ptb lmc Bain Mckinsey BG /* ... remaining advisor dummies and industry dummies, etc. */
* Multinomial logit model with base outcome the most frequent alternative
mlogit $ylist $xlist

margins, dydx(*) atmeans predict(pr outcome(1))
margins, dydx(*) atmeans predict(pr outcome(2))
margins, dydx(*) atmeans predict(pr outcome(3))
margins, dydx(*) atmeans predict(pr outcome(4))

How does the interpretation of the Bain advisor dummy work? Is it relative to all other advisors? Do we have to find the probabilities ourselves, or does Stata calculate them for us?
I also think we can include industry and time effects together?

The code is shown above.

Thanks,


Clyde Schechter

Join Date: Apr 2014

Posts: 29966
#2

08 Nov 2015, 14:01

How does the interpretation of the Bain advisor dummy work? Is it relative to all other advisors?

No. Assuming you created your dummies in the usual way, or, better yet, used factor variables (help fvvarlist), it is relative to the advisor for whom no dummy was created (and who is coded 0 on all of the dummies).

Do we have to find the probabilities ourselves, or does Stata calculate them for us?

You specified the -dydx()- option to -margins-, so Stata computes marginal effects, as you requested. If you want the probabilities, use -margins, predict(pr outcome(1))- etc. without the -dydx()- option.
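
A minimal sketch of the factor-variable approach and of the two kinds of -margins- output, using Stata's shipped example data (webuse sysdsn1) as a stand-in because the data in this thread are not available. The variable names are those of the example data, not of the original poster's data: insure (three outcomes) plays the role of the performance-measure choice, and site plays the role of advisor.

```stata
* Stand-in data: insure is the categorical outcome, site is the categorical predictor
webuse sysdsn1, clear

* i.site builds the indicators automatically; the lowest level (site 1) is the reference
mlogit insure age male nonwhite i.site

* Predicted probability of outcome 2 (no dydx() option)
margins, predict(outcome(2))

* Predicted probability of outcome 2 at each level of site
margins site, predict(outcome(2))

* Marginal effects of every predictor on the probability of outcome 2
margins, dydx(*) predict(outcome(2))
```

With i.site in the model, each site row of the last table is a contrast against site 1, the level for which no indicator appears.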
Affan Hameed

Join Date: Oct 2015

Posts: 9
#3

08 Nov 2015, 16:34

Dear Clyde,

Thanks for the answer. Yes, I thought so as well. But what if I do not want to include all of the industry or advisor dummies? What should we do then?
I will create a factor variable and then email.

Since Stata calculates the marginal effects for us, we do not have to calculate them by hand. But how do we interpret dy/dx (the change in y for a change in x) for a dummy variable? I hope to hear back.

Thanks,
Regards,
Affan
Clyde Schechter

Join Date: Apr 2014

Posts: 29966
#4

08 Nov 2015, 17:51

I don't know if I understand what you want to do. You can't use only part of the model to make predictions. You can't have a model that includes variables X1, X2, and X3 and then make valid predictions using only the results for X1 and X2.

So if you have decided you're not actually interested in some of the industries or advisors, then you have to leave them out of the model, and then run -margins- on the new, reduced model. Note that this will change the interpretation of the corresponding regression coefficients. When all advisors are included (except one reference category), then the coefficients express effects (in the logit scale) of a given advisor relative to that one reference advisor. If several advisors are omitted, then the coefficients express effects (in the logit scale) of the given advisor relative to the combined outcome for all of the omitted advisors. Unless those omitted advisors are a fairly homogeneous group that can reasonably be treated as if they were a single advisor (in terms of the outcomes they produce), this is a somewhat hazardous way to proceed. It will be hard to know what those coefficients mean in the real world.

In general, the -dydx()- output from -margins- should be interpreted as the expected difference between the probability of the designated outcome when the dummy variable = 1 and that probability when the dummy variable = 0.
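
A minimal sketch of that interpretation, again on Stata's example data (webuse sysdsn1) rather than the data in this thread; male stands in for a dummy such as an advisor indicator. It assumes the dummy is entered with factor-variable notation (i.male). If a 0/1 variable is entered without the i. prefix, -margins- treats it as continuous and reports a derivative instead of the 1-versus-0 difference.

```stata
* Stand-in data; male is a 0/1 indicator entered as a factor variable
webuse sysdsn1, clear
mlogit insure age i.male nonwhite i.site

* Predicted probability of outcome 2 at male = 0 and at male = 1,
* with the other predictors held at their estimation-sample means
margins male, atmeans predict(outcome(2))

* Discrete change for male: the difference between the two probabilities above
margins, dydx(male) atmeans predict(outcome(2))
```

The single row reported by the second -margins- command should match the male = 1 probability minus the male = 0 probability from the first.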

I will create a factor variable and then email.

Please do not email me about this; whatever questions you have should be posted here on the forum. I have participated in Statalist (including its predecessor list serve) for over 20 years now. If I have developed expertise in using Stata, it is in part because of all that I learned from reading the questions and answers posted by others. I want to give others the same opportunity to learn Stata that I had. So it is my policy not to respond to private requests (whether via email or through the forum's messaging system) for advice on using Stata. I believe that most of the more senior, frequent responders on the forum feel the same way.
Affan Hameed

Join Date: Oct 2015

Posts: 9
#5

08 Nov 2015, 18:36

Dear Clyde,

Thanks for getting back again. I never meant to email you, only to post on the Stata forum. I understand that we cannot leave some of the advisor dummies out of the model.

But imagine I have included all of the advisor dummies. What is a good interpretation of the Towers Watson advisor variable?

Since this is a nonlinear model, which is better for interpreting changes (as we would in a linear model): the marginal effects or the predicted margins?

Does the Towers Watson variable imply that firms using Towers Watson as an advisor have a 12% higher chance of choosing Outcome 2 (ROE) than the base outcome (ROE and ROA jointly), relative to all other advisors or relative to the one left out of the model? Since advisor is a dummy, how do we interpret it relative to the others? Let me put in all the dummies and post the output here.
You have understood what I am referring to.

Thanks,
Regards,
Affan

Last edited by Affan Hameed; 08 Nov 2015, 18:46.
Clyde Schechter

Join Date: Apr 2014

Posts: 29966
#6

08 Nov 2015, 19:41

But imagine I have included all of the advisor dummies. What is a good interpretation of the Towers Watson advisor variable?

The regression coefficient represents the expected difference in the log odds of the selected outcome between observations with tower watson advisor dummy = 1 and those observations with the reference (omitted) advisor, presented in the log-odds metric. As this is a non-linear model, in the probability metric, the expected difference between observations with tower watson advisor dummy = 1 and the reference (omitted) advisor will depend on the values of the other predictor variables. The tower watson advisor row of the output from your -margins, dydx(*) atmeans- command gives you the expected difference in probability of the selected outcome, conditional on all of the other predictor variables being set to their estimation sample means, between observations with tower watson advisor and observations with the reference (omitted) advisor. Everything you get from this model will be relative to the omitted advisor.
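
The two metrics described above can be written out as follows. This is the standard multinomial logit formulation, not something taken from the poster's output; the symbols are chosen here for illustration: j is the selected outcome, b the base outcome, d the advisor dummy, and \bar{x} the sample means of the other predictors.

```latex
\begin{align*}
\Pr(y = j \mid x) &= \frac{\exp(x'\beta_j)}{\sum_{k} \exp(x'\beta_k)}, \qquad \beta_b = 0 \\
\ln \frac{\Pr(y = j \mid x)}{\Pr(y = b \mid x)} &= x'\beta_j \\
\Delta_j &= \Pr(y = j \mid d = 1, \bar{x}) - \Pr(y = j \mid d = 0, \bar{x})
\end{align*}
```

The coefficient on d in \beta_j is the difference in the log odds of outcome j versus outcome b and does not depend on x. The quantity \Delta_j, which is what -margins, dydx() atmeans- reports for a factor-variable dummy, does depend on the values at which the other predictors are held.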

If what you want is relative to all other advisors, then you need a different model in which you include the tower watson advisor dummy, but no other advisor dummies. In that model, all of your coefficients and marginal effects will contrast watson advisor vs all other advisors combined. Looking at the output you posted in #1, if that is output of the actual analysis, it looks like tower watson is kind of in the middle of the pack, with Mercer having a much bigger effect, the omitted advisor having a substantially smaller one, and the others all rather close to each other. That being the case, depending on the frequencies with which these advisors occur in the data, it might well turn out that contrasting tower watson with all others in this way will get you a result close to zero.
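
A minimal sketch of that alternative model, using Stata's example data (webuse sysdsn1) as a stand-in. Here site 2 plays the role of the single advisor of interest, and site2 is a hypothetical variable name introduced only for this illustration.

```stata
* Stand-in data; site plays the role of advisor
webuse sysdsn1, clear

* One indicator: site 2 versus all other sites combined (missing stays missing)
generate byte site2 = (site == 2) if !missing(site)

* Only this one indicator enters the model, so the reference group is "every other site"
mlogit insure age male nonwhite i.site2

* Difference in the probability of outcome 2: site 2 versus all other sites, others at means
margins, dydx(site2) atmeans predict(outcome(2))
```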
Affan Hameed

Join Date: Oct 2015

Posts: 9
#7

08 Nov 2015, 20:07

Many thanks for the detailed answer. So if we want all the dummies, we use a factor variable. It seems better to include all the dummies and let Stata leave one out. The only issue is which interpretation is better: can we comment on the size of the coefficient in terms of dy/dx for discrete variables such as dummies? If yes, how do we interpret it? Is it -margins- that gives the better answer, or the dy/dx marginal effects? I will post the new output, as I am having a hard time commenting on the variables.

Not only do we describe Towers Watson relative to the omitted advisor, but since it's a multinomial logit model, do we also talk about choosing ROE relative to ROA and ROE jointly as the measure being used? Again, for the coefficient, which is the better interpretation?

Thanks,
Regards,
Affan

Last edited by Affan Hameed; 08 Nov 2015, 20:10.
Clyde Schechter

Join Date: Apr 2014

Posts: 29966
#8

08 Nov 2015, 20:54

The output of -margins- gives you predicted probabilities. The output of -margins, dydx()- gives you marginal effects. I can't tell you which one is a "better answer" because they are answers to different questions, and I don't know what question you are seeking to answer. It may also be that you have a question that could be answered equally well with either predicted probabilities or marginal effects, and in that case your choice might be based on what will be more familiar to your particular audience.

If you omit the -predict(outcome(2))- and similar options from your -margins- and -margins, dydx()- commands, Stata's output tables will include the predicted probabilities for each outcome, as well as the marginal effects on each outcome. I personally find that easier to work with than separate tables for each outcome. The marginal effects Stata computes are the incremental effect of the predictor on the probability of that particular outcome and are not relative to a baseline outcome, nor do they take into account the other outcomes. The predicted probabilities are, of course, not relative to anything.
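
A minimal sketch of the all-outcomes form, using Stata's example data (webuse sysdsn1) as a stand-in. To my understanding, reporting every outcome in one table requires Stata 14 or later; earlier versions default to a single outcome.

```stata
* Stand-in data: insure has three outcomes
webuse sysdsn1, clear
mlogit insure age male nonwhite i.site

* No predict() option: predicted probabilities for every outcome in one table
margins

* No predict() option: marginal effects of site on the probability of every outcome
margins, dydx(site)
```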
Affan Hameed

Join Date: Oct 2015

Posts: 9
#9

11 Nov 2015, 19:41

I understand, and I will comment on that. I have run the model but have a question.

I am not sure whether the multinomial logit model is the right model to use.

Our dependent variable is the type of performance measure used in a given year. Let's look at the data description. The components of compensation packages (types of equity compensation) consist of options, restricted stock shares, matching plans, and others. Not all firms use all four types of equity compensation in a given year; some firms use one, two, or three components. There are performance conditions within these components, and if a firm meets the performance target, the executives receive the reward and the equity vests.
In our data, some firms give only restricted stock shares, so they have only one performance condition. That condition can be one of the following: return on assets (ROA) exclusively, return on equity (ROE) exclusively, ROA and ROE jointly, or neither ROA nor ROE.

Some firms use two components of equity compensation in a year, for example options and restricted stock shares.
The performance conditions for restricted stock shares and options are of the same kind: ROA, ROE, ROE and ROA jointly, or neither ROE nor ROA.
The problem arises when a firm sets different performance conditions in two different components of equity compensation in the same year. For example, in restricted stock shares the firm uses ROE only, while in options it uses ROE and ROA jointly: two different performance conditions within two different long-term incentive plans for the same firm in the same year.

We are excluding these firms from the sample (they are few in number). In addition, I think the presence of the joint ROE and ROA category violates the independence of irrelevant alternatives (IIA) assumption. I think nested logit is not possible, since we do not have any alternative-specific variables. That's what I think; any comments? Thanks in advance.

Last edited by Affan Hameed; 11 Nov 2015, 20:27.
