mixlogit

EL KIMON

Join Date: May 2016
Posts: 73

23 Jun 2016, 09:53

Dear all,
I need help interpreting the output below, because I am a new user of this command.
I am estimating a mixed logit model with the command mixlogit in Stata 12, as done in Arne Hole's paper in the Stata Journal (2007). I have individuals observed in a specific year, and each of them has 3 discrete choices of working hours (h_c).
I have the following results:

Code:

 
mixlogit didep h_c, rand(disposable_income) id(idperson) nrep(50) group(strata)

Iteration 0:   log likelihood = -12133.284  (not concave)
Iteration 1:   log likelihood = -11902.372  (not concave)
Iteration 2:   log likelihood = -8524.7568  (not concave)
Iteration 3:   log likelihood = -6842.8411  (not concave)
Iteration 4:   log likelihood =  -5643.476  
Iteration 5:   log likelihood = -5486.2297  
Iteration 6:   log likelihood = -5471.7848  
Iteration 7:   log likelihood = -5471.6168  
Iteration 8:   log likelihood = -5471.6168  

Mixed logit model                                 Number of obs   =      17319
                                                  LR chi2(1)      =       0.00
Log likelihood = -5471.6168                       Prob > chi2     =     0.9837

------------------------------------------------------------------------------
       didep |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Mean         |
         h_c |  -.0922044   .0028096   -32.82   0.000    -.0977112   -.0866976
disposable~e |    .000869   .0000741    11.72   0.000     .0007237    .0010143
-------------+----------------------------------------------------------------
SD           |
disposable~e |   1.80e-06   .0000881     0.02   0.984    -.0001709    .0001745
------------------------------------------------------------------------------

I included only a few variables because I first want to understand how it works.
Could someone explain the meaning of the results? Should I go further and get the marginal effects?

Thanks,
Best regards,
Kimon


Clyde Schechter

Join Date: Apr 2014

Posts: 30119
#2

23 Jun 2016, 10:42

I have never used -mixlogit-, so I will not be able to give you a definitive answer here. But unless -mixlogit- handles things very differently from other Stata commands (and I see nothing in its -help- file to suggest that it does), you have improperly specified h_c if it is supposed to be a 3-level discrete variable. Stata estimation commands generally will interpret your specification as wanting h_c to be treated as a continuous variable. In current Stata commands, to specify a discrete variable you would use i.h_c instead. I gather that -mixlogit- was written for version 12, and I cannot tell from the help file whether it supports factor-variable notation. If it does, you should use i.h_c. If it does not, then you should generate indicator ("dummy") variables for two of the levels of h_c and use them instead of h_c itself in the model.
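
If -mixlogit- turns out not to accept factor-variable notation, the indicator variables could be built by hand along the following lines. This is only a sketch: the names hc_1, hc_2 and hc_3 are made up for illustration, and it assumes h_c really takes exactly three distinct values.

```stata
* Create one 0/1 indicator per level of h_c.
* With the stub hc_, tabulate names them hc_1, hc_2, hc_3 (illustrative names).
tabulate h_c, generate(hc_)

* Enter two of the three indicators; the omitted level (here hc_1) is the reference.
mixlogit didep hc_2 hc_3, rand(disposable_income) id(idperson) nrep(50) group(strata)
```

Which level is left out only changes the reference category, not the fit of the model.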

That said, the overall interpretation of the results of the model you do have is that the average slope of the log-odds of didep = 1 vs disposable income is 0.000869. But each individual (id) has his or her own slope, and those are modeled as being normally distributed around that average, the distribution having a standard deviation of 1.8 X 10^-6.

If my concerns in the first paragraph are correct, however, you should not rely on the interpretation I have given in the second paragraph, as the results will likely change when you correct the specification of h_c in the model.

I hope this is helpful.

EL KIMON

Join Date: May 2016

Posts: 73
#3

23 Jun 2016, 23:04

Dear sir,
Thanks for answering. I made a mistake in my writing. The variable h_c is a continuous variable; I just wanted to say that my model has 3 discrete choices of hours.
My question is: why does this model report mean values and SDs in its output, instead of looking like regular regression results?
Thanks.

Clyde Schechter

Join Date: Apr 2014

Posts: 30119
#4

24 Jun 2016, 07:05

Because, unlike ordinary regression, this is a random slopes model. In an ordinary logistic regression such as

Code:

logit didep h_c

you are trying to find the best possible value of b1 to fit the equation log odds(didep) = b0 + b1*h_c. In particular, you are looking only for a single value of b1 which applies to all people. Otherwise put, b1 is a constant.

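In the -mixlogit- model, the coefficient of a variable listed in rand() is no longer a constant. Keeping the same notation, you are trying to fit log odds(didep) = b0 + r1*x, where x is the rand() variable (disposable_income in your command) and r1 is a normally distributed random variable that differs from person to person, and you are trying to find the mean and standard deviation of that distribution. So that is what -mixlogit- gives you: estimates of the mean and standard deviation of that random variable. Variables outside rand(), such as h_c in your command, still get a single constant coefficient.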
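
To see the difference side by side, one could fit the constant-slope counterpart of the model from #1 with -clogit- and then the random-slope version with -mixlogit-. This is a sketch that reuses the variable names from #1 and assumes the data are in the long layout (one row per alternative within each choice situation) that both commands expect.

```stata
* Constant slopes: one coefficient per variable, shared by everyone
clogit didep h_c disposable_income, group(strata)

* Random slope on disposable_income: the output reports a Mean and an SD for it,
* while h_c keeps a single constant coefficient
mixlogit didep h_c, rand(disposable_income) group(strata) id(idperson) nrep(50)
```

If the SD is close to zero, the Mean coefficients from the second command should be close to the -clogit- coefficients from the first.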

I think before you proceed further, you need to get familiar with random effects models. The [ME] manual that comes with your Stata installation might be a good place to start.

Last edited by Clyde Schechter; 24 Jun 2016, 07:06. Reason: Correct typo.

Anat Tchetchik

Join Date: Jun 2014

Posts: 217
#5

24 Jun 2016, 17:25

El, note that the SD is not the standard error of the coefficient of disposable income; in this case, it is its standard deviation in the population. By assigning this variable to the rand( ) option, you are assuming that its parameter is normally distributed in the population, and you want to know to what extent it is dispersed around the mean (.000869). In your case, the result implies that the SD of disposable income is not significant, meaning that, given your data, there is no point in treating this variable as a random parameter.

EL KIMON

Join Date: May 2016

Posts: 73
#6

25 Jun 2016, 09:38

First of all, thanks to both of you for the replies.
For the fixed variable h_c, the coefficient suggests that a one-unit increase is associated with a 0.09-unit decrease in the expected log odds of didep. Although the table labels it "Mean", is this interpretation correct (because it is one value that applies to all people)?

For the random variable, the SD is not significant. Is the mean coefficient still interpreted as: a one-unit increase is associated with a 0.0008 increase in the expected log odds of didep? And does the SD in this model tell me whether I was right to choose this specific variable as random? (So, in my case, the disposable income variable should not be treated as random.)
What if it were significant?
Thank you very much.

Clyde Schechter

Join Date: Apr 2014

Posts: 30119
#7

25 Jun 2016, 10:27

It does not make sense to look at the "statistical significance" of this SD value. That is a test of the null hypothesis that the standard deviation of the random slope is zero. But a standard deviation is essentially never zero, except when you are dealing with a constant. So this is the ultimate straw-man hypothesis test. What is more useful is to compare the standard deviation to the mean value. In your case, the standard deviation (1.80e-06) is nearly three orders of magnitude smaller than the mean value (.000869): about 0.2% of it. The upper 95% confidence limit of the standard deviation (.0001745) is about one fifth of the mean value. So what this tells us is that the between-person (or whatever these units of analysis are) variation in the slope of disposable income is probably around two tenths of one percent of the mean value of that slope, and is unlikely to be more than about 20 percent of it. I would interpret the point estimate as saying that the variation is negligible, although the confidence interval is wide enough that modest variation cannot be ruled out, and that a model that used a constant slope rather than a random slope would probably be just as good for most purposes.
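
The comparison of the SD with the mean is simple arithmetic on the numbers in the output from #1, and it can be done in Stata directly. The last line is a sketch that assumes -mixlogit- labels its two equations Mean and SD, as the output table suggests, and that it is run immediately after estimation.

```stata
* Ratio of the SD point estimate to the mean coefficient
display 1.80e-06 / .000869

* Ratio of the upper 95% confidence limit of the SD to the mean coefficient
display .0001745 / .000869

* Same point-estimate ratio from the stored results (equation names assumed).
* abs() is used because the sign of an estimated SD carries no meaning.
display abs([SD]_b[disposable_income]) / abs([Mean]_b[disposable_income])
```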

Anat Tchetchik

Join Date: Jun 2014
Posts: 217
#8

25 Jun 2016, 14:33

Clyde, thank you for the clarification. Indeed, the standard deviation is essentially never zero, but what would be the correct inference if the SD estimate is very imprecise (a large standard error relative to the estimate, as in the example below for the variable 'morethan10') and therefore not significant?

Mixed logit model                                 Number of obs   =       5562
                                                  LR chi2(6)      =     803.32
Log likelihood = -1370.01                         Prob > chi2     =     0.0000

------------------------------------------------------------------------------
      choice |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Mean         |
      pricea |      0.000      0.000  -12.010   0.000        0.000       0.000
      luxury |      0.323      0.109    2.970   0.003        0.110       0.537
   maxspeed1 |      0.005      0.002    2.230   0.025        0.001       0.009
 drivingange |      0.000      0.005   -0.020   0.984       -0.010       0.009
    electric |     -0.995      0.995   -1.000   0.317       -2.944       0.955
      hybrid |     -0.076      0.193   -0.390   0.694       -0.454       0.303
    upgraded |      0.353      0.106    3.330   0.001        0.145       0.561
         few |      0.421      0.151    2.790   0.005        0.125       0.716
  morethan10 |      0.594      0.149    3.970   0.000        0.301       0.887
chargingtime |     -0.106      0.066   -1.620   0.105       -0.235       0.022
-------------+----------------------------------------------------------------
SD           |
    electric |      3.228      0.305   10.570   0.000        2.629       3.827
      hybrid |      2.833      0.208   13.600   0.000        2.425       3.241
    upgraded |      0.437      0.188    2.330   0.020        0.069       0.806
         few |     -0.586      0.258   -2.270   0.023       -1.092      -0.080
  morethan10 |     -0.021      0.354   -0.060   0.952       -0.716       0.673
chargingtime |      0.074      0.049    1.520   0.129       -0.022       0.170
------------------------------------------------------------------------------


EL KIMON

Join Date: May 2016

Posts: 73
#9

25 Jun 2016, 14:57

So, in my case, according to the SD, is it better to treat the disposable income variable as fixed?

Anatmanes, in your example output, what about the variable hybrid? Its mean is not significant.

Clyde Schechter

Join Date: Apr 2014

Posts: 30119
#10

25 Jun 2016, 15:06

I would take that to mean that for whatever reason, the data do not provide an adequately precise estimate of the group level variation in the morethan10 slope to draw any conclusions about it. One possible reason is that it may be difficult in the data to distinguish effects of variation in morethan10 slope from variation in something else. (That something else might be at either level of the model.)

Anat Tchetchik

Join Date: Jun 2014

Posts: 217
#11

25 Jun 2016, 17:27

Thank you very much Clyde!