standard error of 8280 in multinomial logit

Jasmine Xu

Join Date: Jul 2019

Posts: 33
#1

standard error of 8280 in multinomial logit

30 Dec 2019, 10:29

Hi,

I am analysing my data using multinomial logit. First, sorry that I cannot post my data and full results here.

Let's call the dependent variable "P3". I have several independent variables: "treatment", "P1", "age", "iq", "female", "mistakes", and "major". The one I'm interested in is "treatment", and I think that "P1" has to be included in the regression as a control. "P3" and "P1" measure the same thing before and after the treatment, and each has 7 categories. The sample size is small: 157, with two missing values in "female", so N = 155.

My problem is that some coefficients have very large standard errors, such as 8280 for one category of P1. Almost all of these large standard errors occur for categories of P1.

Code:

             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          P1 |
           1 |   2.514913   .7466968     3.37   0.001     1.051414    3.978412
           2 |   1.361554   1.327665     1.03   0.305    -1.240622    3.963729
           3 |   .9366774   8280.394     0.00   1.000    -16228.34    16230.21
           4 |  -.3395546   4124.867    -0.00   1.000     -8084.93    8084.251
          -1 |  -.6942527   1.267946    -0.55   0.584    -3.179382    1.790877
          -2 |   19.35054   11956.97     0.00   0.999    -23415.88    23454.58

I looked at the cross-table of P1 and P3, and found there are some empty cells. The partial table looks like this.

Code:

           |                       P3
        P1 |   -2    -1     0     1     2     3     4 |  Total
-----------+-------------------------------------------+-------
         3 |    0     0     0     0     1     2     2 |      5
         4 |    0     0     0     0     0     3    13 |     16
-----------+-------------------------------------------+-------
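For anyone who wants to check a cross-table like this programmatically, here is a minimal sketch in Python (not Stata). It uses only the two rows shown above; the function name `sparse_cells` and its `threshold` parameter are made up for illustration.

```python
# Counts copied from the partial cross-table above (rows P1 = 3 and P1 = 4).
P3_LEVELS = [-2, -1, 0, 1, 2, 3, 4]
COUNTS = {
    3: [0, 0, 0, 0, 1, 2, 2],
    4: [0, 0, 0, 0, 0, 3, 13],
}

def sparse_cells(counts, levels, threshold=0):
    """Return (P1, P3, count) for every cell whose count is <= threshold."""
    return [
        (p1, p3, n)
        for p1, row in counts.items()
        for p3, n in zip(levels, row)
        if n <= threshold
    ]

empty = sparse_cells(COUNTS, P3_LEVELS)
print(len(empty), "empty cells out of", len(COUNTS) * len(P3_LEVELS))
for p1, p3, _ in empty:
    print(f"P1 = {p1}, P3 = {p3}: no observations")
```

Raising `threshold` also flags cells that are non-empty but very thin, which can be a useful check when the sample is small.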

I am wondering whether these empty cells cause the enormous standard errors. I know that the sample size is very small and that the number of independent variables is large relative to the sample size. Should I switch to -firthlogit-?

Thanks for any help!!

Last edited by Jasmine Xu; 30 Dec 2019, 10:38.
Clyde Schechter

Join Date: Apr 2014

Posts: 30356
#2

30 Dec 2019, 10:36

Yes, the empty cells, and for that matter the cells with only 1, 2, or 3 observations, are the cause of your large standard errors.
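To illustrate the mechanism, here is a minimal sketch in Python (not Stata, and not the poster's data). It fits a plain binary logit, not a multinomial one, by Newton-Raphson to two hypothetical 2x2 tables: one with every cell filled and one with an empty cell. The tables and the names `information` and `fit_logit` are assumptions made up for this example.

```python
import math

def information(rows, b0, b1):
    """Score and information terms for logit P(y=1) = b0 + b1*x (grouped data)."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, successes, failures in rows:
        n = successes + failures
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        resid = successes - n * p       # observed minus expected successes
        weight = n * p * (1.0 - p)      # binomial variance in this group
        g0 += resid
        g1 += resid * x
        h00 += weight
        h01 += weight * x
        h11 += weight * x * x
    return g0, g1, h00, h01, h11

def fit_logit(rows, iterations):
    """Run a fixed number of Newton-Raphson steps; return (b0, b1, se_b1)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0, g1, h00, h01, h11 = information(rows, b0, b1)
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard error of b1 from the inverse information at the final estimate.
    _, _, h00, h01, h11 = information(rows, b0, b1)
    se_b1 = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    return b0, b1, se_b1

# Hypothetical tables: (x, successes, failures) for x = 0 and x = 1.
FILLED = [(0, 5, 5), (1, 4, 1)]   # every cell has at least one observation
EMPTY = [(0, 5, 5), (1, 5, 0)]    # no failures at all when x = 1

for label, rows in (("filled", FILLED), ("empty cell", EMPTY)):
    for iterations in (5, 10, 20):
        _, b1, se = fit_logit(rows, iterations)
        print(f"{label:10s} iterations={iterations:2d}  b1={b1:8.3f}  se={se:12.3f}")
```

With the filled table the estimate settles quickly. With the empty cell the likelihood keeps improving as the coefficient grows, so each extra iteration pushes the coefficient further out and the standard error grows with it. The large coefficient and standard error that software reports in this situation mostly reflect where the iterations happened to stop.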

As far as I know, -firthlogit- does not support multinomial logistic regression. Even if it did, or if you found some other program to do penalized maximum likelihood estimation for multinomial logistic regression, your data are not remotely adequate to the task you are trying to accomplish. The data are too small and too sparse to support this many subdivisions into categories. Either get more data, or better data, or simplify your analysis to a suitably small number of categories and variables.
Jasmine Xu

Join Date: Jul 2019

Posts: 33
#3

30 Dec 2019, 10:40

Thanks for your prompt reply Clyde!

The variables P1 and P3 are actually ordered if I exclude the -1 and -2 categories. Would it be appropriate to use ordered logit with my data if I excluded these two categories?
Clyde Schechter

Join Date: Apr 2014

Posts: 30356
#4

30 Dec 2019, 10:46

That would be a bit better, but I doubt the results would be truly satisfactory. You can try it, but don't be surprised if the standard errors are still very large.