  • Standard error of 8280 in multinomial logit

    Hi,

    I am analysing my data using multinomial logit. First, I apologize that I cannot post my data and full results here.

    Let me call the dependent variable "P3". I have several independent variables: "treatment", "P1", "age", "iq", "female", "mistakes", and "major". The one I'm interested in is "treatment", and I think that "P1" has to be included in the regression as a control. "P3" and "P1" measure the same thing before and after the treatment, and each has 7 categories. The sample size is small: 157, with two missing values in "female", so N = 155.

    I am running into a problem: some coefficients have very large standard errors, such as 8280 for one category of P1. Almost every such large standard error occurs for a category of P1.

    Code:
                 |      Coef.   Std. Err.       z   P>|z|     [95% Conf. Interval]
              P1 |
               1 |   2.514913    .7466968    3.37   0.001     1.051414    3.978412
               2 |   1.361554    1.327665    1.03   0.305    -1.240622    3.963729
               3 |   .9366774    8280.394    0.00   1.000    -16228.34    16230.21
               4 |  -.3395546    4124.867   -0.00   1.000     -8084.93    8084.251
              -1 |  -.6942527    1.267946   -0.55   0.584    -3.179382    1.790877
              -2 |   19.35054    11956.97    0.00   0.999    -23415.88    23454.58
    I looked at the cross-tabulation of P1 and P3 and found that there are some empty cells. Part of the table looks like this.

    Code:
    
               |                                  P3
            P1 |        -2         -1          0          1          2          3          4 |     Total
    -----------+-----------------------------------------------------------------------------+----------
             3 |         0          0          0          0          1          2          2 |         5
             4 |         0          0          0          0          0          3         13 |        16
    -----------+-----------------------------------------------------------------------------+----------
    I am wondering whether these empty cells cause the enormous standard errors. I know that the sample size is very small and that the number of independent variables is large relative to the sample size. Should I switch to -firthlogit-?

    Thanks for any help!!
    Last edited by Jasmine Xu; 30 Dec 2019, 10:38.

  • #2
    Yes, the empty cells, and for that matter, the cells with 1, 2, or 3 observations, are the cause of your large standard errors.
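
    To illustrate why an empty cell produces this kind of output, here is a minimal Python sketch (not the Stata model from this thread). It assumes a much simpler setting: a binary outcome, one binary predictor, and made-up counts. The function name `newton_logit` and all counts are hypothetical. When one group has no failures, the likelihood keeps improving as the coefficient grows, so each Newton-Raphson step pushes the estimate further out and the reported standard error grows with it.

```python
import math


def newton_logit(s0, f0, s1, f1, iterations):
    """Fit logit(p) = b0 + b1*x for a binary x by Newton-Raphson.

    s0, f0: successes and failures when x = 0.
    s1, f1: successes and failures when x = 1.
    Returns (b0, b1, se_b1) after the given number of iterations.
    """
    n0, n1 = s0 + f0, s1 + f1

    def weights(b0, b1):
        # Fitted probabilities and information weights n*p*(1-p) per group.
        p0 = 1.0 / (1.0 + math.exp(-b0))
        p1 = 1.0 / (1.0 + math.exp(-(b0 + b1)))
        return p0, p1, n0 * p0 * (1.0 - p0), n1 * p1 * (1.0 - p1)

    b0 = b1 = 0.0
    for _ in range(iterations):
        p0, p1, w0, w1 = weights(b0, b1)
        r0 = s0 - n0 * p0  # observed minus expected successes, x = 0
        r1 = s1 - n1 * p1  # observed minus expected successes, x = 1
        # Closed-form Newton step for this two-parameter model.
        b0 += r0 / w0
        b1 += r1 / w1 - r0 / w0

    # Standard error of b1 from the inverse information matrix.
    _, _, w0, w1 = weights(b0, b1)
    se_b1 = math.sqrt(1.0 / w0 + 1.0 / w1)
    return b0, b1, se_b1


if __name__ == "__main__":
    # Hypothetical counts: every cell populated, so the fit converges.
    print("all cells filled:", newton_logit(5, 5, 3, 2, 20))
    # Hypothetical counts: no failures when x = 1 (an empty cell).
    for k in (5, 10, 20):
        print("empty cell,", k, "iterations:", newton_logit(5, 5, 5, 0, k))
```

    With all cells filled the estimate settles quickly. With the empty cell, the coefficient and its standard error keep increasing with every additional iteration, which is the same pattern as a coefficient of 19.35 with a standard error of 11956.97.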

    As far as I know, -firthlogit- does not support multinomial logistic regression. Even if it did, or if you found some other program to do penalized maximum likelihood estimation for multinomial logistic regression, your data are not remotely adequate to the task you are trying to accomplish. The data are too small and too sparse to support this many subdivisions into categories. Either get more data, or better data, or simplify your analysis to a suitably small number of categories and variables.
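
    A rough back-of-the-envelope count shows how thin the data are. This Python sketch assumes that "treatment", "female", and "major" each enter as a single term; that is not stated in the thread, and if any of them has more than two levels the count only goes up. The helper name `mlogit_parameter_count` is hypothetical.

```python
def mlogit_parameter_count(n_outcomes, terms_per_equation):
    """A multinomial logit fits one equation per non-base outcome category."""
    return (n_outcomes - 1) * terms_per_equation


# Terms per equation. P1 has 7 categories, so it contributes 6 indicators.
# ASSUMPTION: treatment, female, and major are single terms (a lower bound).
terms = {
    "treatment": 1,
    "P1": 6,
    "age": 1,
    "iq": 1,
    "female": 1,
    "mistakes": 1,
    "major": 1,
    "constant": 1,
}

terms_per_equation = sum(terms.values())
total_parameters = mlogit_parameter_count(7, terms_per_equation)  # P3 has 7 categories
n_obs = 155

print("parameters per equation:", terms_per_equation)
print("total parameters:", total_parameters)
print("observations per parameter:", round(n_obs / total_parameters, 2))
```

    Under these assumptions the model estimates dozens of parameters from 155 observations, before even considering how unevenly those observations are spread across the P1 by P3 cells.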



    • #3
      Thanks for your prompt reply Clyde!

      The variables P1 and P3 are actually ordered if I exclude the -1 and -2 categories. Is it appropriate to use ordered logit with the data I have if I exclude these two categories?



      • #4
        That would be a bit better, but I doubt the results would be truly satisfactory. You can try it, but don't be surprised if the standard errors are still very large.

