  • Error using Femlogit: "Hessian is not negative semidefinite r(430);"

    Dear all,

    For my thesis I analyse the effects of mental and physical health problems on full-time work, part-time work, and retirement decisions in Europe using the Survey of Health, Ageing and Retirement in Europe (SHARE) data. SHARE is a panel dataset on individual health and socio-economic characteristics. In my model, the dependent variable takes three unordered outcomes: full-time work (coded 1), part-time work (coded 2), and retirement (coded 3). The deterministic part is a function of a set of socio-demographic characteristics and lagged health indicators. I also include dummies indicating whether individuals have reached the age at which they can receive retirement benefits.

    Since my data are a panel, I am using the femlogit command in Stata to control for unobserved heterogeneity in a multinomial logit model with fixed effects. However, when I execute the femlogit command, I get the error "Hessian is not negative semidefinite r(430);". I have tried executing the command with different sets of variables, and it turns out that if I omit the variables age and age squared (age age2), I no longer receive the error. However, age is an important variable, and I need to keep it in my regression.

    The "r(430)" link in the error message points to a convergence problem, but I do not know how to get around it. I have tried different specifications, and the error does not seem to be related to collinearity: when I drop all other variables and regress my dependent variable on age alone, using the command "femlogit hrsstatus age, baseoutcome(3)", the error still occurs. With only one variable in the model, collinearity should not be the issue.

    Is there anything I can do about this?

    Thanks in advance!
    Lieke

  • #2
    I'd probably start by looking at the age data. Can you run
    Code:
    bro hrsstatus age
    and also
    Code:
    twoway scatter hrsstatus age
    (encoding hrsstatus if it isn't already numeric)?

    Not a solution, but a first step.



    • #3
      [Image: scatter.png, scatterplot of hrsstatus against age]


      This is the scatterplot.
      However, I do not know how this would help with the error?



      • #4
        Have you tried age alone? If yes, and you get a solution, then there might be a problem with the scaling of your variables. Try dividing age by 10 (or 100) and generating age squared from this new variable. Maximum likelihood can in general be sensitive to the scaling of your variables and to outliers.
        So also check whether there are outliers or absurd observations.
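
        The rescaling and outlier check suggested here could be sketched roughly as follows. This is only an illustration: the names age10 and age10sq are hypothetical, and it assumes age is measured in years and that the panel has already been declared as in the original post.
        Code:
        * Inspect the distribution of age for outliers or implausible values
        summarize age, detail
        * Rescale age and build the squared term from the rescaled variable
        generate double age10   = age / 10
        generate double age10sq = age10^2
        * Re-run the model with the rescaled regressors
        femlogit hrsstatus age10 age10sq, baseoutcome(3)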

        A Hessian that is not negative definite could be related either to missing values in the Hessian or to very large values (in absolute terms). Try setting the maximize options so that you get a trace of the parameters, the gradient and the Hessian, to see whether you end up in a region with absurd parameter values.

        Aside: femlogit is a user-written package, and the Statalist FAQ asks that you mention this; see FAQ #12.1.



        • #5
          Originally posted by Christophe Kolodziejczyk (#4)
          Thank you for your post, and my apologies for not mentioning that femlogit is user-written.

          I have indeed tried a regression with age alone, but I still get the same error. That is, I drop all other variables and regress my dependent variable on age alone using the command "femlogit hrsstatus age, baseoutcome(3)", and the error occurs again. So I do not think the problem lies in the scaling of my variables. Also, as far as I can see, there are no outliers or absurd observations.

          Secondly, I tried the difficult option of femlogit, but with this option I get the same error. I do not know how to get a trace of the Hessian when using the femlogit command. Do you know how I can see the Hessian in Stata with femlogit?



          • #6
            Moreover, I have created three dummies which indicate whether a respondent works full-time, part-time, or is retired:

            1. hrszerotime: 1 if retired, 0 otherwise
            2. hrsfulltime: 1 if working full-time, 0 otherwise
            3. hrsparttime: 1 if working part-time, 0 otherwise

            Then, I take these dummies as dependent, and run the following regressions:

            Code:
            xtlogit  hrszerotime age, fe
            xtreg    hrszerotime age, fe
            femlogit hrszerotime age
            
            xtlogit  hrsparttime age, fe
            xtreg    hrsparttime age, fe
            femlogit hrsparttime age
            
            xtlogit  hrsfulltime age, fe
            xtreg    hrsfulltime age, fe
            femlogit hrsfulltime age
            For hrsfulltime and hrsparttime, all three models (femlogit, xtlogit, xtreg) are estimated successfully. Only for hrszerotime does femlogit fail, while the other two FE models still run. If age were perfectly collinear, I would have expected the other two FE models to fail as well. So it is still not clear to me what the problem could be.

            Does anyone have further suggestions?



            • #7
              Lieke van Uden My suggestion to look at the gradient and the Hessian was doomed to fail, since the program does not provide this option. It is normally available in commands that estimate the model by maximum likelihood using ml, but my assumption that femlogit is one of them was wrong.

              Have you tried to construct an age variable as a categorical one? There are almost no observations above 90 in hrsstatus 1 and 2, except 1 for hrsstatus = 1 with a value close to 100. I would call this an outlier. It could be that xtlogit and xtreg are able to deal with this, but not femlogit.
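
              A rough sketch of how the sparse upper tail could be checked and a categorical age variable built. The cutpoints and the names agegrp and agegrp_* are purely illustrative assumptions, not values taken from this thread.
              Code:
              * How many observations are above 90, and in which outcomes?
              * (missing values count as larger than any number, so exclude them)
              count if age > 90 & !missing(age)
              tabulate hrsstatus if age > 90 & !missing(age)
              * Hypothetical age bands: 55-59, 60-64, 65-69, 70 and over
              egen agegrp = cut(age), at(55 60 65 70 120)
              * One dummy per band; agegrp_1 (55-59) serves as the reference
              tabulate agegrp, generate(agegrp_)
              femlogit hrsstatus agegrp_2 agegrp_3 agegrp_4, baseoutcome(3)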



              • #8
                Christophe Kolodziejczyk I have created a categorical variable for age. It works if I regress the dependent variable on the categorical age variable alone. However, if I run my full regression, I still get the same error. I then found that if I drop 3 variables from my regression, the categorical age variable works as a regressor (with age itself I still get the Hessian error). I think the problem with the 3 variables that have to be dropped to avoid the Hessian error could be caused by too many missing values.

                But I also found another solution for the "age problem". For my thesis I dropped observations with an age below 55. So, I created the following variable:

                Code:
                gen age_scale = age - 55
                Using this age_scale variable (which starts at 0 rather than at 55, as age does) in place of age, femlogit no longer gives the Hessian error. So for some reason it seems that the femlogit command cannot deal with the fact that age does not start at 0 or 1.
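
                For anyone who wants to apply the same shift without hard-coding 55, a minimal sketch follows. The names age_c and age_c2 are hypothetical. Subtracting a constant leaves the within-person variation in age unchanged, so any difference in behaviour is presumably numerical rather than substantive; building the squared term from the shifted variable also keeps its values much smaller than those of age squared.
                Code:
                * Shift age so that it starts at 0, using the sample minimum
                summarize age, meanonly
                generate double age_c  = age - r(min)
                * Build the squared term from the shifted variable
                generate double age_c2 = age_c^2
                femlogit hrsstatus age_c age_c2, baseoutcome(3)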

