
  • Logistic Panel Regression with Fixed Effects and Interaction Term does not reach Convergence

    Hello everyone,

    I'm using (unbalanced) Panel data containing 62,231 persons observed annually since 1984 (479,720 observations in total). See an example of the data below.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double(pid syear) float dison double lfst
     1601 1984 0 1
     1601 1986 0 1
     1601 1987 0 1
     1601 1989 0 1
     1601 1990 0 1
     1601 1991 0 1
     1601 1993 0 1
     1601 1994 0 1
     1601 1995 1 1
     1601 1996 2 1
     1601 1997 3 1
     1601 1998 4 1
     1601 1999 5 0
     1601 2000 5 0
     1601 2001 5 0
     1601 2002 5 0
     1601 2003 5 0
     1601 2004 5 0
     1601 2005 5 0
    57401 1984 0 1
    57401 1985 0 1
    57401 1986 0 1
    57401 1987 0 1
    57401 1988 1 1
    57401 1989 2 1
    57401 1990 3 0
    57401 1991 4 0
    57401 1992 5 0
    57401 1993 5 0
    57401 1994 5 0
    57401 1995 5 0
    57401 1996 5 0
    end
    label values dison dison
    label def dison 0 "Not (yet) disabled", modify
    label def dison 1 "1 year before onset", modify
    label def dison 2 "Year of onset", modify
    label def dison 3 "1 year after onset", modify
    label def dison 4 "2 years after onset", modify
    label def dison 5 "3 or more years after onset", modify
    label values lfst lfst
    label def lfst 0 "Non-working", modify
    label def lfst 1 "Working", modify
    I want to regress labour force status ("lfst": working vs. non-working) on disability-onset ("dison": categorical variable capturing consecutive years of being disabled).
    Additionally (and most importantly), I also want to include an interaction effect between disability-onset and education ("educ1": low vs. middle vs. high) to test whether the onset of disability has different effects across these educational groups.

    I'm using a fixed effects model to adjust for time-constant heterogeneity on the person-level.
    At first I run the following regression:
    Code:
    xtlogit lfst c.age##c.age ib0.dison##ib0.educ1, fe vce(oim)
    leading to the following output:
    Code:
    Conditional fixed-effects logistic regression   Number of obs     =    283,726
    Group variable: pid                             Number of groups  =     25,522
    
                                                    Obs per group:
                                                                  min =          2
                                                                  avg =       11.1
                                                                  max =         35
    
                                                    LR chi2(17)       =   27495.88
    Log likelihood  = -99813.358                    Prob > chi2       =     0.0000
    
    ----------------------------------------------------------------------------------------------------------------
                                              lfst |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -----------------------------------------------+----------------------------------------------------------------
                                               age |   .5299966   .0040154   131.99   0.000     .5221266    .5378666
                                                   |
                                       c.age#c.age |  -.0063246   .0000482  -131.16   0.000    -.0064191   -.0062301
                                                   |
                                             dison |
                              1 year before onset  |  -.5982718   .0643514    -9.30   0.000    -.7243983   -.4721454
                                    Year of onset  |  -1.339312   .0636188   -21.05   0.000    -1.464002   -1.214621
                               1 year after onset  |  -1.686735   .0781696   -21.58   0.000    -1.839944   -1.533525
                              2 years after onset  |  -1.893862   .0862354   -21.96   0.000     -2.06288   -1.724844
                      3 or more years after onset  |  -2.417689   .0640541   -37.74   0.000    -2.543232   -2.292145
                                                   |
                                             educ1 |
                                Middle (Casmin 2)  |          0  (omitted)
                                  High (Camsin 3)  |          0  (omitted)
                                                   |
                                       dison#educ1 |
            1 year before onset#Middle (Casmin 2)  |   .0879332   .1068131     0.82   0.410    -.1214166    .2972829
              1 year before onset#High (Camsin 3)  |   .3068141   .1475799     2.08   0.038     .0175628    .5960653
                  Year of onset#Middle (Casmin 2)  |   .1638639   .1024347     1.60   0.110    -.0369045    .3646322
                    Year of onset#High (Camsin 3)  |   .3921406   .1401097     2.80   0.005     .1175306    .6667506
             1 year after onset#Middle (Casmin 2)  |    .253682   .1260596     2.01   0.044     .0066097    .5007543
               1 year after onset#High (Camsin 3)  |   .3593408   .1649219     2.18   0.029     .0360999    .6825818
            2 years after onset#Middle (Casmin 2)  |   .2646515   .1368333     1.93   0.053    -.0035369    .5328398
              2 years after onset#High (Camsin 3)  |   .4035622   .1794544     2.25   0.025     .0518379    .7552864
    3 or more years after onset#Middle (Casmin 2)  |   .6180363   .0960988     6.43   0.000      .429686    .8063865
      3 or more years after onset#High (Camsin 3)  |   .7717205   .1371187     5.63   0.000     .5029728    1.040468
    ----------------------------------------------------------------------------------------------------------------
    Of course, educ1 is omitted since it is time-constant. That's why I wanted to run another regression with just dison and the interaction effect:
    Code:
    xtlogit lfst c.age##c.age ib0.dison ib0.dison#ib0.educ1, fe
    However, this model does not reach convergence (I stopped it at around 90 iterations), which leaves me clueless, since the model should be basically the same as the previous one
    (see https://www.statalist.org/forums/for...fe-and-margins for a discussion of this).

    Does anyone have an idea why this problem occurs and how it might be fixed?

    Thanks in advance and best regards
    Thorben

    PS: I'm using Stata 14

  • #2
    Hi Thorben,

    Did you ever manage to solve this issue? I'm running into a similar issue with my fixed-effects model, and any guidance would be massively appreciated!

    Thanks,
    Chloe


    • #3
      I'm surprised that nobody responded to #1, if only to advise that showing the results of the regression that did converge but not showing the output from the one that didn't converge doesn't provide any information about what is going on with the non-converging regression.

      My advice to #2 is this: non-convergence problems are hard to resolve, and they are impossible to solve without sufficient information. Look over the output you got from your non-converging model and find where the iterations failed to progress (i.e., the log likelihood stopped increasing and either stayed in place or started going around in circles). Note the number of the iteration where that failure first appears. Then re-run the same command, but add an -iterate(#)- option, replacing # with a number just slightly larger than the iteration at which convergence starts to fail. That will cause the model to run up to the #th iteration and then, without converging, print an output table. You cannot use that output table as results: the regression still did not converge and the numbers in the table are wrong. But they often provide valuable clues as to the cause of the non-convergence. Post that output table back here in the Forum, and provide a description of what your variables are (similar to what #1 did). It is possible that somebody will be able to identify the cause of non-convergence from inspecting all of that, or suggest other things you can try.
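
      As a minimal sketch of that step, using the command from #1: suppose the iteration log stalls at around iteration 15. That number is made up purely for illustration, so substitute whatever your own log shows. Something like the following would then stop shortly afterwards and display the unconverged table:

      ```stata
      * Assumes the data are already -xtset- (the output in #1 shows pid as the group variable).
      * Hypothetical: the log likelihood is assumed to stop improving near iteration 15.
      * iterate(20) halts the maximization after 20 iterations and prints the table.
      * The estimates shown are NOT valid results; they serve only as diagnostic clues.
      xtlogit lfst c.age##c.age ib0.dison ib0.dison#ib0.educ1, fe iterate(20)
      ```

      In the table this produces, typical things to look for are coefficients or standard errors that are implausibly large or missing, since those often point to the variables involved in the problem.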
