  • fmm package does not converge with survival model

    Hello,
    I am doing a parametric survival analysis with a loglogistic baseline hazard function. I would like to control for unobserved heterogeneity through a finite mixture model.

    My problem is that the estimation does not converge for models with 3 or more groups.

    I have tried taking out variables with little variation, changing the maximization technique, and starting with different values and iteration limits. Nothing seems to work.

    I would therefore like to hear if anyone can explain why this is not possible or what I am doing wrong.

    The data you see below were generated with the sample command (a 10% subsample) because of confidentiality.

    If necessary, I can also upload the entire sample data file (I just do not know whether that is allowed here). It exhibits the same convergence problems as the full dataset.

    Here is my code:
    Code:
    clear all
    cls
    
    use "..."
    
    stset TIME, failure(EVENT)
    
    global xlist VAR_1 VAR_2 VAR_3 VAR_4 VAR_5 VAR_6 VAR_7 VAR_8 VAR_9 VAR_10 VAR_11 VAR_12 VAR_13 VAR_14 VAR_15 VAR_16 VAR_17 VAR_18 VAR_19
    
    
    
    global dist logl
    
    //Base:
    fmm 1, emopts(iterate(1000)) iterate(200) technique(bhhh 30 nr 30) difficult: streg, distribution($dist)
    estimates store logl_base
    estat ic
    eststo logl_base
    estat summarize
    
    //1 group:
    fmm 1, emopts(iterate(1000)) iterate(1000) technique(bhhh 30 nr 30) difficult: streg $xlist , distribution($dist)
    estimates store en_old
    estat ic
    eststo logl_en
    estat summarize
    
    
    //2 groups:
    fmm 2, emopts(iterate(1000)) iterate(1000) technique(bhhh 30 nr 30) difficult: streg $xlist , distribution($dist)
    estimates store to_old
    estat ic
    eststo logl_to
    estat summarize
    
    
    //3 groups:
    fmm 3, emopts(iterate(1000)) iterate(1000) technique(bhhh 30 nr 30) difficult: streg $xlist , distribution($dist)
    estimates store tre_old
    estat ic
    eststo logl_tre
    estat summarize
    
    
    //4 groups:
    fmm 4, emopts(iterate(1000)) iterate(1000) technique(bhhh 30 nr 30) difficult: streg $xlist , distribution($dist)
    estimates store fire_old
    estat ic
    eststo logl_fire
    estat summarize

    Data set:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double(VAR_1 VAR_2 VAR_3 VAR_4 VAR_5 VAR_6 VAR_7 VAR_8 VAR_9 VAR_10 VAR_11 VAR_12 VAR_13 VAR_14 VAR_15 VAR_16 VAR_17 VAR_18 VAR_19 TIME EVENT)
                   85 1 0 0 0 0 0 0 0 0 0 0 0  3.891891891891892                   0 .02702702702702703 0 1 0  37 0
                   99 1 0 0 0 0 0 0 0 0 0 0 0 3.0416666666666665                   0 3.2916666666666665 1 1 1  11 1
                  100 1 0 1 0 0 0 0 0 0 0 0 0  6.857142857142857  .14285714285714285  .5714285714285714 1 1 0   7 1
                   83 1 0 0 0 0 0 1 0 0 0 0 0                3.5                 3.5               .125 1 1 0   4 1
                   45 1 0 0 0 0 1 0 0 0 0 0 0 6.0588235294117645   .8235294117647058 1.2352941176470589 1 1 1  17 1
                   76 0 0 0 0 0 0 0 0 0 0 0 1               4.75                   0                  0 1 1 1   4 0
    77.87949381650849 1 0 0 0 0 0 0 0 0 0 0 0 3.6122448979591835   .3877551020408163 .12244897959183673 0 1 0  50 1
                   89 0 0 0 0 0 0 0 0 0 0 0 0                4.4                 .75                .15 1 1 1  20 1
                   66 1 0 1 0 0 0 0 0 0 0 0 0 2.6666666666666665                 1.2 .06666666666666667 1 1 0   1 1
                   90 1 0 1 0 0 0 0 0 0 0 0 1  5.424242424242424 .030303030303030304  .3939393939393939 1 1 0  33 0
                   55 1 0 1 0 0 0 0 0 0 0 0 1                6.6                   0                 .6 0 1 1   5 1
                   89 0 0 0 0 0 0 0 0 0 0 0 1 3.7142857142857144                   0 .14285714285714285 1 1 0  14 0
                   76 1 1 0 0 0 0 0 1 0 0 0 0 3.2903225806451615   .4032258064516129 .22580645161290322 1 1 0  64 0
                   79 1 0 0 0 0 0 0 0 0 0 0 0 2.6666666666666665                   0  .2222222222222222 1 1 0   9 1
                   78 0 0 0 0 0 0 0 0 0 0 0 1  6.490196078431373                   0 .37254901960784315 0 1 0  51 0
    77.87949381650849 1 0 0 0 0 0 0 0 0 0 0 0   6.39622641509434                   0 .24528301886792453 0 1 0  53 0
                   86 1 0 0 1 0 0 0 0 1 0 0 0  6.555555555555555                   0                 .5 1 1 0  18 1
                   78 1 0 0 0 0 0 0 0 0 0 0 0                  2                 1.6                 .4 1 1 0   5 0
                   71 1 0 1 0 0 0 0 0 0 0 0 1 4.3428571428571425                   0 .17142857142857143 1 1 0   8 1
                   73 1 0 0 0 0 0 0 0 0 0 0 0               5.25                5.25                 .5 1 1 0   4 1
                   83 0 0 1 0 0 0 0 0 0 0 0 0                  7                   0                  0 1 1 1   3 1
                   34 0 0 0 0 0 0 0 0 0 0 0 0  3.608108108108108                   0 .14864864864864866 0 1 1  74 0
                   95 1 0 0 0 1 0 0 1 0 0 0 0  5.571428571428571                   0  .7142857142857143 1 1 1   7 1
                   92 1 0 0 0 1 0 1 0 0 1 0 0  6.195652173913044  .06521739130434782 .10869565217391304 1 1 1  47 0
                   87 1 0 1 0 0 0 0 0 0 0 0 0  3.857142857142857                   0 4.4523809523809526 1 1 1   2 1
                   66 1 1 0 0 0 0 0 1 0 0 0 0 3.5714285714285716   .2857142857142857 14.714285714285714 1 1 1   7 1
                   84 1 0 1 0 0 0 0 0 0 0 0 0                  7                   0                .75 1 1 1   7 1
                   74 1 0 0 0 0 1 0 0 0 0 0 0  3.727272727272727                   0 .06060606060606061 1 1 0  38 0
    77.87949381650849 1 0 1 0 0 0 0 0 0 0 0 0                  4                   0                  0 0 1 1   1 1
                   84 1 1 0 0 0 0 0 0 0 0 0 0                3.6                   0 .06666666666666667 1 1 0  15 0
                   78 1 0 0 0 0 0 1 0 0 0 0 1                  2                   0                  0 1 1 1   2 1
                   91 1 0 0 0 0 0 0 0 0 0 0 0                  4                   0                  1 1 1 0   1 1
                   84 1 1 0 0 0 0 0 1 0 0 0 0  5.857142857142857   2.761904761904762  .5714285714285714 1 1 1  21 0
                   85 1 0 1 0 0 0 0 0 0 0 0 0  2.903225806451613                   0  .0967741935483871 1 1 0  16 1
                   83 1 0 0 0 0 0 0 0 0 0 0 0               1.75                 1.5                .25 1 1 0   5 1
                   85 1 0 0 0 0 0 0 0 0 0 0 0  4.048387096774194                   0 .27419354838709675 1 1 1  62 0
                   84 0 0 0 0 0 0 0 0 0 1 0 0  6.638888888888889   6.722222222222222  .5833333333333334 1 1 1  36 1
                   84 1 0 1 0 0 0 0 0 0 0 0 0 6.6521739130434785                   0 .08695652173913043 1 1 1  23 0
                   45 0 0 0 0 0 0 1 0 0 0 0 0  4.586206896551724  .22413793103448276  .1896551724137931 1 1 1  66 1
                   93 1 0 0 0 0 0 0 0 0 0 0 0 3.7777777777777777                   0 .14814814814814814 1 1 1  27 0
                   93 0 0 0 0 0 0 0 0 0 0 0 0                5.5                   0                 .5 1 1 0   2 1
                   44 1 0 1 0 0 0 0 0 0 0 0 0                6.1                  .3                .75 1 1 0  20 1
                   67 1 0 1 0 0 0 0 0 0 0 0 0               5.25                   0                .25 0 1 0   4 0
                   88 0 0 1 0 0 0 0 0 0 0 0 1  3.673469387755102                   0  .4387755102040816 1 1 0 100 0
                   86 1 0 0 0 0 0 0 0 0 0 0 1                  2                   0                  7 1 1 0   3 1
                   86 0 0 0 0 0 0 0 0 0 0 0 0  4.150943396226415                   0  .2830188679245283 1 1 0  32 1
                   85 0 0 0 0 0 0 0 0 0 0 0 0                6.5                   0                6.5 1 1 0   2 1
                   94 0 0 0 0 0 0 0 0 0 0 0 1  5.912087912087912                   0 .21978021978021978 1 1 0   6 1
    77.87949381650849 1 0 0 0 0 0 0 0 0 0 0 0                  4                   1                  0 0 1 1   1 1
                   77 0 0 1 0 0 0 0 0 0 0 0 0                  7                   0                 .5 1 1 1   2 1
                   62 0 1 0 0 0 0 0 0 0 0 0 1  10.11111111111111  10.444444444444445  18.88888888888889 0 1 1  10 0
                   67 0 0 0 0 0 0 0 0 0 0 0 0                6.6                   0                 .7 1 1 0  40 1
                   93 1 1 0 0 0 0 0 1 0 0 0 0  4.033333333333333                 .05 .06666666666666667 1 1 0  62 0
                   39 1 0 1 0 0 0 0 0 0 0 0 0                  6                   4  .3333333333333333 1 1 1   3 1
                   89 1 0 1 0 0 0 0 0 0 0 0 0  4.085714285714285                  .2 .24285714285714285 1 1 1  70 0
                   87 0 0 0 0 0 0 0 0 0 0 0 0 3.8285714285714287  .02857142857142857 .11428571428571428 1 1 0  70 0
                   82 1 0 0 0 1 0 0 0 0 0 0 0 3.6666666666666665                   0  .0784313725490196 1 1 0  52 0
                   73 1 0 0 0 0 0 0 1 0 0 0 0              6.625                4.25             7.1875 1 1 0  16 1
                   81 0 0 0 0 0 0 0 0 0 0 0 0 3.0638297872340425   3.106382978723404 2.0638297872340425 1 1 0  40 1
                   83 1 0 1 0 0 0 0 0 0 0 0 0   6.47457627118644  .03389830508474576  .5423728813559322 1 1 1  59 0
                   83 1 0 1 0 0 0 0 0 0 1 0 0                  4                   0                 .4 1 1 1   5 0
                   90 1 0 0 0 0 0 0 0 0 1 0 0 3.8536585365853657                   0 .04878048780487805 1 0 1  41 0
                   84 1 0 0 0 0 0 0 0 1 0 0 1  3.923076923076923                   0 .07692307692307693 1 1 0  27 1
                   54 1 0 1 0 0 0 0 0 0 0 0 0               3.52                 .12                .06 1 1 0  50 0
    77.87949381650849 0 0 0 0 0 0 0 0 0 0 0 0  2.090909090909091                   4                  0 0 1 0  14 1
                   79 0 0 0 0 0 0 0 0 0 0 0 1 3.8947368421052633  1.7894736842105263  .3684210526315789 1 1 0  12 1
                   77 0 0 0 0 0 0 0 0 0 0 0 0  3.789473684210526                   0 .10526315789473684 1 1 1  13 1
                   76 0 0 1 0 0 0 0 1 0 1 0 0  4.666666666666667                   0  .6666666666666666 1 1 1   3 0
                   85 0 0 0 0 0 0 0 0 0 0 0 1  6.527472527472527                   0 .34065934065934067 0 1 0  91 0
                   94 0 0 0 0 0 0 0 0 0 0 0 1                  4  1.6666666666666667 1.3333333333333333 0 1 1   3 1
                   78 1 0 0 0 0 0 0 0 0 0 0 0 6.4411764705882355                   0  .4117647058823529 1 1 0  69 0
                   89 0 0 0 0 0 0 0 0 0 0 0 1                  2                   0                  1 1 1 0   1 1
                   79 1 0 0 0 0 0 1 0 0 0 0 0                  3                   0                  1 1 1 1   1 1
                   73 0 1 0 0 0 0 0 0 0 0 0 0 3.8666666666666667                   0 .03333333333333333 1 1 1  32 0
                   48 0 0 0 0 0 0 0 0 0 0 0 1  4.615384615384615   6.107692307692307  7.523076923076923 1 1 1  65 1
                   84 1 0 1 0 0 0 0 0 0 0 0 1  3.761904761904762                   0 .14285714285714285 1 1 0  21 0
                   78 1 0 1 0 0 0 0 1 0 0 0 0 3.8048780487804876  .12195121951219512  .4146341463414634 1 1 0   1 1
                   87 1 0 0 0 0 0 0 1 0 0 0 0  2.933333333333333                   0                 .2 1 1 0  19 1
                   86 0 0 1 0 0 0 0 0 1 0 1 0 3.9298245614035086                   0  .3333333333333333 1 1 0  57 1
                   64 0 0 0 0 0 0 0 0 0 0 0 0            6.65625               3.375  .3958333333333333 1 1 0  96 0
                   85 1 1 0 0 0 0 0 0 0 0 0 0  3.814814814814815                   0  .4074074074074074 1 1 1  27 0
                   73 1 1 0 0 0 0 0 0 0 0 0 0               3.87                   0                 .1 1 1 0 100 0
                   81 1 1 0 0 0 0 0 1 0 0 0 0  5.666666666666667                   0                  1 1 1 0   6 0
                   77 1 0 0 0 0 0 0 0 0 0 0 0                  4                   0                  1 1 1 0   1 1
                   75 1 0 0 0 0 0 0 0 0 0 0 0                  5                   0  .3333333333333333 1 1 0   3 1
    77.87949381650849 0 0 0 0 0 0 0 0 0 0 0 0                  4                   0                  0 0 1 0   1 1
                   76 0 0 0 0 0 0 0 0 0 0 0 0                 .5                  .5                 .5 1 1 0   2 0
                   50 0 0 0 0 0 0 0 0 0 0 0 0 3.8135593220338984   .6440677966101694  .0847457627118644 0 1 0  73 0
                   94 1 0 0 0 0 0 0 1 0 0 0 0               6.12                   0                .12 1 1 0  26 0
                   76 1 1 0 0 0 0 0 0 0 0 0 1 5.2727272727272725                   0 .23636363636363636 1 1 1  13 1
                   92 1 0 1 0 0 0 0 0 0 0 0 0 3.0714285714285716  .29591836734693877 .20408163265306123 1 1 0  98 0
                   45 1 0 0 0 0 0 0 0 0 0 0 0  7.733333333333333   .8666666666666667  2.466666666666667 1 1 0  15 1
                   83 1 0 1 0 0 1 0 0 0 0 0 0  5.666666666666667                   0  .1111111111111111 1 1 1  12 1
                   86 1 1 0 0 0 0 0 1 0 0 0 1                  7                   0                  4 1 1 1   2 1
                   94 1 0 1 0 0 0 0 0 0 0 0 0                  5                   0                 .5 1 1 0   2 1
                   75 1 0 0 0 0 0 0 0 0 0 0 0               5.75                   0                  0 1 1 1   4 1
                   75 1 0 1 0 0 0 0 0 0 0 0 0  5.305555555555555                   0  .1388888888888889 1 1 0  25 1
                   88 1 0 1 0 0 0 0 0 0 0 0 0                  7                   7                 .5 1 1 0   3 1
                   89 1 0 0 0 0 0 0 0 0 0 0 1  5.819672131147541  .26229508196721313 .29508196721311475 1 1 1  61 0
                   85 1 0 0 0 0 0 0 0 0 0 0 0 5.6923076923076925                   0 .15384615384615385 1 1 0  27 0
    end

  • #2
    In latent class analysis, the standard practice is to keep adding latent classes until the BIC stops decreasing or the model starts failing to converge.
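
    For example, once all the models in #1 have run, you can compare their information criteria in a single table (a sketch using the names given in the estimates store commands in #1):
    Code:
    estimates stats logl_base en_old to_old tre_old fire_old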

    FMM has a lot of similarities with latent class analysis. Therefore, I believe you would be justified in reporting exactly that if the models really fail to converge. You already seem to be fitting a survival model with a lot of covariates. There just might not be that much heterogeneity in the sample. Also, model identification might be a challenge: if you have 10 covariates, then every additional latent class you add requires an additional 10 parameters to estimate, plus one more intercept in the multinomial logit model for the latent class.
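    To make that concrete for your specification: with the 19 covariates in $xlist, each additional class adds roughly 19 slope coefficients plus the loglogistic intercept and ancillary parameter, plus one more multinomial-logit intercept, i.e. on the order of 22 extra free parameters per class.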

    However, take a few minutes to read through FMM example 2. Because the likelihood function for FMMs (and LCA) is multimodal, taking multiple random draws of the starting parameters can help ensure that the model converges to a global maximum rather than a local one. I'd recommend adding this option:

    Code:
    fmm 3, startvalues(randomid, draws(30)) iterate(1000): streg $xlist, distribution($dist)
    If that doesn't converge, increase draws() up to as many as 100.

    I am not 100% sure about this, but I think the technique() and difficult options may not affect estimation with categorical latent variables (most latent variables are continuous, e.g. in random-effects and IRT models, and changing the estimation algorithm or using difficult should work there), so don't bother with them. I also don't think it's necessary to specify 1,000 EM iterations; the EM method is slower than the default optimizer, so you might as well leave it at its default. Limiting the regular maximizer to 1,000 iterations is smart: otherwise, if the model isn't converging, it will keep going for the default maximum of 16,000 iterations, and that will take a long time.


    • #3
      Hello Weiwen,

      Thank you very much for the answer.

      I have tried your suggestion and used up to 100 draws for the starting parameters. Unfortunately, models with 3 or more groups are still not converging.
      I am starting to believe that this is happening because the sample is very homogeneous.

      As you wrote, is it okay then to report that we are not considering models with 3 or more groups because they do not converge, probably since the sample is not heterogeneous?

      Could there be any other data- or model-related reason for the lack of convergence?

      I have read on other forums that a poorly chosen model or a small sample size could be the cause. Neither should apply here, since only variables that are univariately significant and economically intuitive were chosen, and the original sample size is 15,755.

      Another reason for nonconvergence, I believe, could be that some of the dummy variables have very small groups. Two of them have a zero group containing less than 5% of the sample. After removing these, the models still do not converge, but there are four more dummy variables with less than 15% of the sample in one of the groups.
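
      To check those prevalences quickly (a sketch; it assumes the dummies are VAR_2-VAR_13 and VAR_17-VAR_19, which appear binary in the posted extract):
      Code:
      foreach v of varlist VAR_2-VAR_13 VAR_17-VAR_19 {
          quietly summarize `v'
          display "`v': share of 1s = " %5.3f r(mean)
      }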

      What do you think?

      Either way, if you know of any literature on the matter, especially something to cite for nonconvergence being a valid criterion in model selection, please let me know.


      • #4
        Kathryn Masyn wrote a chapter on latent class and profile models that is cited in the Stata SEM example on latent class models. On page 571, she says to keep increasing the number of latent classes until the model is no longer well-identified.

        Here, you should note one point of divergence between Stata and some other software packages. Some packages I'm aware of (Mplus, the poLCA package in R, and the Penn State LCA plugin for Stata) save the final log likelihoods from all the sets of randomly varied starting parameters. Stata doesn't do this (you can write a program to force it to, but it is tedious; see the sketch below). What Stata does instead is draw a set of starting parameters at random and let the expectation maximization (EM) algorithm run, up to 20 iterations unless you specify otherwise. Once the EM algorithm hits its maximum iterations, Stata draws another set of parameters and repeats. After Stata has taken 100 (or however many you specified) random draws, it returns to the one with the highest log likelihood and finishes the maximization with its usual algorithm.
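
        A rough sketch of that manual approach (the output file name and seed values are arbitrary; it assumes $xlist and $dist from #1 are defined, and runs that error out are simply skipped):
        Code:
        tempname results
        postfile `results' seed ll using fmm_starts, replace
        forvalues s = 1/10 {
            set seed `s'
            capture fmm 3, startvalues(randomid) iterate(1000): streg $xlist, distribution($dist)
            if _rc == 0 post `results' (`s') (e(ll))   // save this run's final log likelihood
        }
        postclose `results'
        use fmm_starts, clear   // note: replaces the data in memory
        gsort -ll               // highest log likelihood first
        list, clean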

        In LCA, we are fitting a model for just the mean of each indicator variable in each latent class; that is, we are estimating E(VAR_1), E(VAR_2), etc. With binary indicators, if one latent class has a mean near 0 or 1 on an indicator, that can often prevent convergence under Stata's usual convergence criteria, because the logit intercept goes to plus or minus infinity. In a linear FMM, you're fitting a model Y = XB + e in each latent class (or whatever the equivalent expression is in survival analysis). It could be that one parameter in one class is causing trouble in this fashion, since you noted that some of your dummy variables have relatively low prevalence. I'm not sure what a principled solution is, however. I would probably just say that the models with 3 or more latent classes failed to converge. Maybe you could examine the 3-class model after 1,000 iterations and report the parameter estimates, as in the sketch below? I'm not sure I'd have anything meaningful to offer, but it might be worth a look.
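
        A sketch of that inspection (Stata will warn that convergence was not achieved, but the last iteration's estimates remain available in e(); assumes $xlist and $dist from #1 are defined):
        Code:
        capture noisily fmm 3, startvalues(randomid, draws(100)) iterate(1000): ///
            streg $xlist, distribution($dist)
        matrix list e(b)   // coefficients from the last iteration
        estat lcprob       // marginal latent class probabilities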


        • #5
          I think I will end up reporting the nonconvergence. Your explanation of the FMM seems consistent with the nature of the dataset.

          Thank you once again. It has been a huge help, and the reference is really useful.
