  • mixed not concave

    To let others replicate the "not concave" problem, I have uploaded the data: https://www.dropbox.com/s/k45ns53rte...ample.dta?dl=0
    Here is the code. How should I deal with this problem?


    Code:
    mixed y cl.x##c.w i.a b c d || id: cl.x, vce(robust) cov(exc)

  • #2
    I am not an expert in mixed models. With that said, I am concerned when I notice that your variables a, b, c, and d are constant within each id.
    Code:
    . tabstat a b c d, by(id) statistics(range)
    
    Summary statistics: range
      by categories of: id (group(name))
    
          id |         a         b         c         d
    ---------+----------------------------------------
           1 |         0         0         0         0
           2 |         0         0         0         0
           3 |         0         0         0         0
    ...
    
          74 |         0         0         0         0
          75 |         0         0         0         0
          76 |         0         0         0         0
    ---------+----------------------------------------
       Total |         8  5.992714  10.30899         1
    --------------------------------------------------
    That leads me to first try the model
    Code:
    mixed y cl.x##c.w || id:cl.x, vce(robust) cov(exc) difficult
    which converges, then four more versions, separately adding i.a, b, c, and d, all of which converge. But when I include i.a and either b or c, the model fails to converge.
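
    The one-at-a-time comparison described above could be scripted roughly as follows. This is only a sketch: the iterate(50) cap is an arbitrary assumption so that a non-concave fit stops instead of looping, and e(converged) is the convergence flag that mixed stores after estimation.

    ```stata
    * Sketch: refit the base model, adding one id-level covariate at a time.
    * iterate(50) is an assumed cap, not a value taken from the thread.
    foreach v in i.a b c d {
        capture noisily mixed y cl.x##c.w `v' || id: cl.x, vce(robust) cov(exc) iterate(50)
        display as text "added `v': converged = " e(converged)
    }
    ```

    A value of 0 flags a specification that hit the iteration cap without converging.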

    So a common diagnostic approach in these circumstances is to limit the number of iterations and inspect the results. For the model with i.a and b we have the following results, in which I have highlighted in red a result that perhaps another member with more experience in mixed models can explain the significance of.
    Code:
    . mixed y cl.x##c.w i.a b || id:cl.x, vce(robust) cov(exc) iterate(20)
    
    Performing EM optimization: 
    
    Performing gradient-based optimization: 
    
    Iteration 0:   log pseudolikelihood = -243.52449  
    Iteration 1:   log pseudolikelihood = -242.24485  
    Iteration 2:   log pseudolikelihood = -242.24323  
    Iteration 3:   log pseudolikelihood = -242.24323  (not concave)
    Iteration 4:   log pseudolikelihood = -242.24323  (not concave)
    ...
    Iteration 19:  log pseudolikelihood = -242.24323  (not concave)
    Iteration 20:  log pseudolikelihood = -242.24323  (not concave)
    convergence not achieved
    
    Computing standard errors:
    
    Mixed-effects regression                        Number of obs     =        474
    Group variable: id                              Number of groups  =         76
    
                                                    Obs per group:
                                                                  min =          3
                                                                  avg =        6.2
                                                                  max =          7
    
                                                    Wald chi2(11)     =      48.30
    Log pseudolikelihood = -242.24323               Prob > chi2       =     0.0000
    
                                        (Std. Err. adjusted for 76 clusters in id)
    ------------------------------------------------------------------------------
                 |               Robust
               y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               x |
             L1. |   .1016916   .2058044     0.49   0.621    -.3016775    .5050607
                 |
               w |   .1518304   .1564095     0.97   0.332    -.1547266    .4583873
                 |
        cL.x#c.w |   -.047828   .0558256    -0.86   0.392    -.1572443    .0615882
                 |
               a |
              4  |   .4307988   .1445108     2.98   0.003     .1475628    .7140347
              5  |   .6453963   .1293368     4.99   0.000     .3919009    .8988917
              6  |   .3326278   .1975582     1.68   0.092    -.0545791    .7198347
              7  |   .1581225   .1651106     0.96   0.338    -.1654883    .4817332
              8  |   .3042626   .0824021     3.69   0.000     .1427574    .4657678
              9  |   .1846456   .0963684     1.92   0.055    -.0042329    .3735242
             11  |    .256985   .1038483     2.47   0.013      .053446     .460524
                 |
               b |   .0582941   .0413075     1.41   0.158    -.0226671    .1392553
           _cons |   3.357153   .6035209     5.56   0.000     2.174274    4.540032
    ------------------------------------------------------------------------------
    
    ------------------------------------------------------------------------------
                                 |               Robust           
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    id: Exchangeable             |
                  var(L.x _cons) |   .0100354   .5708113      3.85e-51    2.61e+46
                  cov(L.x,_cons) |   .0100354   .5708113     -1.108734    1.128805
    -----------------------------+------------------------------------------------
                   var(Residual) |    .118391   .1891595      .0051679    2.712236
    ------------------------------------------------------------------------------
    
    Warning: convergence not achieved

    • #3
      Thanks, William. Could anyone explain more?
      @Clyde Schechter @Weiwen Ng @Joseph Coveney
      Last edited by Fred Lee; 01 Jun 2019, 21:35.

      • #4
        @Nick Cox could you please help me? Thanks a lot!

        • #5
          Fred: I notice your requests to particular Forum participants. You should perhaps read and digest advice given in the thread at https://www.statalist.org/forums/for...ivate-messages

          • #6
            Originally posted by Stephen Jenkins
            Fred: I notice your requests to particular Forum participants. You should perhaps read and digest advice given in the thread at https://www.statalist.org/forums/for...ivate-messages
            Thanks for the reminder.

            • #7
              Run the following experiment.
              Code:
              regress y b i.id
              regress y i.id b
              The order of the variables makes a difference when Stata chooses which variables to eliminate because of collinearity.

              The first regression tells us
              Code:
              . regress y b i.id
              note: 76.id omitted because of collinearity
              The second regression tells us
              Code:
              . regress y i.id b
              note: b omitted because of collinearity
              That is, the variable b is perfectly predicted by id. Not only is b constant within each id, as I noted in post #2, each id has a different value of b. The variable b adds no information to that which is already given by id.
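
              The same two facts can also be checked directly, without running a regression. The following is a minimal sketch; b_varies and first_in_id are hypothetical helper variables introduced only for this illustration.

              ```stata
              * Sketch: does b vary within id, and do ids share values of b?
              * b_varies and first_in_id are hypothetical helper variables.
              bysort id (b): generate byte b_varies = b[1] != b[_N]
              * All zeros here means b is constant within every id.
              tabulate b_varies
              egen byte first_in_id = tag(id)
              * Surplus copies reported here would mean some ids share a value of b.
              duplicates report b if first_in_id
              ```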

              The complicated formulation of the mixed model hid this fact from you, but the results I highlighted in post #2 are often indicative of problems due to collinearity.

              Your model is flawed by including both b and id. You must omit one or the other.

              If you omit b the model converges easily.
              Code:
              . mixed y cl.x##c.w i.a || id:cl.x, vce(robust) cov(exc)
              
              Performing EM optimization: 
              
              Performing gradient-based optimization: 
              
              Iteration 0:   log pseudolikelihood = -245.06941  
              Iteration 1:   log pseudolikelihood = -243.81957  
              Iteration 2:   log pseudolikelihood = -243.81643  
              Iteration 3:   log pseudolikelihood = -243.81643  
              
              Computing standard errors:
              
              Mixed-effects regression                        Number of obs     =        474
              Group variable: id                              Number of groups  =         76
              
                                                              Obs per group:
                                                                            min =          3
                                                                            avg =        6.2
                                                                            max =          7
              
                                                              Wald chi2(10)     =      35.21
              Log pseudolikelihood = -243.81643               Prob > chi2       =     0.0001
              
                                                  (Std. Err. adjusted for 76 clusters in id)
              ------------------------------------------------------------------------------
                           |               Robust
                         y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                         x |
                       L1. |   .0922254   .2007274     0.46   0.646     -.301193    .4856439
                           |
                         w |   .1581071   .1494684     1.06   0.290    -.1348456    .4510597
                           |
                  cL.x#c.w |  -.0464169   .0544704    -0.85   0.394     -.153177    .0603432
                           |
                         a |
                        4  |   .4534787   .1415018     3.20   0.001     .1761402    .7308171
                        5  |   .6441936   .1514849     4.25   0.000     .3472888    .9410985
                        6  |   .3644797   .2040079     1.79   0.074    -.0353684    .7643277
                        7  |   .1954667   .1864038     1.05   0.294    -.1698781    .5608115
                        8  |   .3388006   .1145218     2.96   0.003     .1143419    .5632592
                        9  |    .195889   .1068757     1.83   0.067    -.0135835    .4053615
                       11  |   .2596739    .101415     2.56   0.010     .0609041    .4584436
                           |
                     _cons |   3.486475   .5657096     6.16   0.000     2.377704    4.595245
              ------------------------------------------------------------------------------
              
              ------------------------------------------------------------------------------
                                           |               Robust           
                Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
              -----------------------------+------------------------------------------------
              id: Exchangeable             |
                            var(L.x _cons) |   .0105453   .0033627      .0056445    .0197011
                            cov(L.x,_cons) |   .0105453   .0033627      .0039545    .0171361
              -----------------------------+------------------------------------------------
                             var(Residual) |   .1183675   .0315033      .0702565    .1994244
              ------------------------------------------------------------------------------
              Last edited by William Lisowski; 02 Jun 2019, 09:29.

              • #8
                Originally posted by William Lisowski

                Your model is flawed by including both b and id. You must omit one or the other.

                If you omit b the model converges easily.
                Thanks, William! However, I find that the model converges if I drop any one of b, c, or d, so which should I drop?

                • #9
                  I would not include any variable which is perfectly predicted by the id variable. But then, as I wrote in post #2, I am not an expert in mixed models. Perhaps the methodology appropriately allows the user to make something out of nothing.

                  • #10
                    Originally posted by William Lisowski
                    I would not include any variable which is perfectly predicted by the id variable. But then, as I wrote in post #2, I am not an expert in mixed models. Perhaps the methodology appropriately allows the user to make something out of nothing.
                    Thanks, William. I use mixed to run a hierarchical linear model; the level-2 variables are all the same for each individual in level 1. For example, level 1 holds the characteristics of students, and level 2 holds the characteristics of the school.

                    • #11
                      I use mixed to run a hierarchical linear model; the level-2 variables are all the same for each individual in level 1
                      Reading in your data, we see that you have declared it as panel data, with id as your panel variable and serial as your time variable.
                      Code:
                      . xtset
                             panel variable:  id (strongly balanced)
                              time variable:  serial, 1 to 8
                                      delta:  1 unit
                      
                      . xtdescribe
                      
                            id:  1, 2, ..., 76                                     n =         76
                        serial:  1, 2, ..., 8                                      T =          8
                                 Delta(serial) = 1 unit
                                 Span(serial)  = 8 periods
                                 (id*serial uniquely identifies each observation)
                      ...
                      Here I display the first two panels of your data.
                      Code:
                      . list id serial y x w a b c d if id<=2, sepby(id) noobs
                      
                        +--------------------------------------------------------------------+
                        | id   serial           y     x   w    a           b           c   d |
                        |--------------------------------------------------------------------|
                        |  1        1   4.9444444   2.2   4   11   2.3978953   3.8918203   1 |
                        |  1        2   4.2222222     2   4   11   2.3978953   3.8918203   1 |
                        |  1        3   4.9444444     2   4   11   2.3978953   3.8918203   1 |
                        |  1        4   4.3333333     2   4   11   2.3978953   3.8918203   1 |
                        |  1        5   4.7222222     2   4   11   2.3978953   3.8918203   1 |
                        |  1        6           4     2   4   11   2.3978953   3.8918203   1 |
                        |  1        7           4   2.4   4   11   2.3978953   3.8918203   1 |
                        |  1        8           4     2   4   11   2.3978953   3.8918203   1 |
                        |--------------------------------------------------------------------|
                        |  2        1   4.1111111   3.8   3    6   3.0445224   6.8035053   1 |
                        |  2        2   4.1666667   2.8   3    6   3.0445224   6.8035053   1 |
                        |  2        3   4.5555556   1.4   3    6   3.0445224   6.8035053   1 |
                        |  2        4   4.7777778     2   3    6   3.0445224   6.8035053   1 |
                        |  2        5           4     1   3    6   3.0445224   6.8035053   1 |
                        |  2        6   4.7777778   2.4   3    6   3.0445224   6.8035053   1 |
                        |  2        7           5   1.8   3    6   3.0445224   6.8035053   1 |
                        |  2        8           5   1.4   3    6   3.0445224   6.8035053   1 |
                        +--------------------------------------------------------------------+
                      We see in these two panels what is the case for all 76: the only variables that change with time are y and x. All others are constant for the panel.
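
                      One way to confirm this for every panel, rather than reading the listing, is to decompose each variable into between-id and within-id variation. This is a minimal sketch that assumes the data are still xtset as shown above.

                      ```stata
                      * Sketch: split variation into between-id and within-id components.
                      * A within standard deviation of 0 marks a variable that never
                      * changes inside a panel.
                      xtsum y x w a b c d
                      ```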

                      In the context of your mixed model, level 2 is the id (panel) and level 1 is serial (time). It is true as you say that the variable you have at level 2 (id) is the same for each observation within level 1 (serial).

                      But there are also variables in your model - w a b c and d - that you included at level 1 but that are in fact the same for each observation within level 2. This is equivalent to including the school size as a characteristic of each of the students in that school when it is in fact a characteristic of the school. School size is a characteristic of the school (level 2), not the student (level 1). Your variables w a b c and d are characteristics of id (level 2), not of serial (level 1).

                      That is not my understanding of how hierarchical models work.
