  • Error from Stata while doing lrtest

    Hi Everyone,

    I ran into an error saying "df(unrestricted) = df(restricted) = 16" while trying to compare a full and reduced model (in order to look for confounding). I am running a melogit model and the variable I am trying to remove has no missing data.

    Anyone else seen this message before and know how to deal with it?

    Thanks

  • #2
    The message means exactly what it says: the two models you applied -lrtest- to have the same degrees of freedom, and therefore are not nested. Either you failed to remove the variable of interest when you ran the restricted model, or you inadvertently added a new variable in its place, or you changed something else in the model that added a degree of freedom. It is pointless to try to speculate. If you show the exact commands you ran and the exact output from Stata (by copy/pasting from the Results window or your log file into the Forum editor between code delimiters), somebody will probably be able to spot the source of the problem. (If you are not familiar with code delimiters, see FAQ #12 for details on how to use them.)
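
    A quick way to see whether two stored models really differ in their parameter count is sketched below. The names full and reduced are hypothetical labels for the two stored estimates, and the sketch assumes that -lrtest- takes its degrees of freedom from e(rank), which is worth confirming in the manual for your Stata version.

    ```stata
    * "full" and "reduced" are hypothetical names for the two stored estimates
    estimates stats full reduced      // the df column should differ between the models

    * Inspect one model's parameter count directly
    estimates restore reduced
    display "rank of e(V) = " e(rank)
    matrix list e(b)                  // confirm which coefficients are actually in the model
    ```

    If the df column shows the same value for both models, comparing the column names of e(b) between the two usually reveals what was added, dropped, or omitted.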

    • #3
      Q. Could the error described in #1 be due to the variance of the random slopes being 0 in the second model?

      I ask, because I just got the same error message, and the reason appears to be that the variance of the random slopes = 0 in the second model. Here is my code:

      Code:
      * M1: Random intercept model
      melogit q21yes time if esample, eform || pt_id:, binomial(q21count)
      estimates store m1
      * M2: Add random slope for Time
      set iterlog off // Without this, 75 iterations are listed
      melogit q21yes time if esample, eform || pt_id: time, binomial(q21count)
      estimates store m2
      lrtest m1 m2
      And here is the output:

      Code:
      Mixed-effects logistic regression               Number of obs     =      3,418
      Binomial variable:     q21count
      Group variable:           pt_id                 Number of groups  =        583
      
                                                      Obs per group:
                                                                    min =          3
                                                                    avg =        5.9
                                                                    max =          7
      
      Integration method: mvaghermite                 Integration pts.  =          7
      
                                                      Wald chi2(1)      =      61.25
      Log likelihood = -5980.7516                     Prob > chi2       =     0.0000
      ------------------------------------------------------------------------------
            q21yes |     exp(b)   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
              time |      0.953      0.006   -7.826   0.0000        0.942       0.965
             _cons |      0.573      0.017  -19.181   0.0000        0.541       0.606
      -------------+----------------------------------------------------------------
      pt_id        |
         var(_cons)|      0.292      0.022                         0.252       0.338
      ------------------------------------------------------------------------------
      Note: Estimates are transformed only in the first equation.
      LR test vs. logistic model: chibar2(01) = 1466.37     Prob >= chibar2 = 0.0000
      
      . estimates store m1
      
      . * M2: Add random slope for Time
      . set iterlog off // Without this, 75 iterations are listed
      
      . melogit q21yes time if esample, eform || pt_id: time, binomial(q21count)
      
      Mixed-effects logistic regression               Number of obs     =      3,418
      Binomial variable:     q21count
      Group variable:           pt_id                 Number of groups  =        583
      
                                                      Obs per group:
                                                                    min =          3
                                                                    avg =        5.9
                                                                    max =          7
      
      Integration method: mvaghermite                 Integration pts.  =          7
      
                                                      Wald chi2(1)      =      61.25
      Log likelihood = -5980.7516                     Prob > chi2       =     0.0000
      ------------------------------------------------------------------------------
            q21yes |     exp(b)   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
              time |      0.953      0.006   -7.826   0.0000        0.942       0.965
             _cons |      0.573      0.017  -19.181   0.0000        0.541       0.606
      -------------+----------------------------------------------------------------
      pt_id        |
          var(time)|      0.000      0.000                             .           .
         var(_cons)|      0.292      0.022                         0.252       0.338
      ------------------------------------------------------------------------------
      Note: Estimates are transformed only in the first equation.
      LR test vs. logistic model: chibar2(01) = 1466.37     Prob >= chibar2 = 0.0000
      
      . estimates store m2
      
      . lrtest m1 m2
      df(unrestricted) = df(restricted) = 3
      r(498);
      
      end of do-file
      
      r(498);
      Notice that the estimates from the 2nd model duplicate those from the 1st model--i.e., the same coefficients, the same variance of the random intercepts, etc.

      If I use -mixed- in place of -melogit-, on the other hand, the variance of the random slopes > 0 and -lrtest- works, ruling out the accidental inclusion or exclusion of a variable that Clyde speculated about in #2.

      Code:
      * Q. What if I estimated both models via -mixed- rather than -melogit-?
      mixed q21yes time if esample || pt_id:
      estimates store mixed1
      * M2: Add random slope for Time
      mixed q21yes time if esample || pt_id: time, cov(un)
      estimates store mixed2
      lrtest mixed1 mixed2
      Code:
      . mixed q21yes time if esample || pt_id:
      
      Performing EM optimization:
      
      Performing gradient-based optimization:
      
      Iteration 0:   log likelihood = -5339.8696  
      Iteration 1:   log likelihood = -5339.8696  
      
      Computing standard errors:
      
      Mixed-effects ML regression                     Number of obs     =      3,418
      Group variable: pt_id                           Number of groups  =        583
      
                                                      Obs per group:
                                                                    min =          3
                                                                    avg =        5.9
                                                                    max =          7
      
                                                      Wald chi2(1)      =     192.75
      Log likelihood = -5339.8696                     Prob > chi2       =     0.0000
      
      ------------------------------------------------------------------------------
            q21yes |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
              time |     -0.125      0.009  -13.883   0.0000       -0.142      -0.107
             _cons |      4.470      0.067   66.527   0.0000        4.339       4.602
      ------------------------------------------------------------------------------
      
      ------------------------------------------------------------------------------
        Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
      -----------------------------+------------------------------------------------
      pt_id: Identity              |
                        var(_cons) |      2.206      0.138         1.952       2.493
      -----------------------------+------------------------------------------------
                     var(Residual) |      0.826      0.022         0.784       0.870
      ------------------------------------------------------------------------------
      LR test vs. linear model: chibar2(01) = 2786.23       Prob >= chibar2 = 0.0000
      
      . estimates store mixed1
      
      . * M2: Add random slope for Time
      . mixed q21yes time if esample || pt_id: time, cov(un)
      
      Performing EM optimization:
      
      Performing gradient-based optimization:
      
      Iteration 0:   log likelihood = -5277.9013  
      Iteration 1:   log likelihood = -5277.9009  
      
      Computing standard errors:
      
      Mixed-effects ML regression                     Number of obs     =      3,418
      Group variable: pt_id                           Number of groups  =        583
      
                                                      Obs per group:
                                                                    min =          3
                                                                    avg =        5.9
                                                                    max =          7
      
                                                      Wald chi2(1)      =     106.52
      Log likelihood = -5277.9009                     Prob > chi2       =     0.0000
      
      ------------------------------------------------------------------------------
            q21yes |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
              time |     -0.120      0.012  -10.321   0.0000       -0.143      -0.097
             _cons |      4.464      0.068   65.267   0.0000        4.330       4.598
      ------------------------------------------------------------------------------
      
      ------------------------------------------------------------------------------
        Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
      -----------------------------+------------------------------------------------
      pt_id: Unstructured          |
                         var(time) |      0.037      0.005         0.028       0.047
                        var(_cons) |      2.362      0.160         2.068       2.698
                   cov(time,_cons) |     -0.072      0.021        -0.113      -0.032
      -----------------------------+------------------------------------------------
                     var(Residual) |      0.693      0.021         0.654       0.735
      ------------------------------------------------------------------------------
      LR test vs. linear model: chi2(3) = 2910.17               Prob > chi2 = 0.0000
      
      Note: LR test is conservative and provided only for reference.
      
      . estimates store mixed2
      
      . lrtest mixed1 mixed2
      
      Likelihood-ratio test                                 LR chi2(2)  =    123.94
      (Assumption: mixed1 nested in mixed2)                 Prob > chi2 =    0.0000
      
      Note: The reported degrees of freedom assumes the null hypothesis is not on
            the boundary of the parameter space.  If this is not true, then the
            reported test is conservative.

      --
      Bruce Weaver
      Email: [email protected]
      Version: Stata/MP 18.5 (Windows)

      • #4
        Yes, if the added variable (in this case the random slope variance component) turns out to have zero effect in the model, then it does not add a df and you can get the same error message from -lrtest-. In fact, as you noticed, the results of the two logit-based models are entirely identical. It's a somewhat exotic situation, but it does happen.
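
        If a test statistic is still wanted in that situation, it can be computed by hand from the stored log likelihoods. The following is a minimal sketch, assuming the random slope adds exactly one parameter (melogit's default independent covariance structure, as in #3), so that the boundary-adjusted reference distribution is a 50:50 mixture of chi2(0) and chi2(1). The scalar names are hypothetical.

        ```stata
        * m1 and m2 are the stored -melogit- results from #3
        estimates restore m1
        scalar ll_m1 = e(ll)
        estimates restore m2
        scalar ll_m2 = e(ll)

        scalar lr = max(0, 2*(ll_m2 - ll_m1))   // 0 when the two fits coincide
        * 50:50 mixture of chi2(0) and chi2(1) for one variance on the boundary
        scalar p = cond(lr > 0, 0.5*chi2tail(1, lr), 1)
        display "LR = " lr "   p (chibar2(01)) = " p

        * Alternative: make -lrtest- run despite equal df; df(1) is an assumption
        lrtest m1 m2, force df(1)
        ```

        With identical log likelihoods the statistic is 0, so either route simply confirms that the random slope adds nothing to the fit.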

        • #5
          Thanks for confirming my suspicion, Clyde.
          --
          Bruce Weaver
          Email: [email protected]
          Version: Stata/MP 18.5 (Windows)

          • #6
            Hello,

            I have a problem like Chelsea Course's. I tried to apply the explanation given by Clyde Schechter, but without success. Showing the full output is difficult on my end.


            Please see below:


            stepwise, pe(0.2) lockterm1:ologit Lenth_of_service (i.sex i.cddAge i.CommunityRelationshipCATALL resilience Personality_Trait) i.EducationCATALL i.OccupationCATALL i.Volunteer_Selection2 i.Gift_to_CDD ib3.Common_ProblemsCAT3 ib2.Performance2 ib3.Supervision2 i.Cost_to_CDD if cdd_country==1,or

            est store A1

            ***Now I drop i.EducationCATALL

            stepwise, pe(0.2) lockterm1:ologit Lenth_of_service (i.sex i.cddAge i.CommunityRelationshipCATALL resilience Personality_Trait) i.OccupationCATALL i.Volunteer_Selection2 i.Gift_to_CDD ib3.Common_ProblemsCAT3 ib2.Performance2 ib3.Supervision2 i.Cost_to_CDD if cdd_country==1,or

            est store B1

            lrtest B1 A1

            ***** This is the error I get

            df(unrestricted) = df(restricted) = 11

            Strangely, this same approach works when some other variables are dropped, but not for education.

            Please help.





            • #7
              Since you are using -stepwise-, there is no guarantee that the models it ends up selecting are nested, and nesting is a requirement for -lrtest-. So you have to choose: either -lrtest- or -stepwise-, but you cannot do both.
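
              A sketch of the -lrtest- route, without -stepwise-, could look like this. It reuses the variable names from the commands in #6; the local macro common, the marker variable insample, and the stored names full and reduced are hypothetical. Because it uses a local macro and line continuations, it has to be run from a do-file.

              ```stata
              * Covariates common to both models (taken from the commands in #6)
              local common i.sex i.cddAge i.CommunityRelationshipCATALL resilience ///
                  Personality_Trait i.OccupationCATALL i.Volunteer_Selection2      ///
                  i.Gift_to_CDD ib3.Common_ProblemsCAT3 ib2.Performance2           ///
                  ib3.Supervision2 i.Cost_to_CDD

              * Full model, fitted directly rather than through -stepwise-
              ologit Lenth_of_service `common' i.EducationCATALL if cdd_country==1, or
              estimates store full
              generate byte insample = e(sample)   // hypothetical marker for the estimation sample

              * Reduced model on exactly the same observations
              ologit Lenth_of_service `common' if insample, or
              estimates store reduced

              lrtest reduced full
              ```

              Restricting the reduced model to the full model's estimation sample keeps the two fits comparable when i.EducationCATALL has missing values.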
              ---------------------------------
              Maarten L. Buis
              University of Konstanz
              Department of history and sociology
              box 40
              78457 Konstanz
              Germany
              http://www.maartenbuis.nl
              ---------------------------------

              • #8
                @Maarten Buis, thanks so much. I had a consultation session with one of our biostatistics consultants, and the Professor gave the same reason. I have been able to figure it out.

                • #9
                  Hello, I encountered a similar problem, where "df(unrestricted) = df(restricted) = 2", but the results of the two models were different. Is it correct to interpret this as indicating that the classroom-level variance is not significant and that the model without the classroom level (model 1) should therefore be selected? Thank you.
                  Model1 output:
                  Code:
                  . melogit c_negative_b || Child_ID:
                  
                  Fitting fixed-effects model:
                  
                  Iteration 0:   log likelihood = -1503.7259  
                  Iteration 1:   log likelihood = -1503.0722  
                  Iteration 2:   log likelihood =  -1503.072  
                  Iteration 3:   log likelihood =  -1503.072  
                  
                  Refining starting values:
                  
                  Grid node 0:   log likelihood = -1525.5275
                  
                  Fitting full model:
                  
                  Iteration 0:   log likelihood = -1525.5275  
                  Iteration 1:   log likelihood = -1503.4376  
                  Iteration 2:   log likelihood = -1500.2564  
                  Iteration 3:   log likelihood = -1500.2165  
                  Iteration 4:   log likelihood = -1500.2164  
                  
                  Mixed-effects logistic regression               Number of obs     =      3,660
                  Group variable:        Child_ID                 Number of groups  =        610
                  
                                                                  Obs per group:
                                                                                min =          6
                                                                                avg =        6.0
                                                                                max =          6
                  
                  Integration method: mvaghermite                 Integration pts.  =          7
                  
                                                                  Wald chi2(0)      =          .
                  Log likelihood = -1500.2164                     Prob > chi2       =          .
                  ------------------------------------------------------------------------------
                  c_negative_b |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                  -------------+----------------------------------------------------------------
                         _cons |  -1.864403   .0616863   -30.22   0.000    -1.985306   -1.743501
                  -------------+----------------------------------------------------------------
                  Child_ID     |
                     var(_cons)|   .2162702    .101452                      .0862382    .5423678
                  ------------------------------------------------------------------------------
                  LR test vs. logistic model: chibar2(01) = 5.71        Prob >= chibar2 = 0.0084
                  
                  . estimates store model1

                  Model2 output:
                  Code:
                  . melogit c_negative_b || Class_ID: || Child_ID:
                  
                  Fitting fixed-effects model:
                  
                  Iteration 0:   log likelihood = -1503.7259  
                  Iteration 1:   log likelihood = -1503.0722  
                  Iteration 2:   log likelihood =  -1503.072  
                  Iteration 3:   log likelihood =  -1503.072  
                  
                  Refining starting values:
                  
                  Grid node 0:   log likelihood = -1533.8179
                  
                  Fitting full model:
                  
                  Iteration 0:   log likelihood = -1533.8179  (not concave)
                  Iteration 1:   log likelihood = -1517.7318  (not concave)
                  Iteration 2:   log likelihood =    -1490.6  
                  Iteration 3:   log likelihood = -1488.3781  
                  Iteration 4:   log likelihood = -1488.2153  
                  Iteration 5:   log likelihood = -1488.1259  
                  Iteration 6:   log likelihood = -1488.1189  
                  Iteration 7:   log likelihood = -1488.1156  (backed up)
                  Iteration 8:   log likelihood = -1488.1148  (backed up)
                  Iteration 9:   log likelihood = -1488.1146  (backed up)
                  Iteration 10:  log likelihood = -1488.1146  (backed up)
                  Iteration 11:  log likelihood = -1488.1146  (backed up)
                  Iteration 12:  log likelihood = -1488.1146  (backed up)
                  Iteration 13:  log likelihood = -1488.1146  (backed up)
                  Iteration 14:  log likelihood = -1488.1146  (backed up)
                  Iteration 15:  log likelihood = -1488.1146  (backed up)
                  Iteration 16:  log likelihood = -1488.1146  (backed up)
                  Iteration 17:  log likelihood = -1488.1146  (backed up)
                  Iteration 18:  log likelihood = -1488.1146  (backed up)
                  Iteration 19:  log likelihood = -1488.1146  (backed up)
                  Iteration 20:  log likelihood = -1488.1146  (backed up)
                  Iteration 21:  log likelihood = -1488.1146  (backed up)
                  Iteration 22:  log likelihood = -1488.1146  (not concave)
                  Iteration 23:  log likelihood = -1488.1146  (not concave)
                  Iteration 24:  log likelihood = -1488.1146  (backed up)
                  Iteration 25:  log likelihood = -1488.1146  (not concave)
                  Iteration 26:  log likelihood = -1488.1146  (not concave)
                  Iteration 27:  log likelihood = -1488.1146  (backed up)
                  Iteration 28:  log likelihood = -1488.1146  (backed up)
                  Iteration 29:  log likelihood = -1488.1146  (backed up)
                  Iteration 30:  log likelihood = -1488.1146  (not concave)
                  Iteration 31:  log likelihood = -1488.1146  (not concave)
                  Iteration 32:  log likelihood = -1488.1146  (not concave)
                  Iteration 33:  log likelihood = -1488.1146  (backed up)
                  Iteration 34:  log likelihood = -1488.1146  (not concave)
                  Iteration 35:  log likelihood = -1488.1146  (not concave)
                  Iteration 36:  log likelihood = -1488.1146  (backed up)
                  Iteration 37:  log likelihood = -1488.1146  (not concave)
                  Iteration 38:  log likelihood = -1488.1146  (not concave)
                  Iteration 39:  log likelihood = -1488.1146  (not concave)
                  Iteration 40:  log likelihood = -1488.1146  (not concave)
                  Iteration 41:  log likelihood = -1488.1146  (not concave)
                  Iteration 42:  log likelihood = -1488.1146  (not concave)
                  Iteration 43:  log likelihood = -1488.1146  (not concave)
                  Iteration 44:  log likelihood = -1488.1146  (not concave)
                  Iteration 45:  log likelihood = -1488.1146  
                  Iteration 46:  log likelihood =  -1488.114  (not concave)
                  Iteration 47:  log likelihood = -1488.1139  (not concave)
                  Iteration 48:  log likelihood = -1488.1137  (not concave)
                  Iteration 49:  log likelihood = -1488.1136  
                  Iteration 50:  log likelihood = -1488.1051  (not concave)
                  Iteration 51:  log likelihood = -1488.1051  (not concave)
                  Iteration 52:  log likelihood = -1488.1041  (not concave)
                  Iteration 53:  log likelihood = -1488.1026  (not concave)
                  Iteration 54:  log likelihood = -1488.0496  
                  Iteration 55:  log likelihood = -1488.0488  (not concave)
                  Iteration 56:  log likelihood = -1488.0485  (not concave)
                  Iteration 57:  log likelihood = -1488.0484  (not concave)
                  Iteration 58:  log likelihood = -1488.0483  
                  Iteration 59:  log likelihood = -1488.0429  
                  Iteration 60:  log likelihood = -1488.0429  (not concave)
                  Iteration 61:  log likelihood = -1488.0429  (backed up)
                  
                  Mixed-effects logistic regression               Number of obs     =      3,660
                  
                  -------------------------------------------------------------
                                  |     No. of       Observations per Group
                   Group Variable |     Groups    Minimum    Average    Maximum
                  ----------------+--------------------------------------------
                         Class_ID |         94         12       38.9         48
                         Child_ID |        610          6        6.0          6
                  -------------------------------------------------------------
                  
                  Integration method: mvaghermite                 Integration pts.  =          7
                  
                                                                  Wald chi2(0)      =          .
                  Log likelihood = -1488.0429                     Prob > chi2       =          .
                  -----------------------------------------------------------------------------------
                       c_negative_b |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                  ------------------+----------------------------------------------------------------
                              _cons |  -1.863686   .0708677   -26.30   0.000    -2.002584   -1.724788
                  ------------------+----------------------------------------------------------------
                  Class_ID          |
                          var(_cons)|   .2150221   .0646483                      .1192785    .3876183
                  ------------------+----------------------------------------------------------------
                  Class_ID>Child_ID |
                          var(_cons)|   3.17e-33   2.02e-17                             .           .
                  -----------------------------------------------------------------------------------
                  LR test vs. logistic model: chibar2(01) = 30.06       Prob >= chibar2 = 0.0000
                  
                  . estimates store model2

                  LR test and output:
                  Code:
                  . lrtest model1 model2
                  df(unrestricted) = df(restricted) = 2
                  r(498);
                  
                  end of do-file
                  Last edited by Ryan Luo; 27 Aug 2024, 20:51.

                  • #10
                    Ryan, notice how the estimated variance for the child specific effect is basically zero. Is this longitudinal data?

                    • #11
                      Hi Jeff, yes, it is longitudinal data. The data has three levels: classroom, children, and time. Model 1 includes two levels: children and time; Model 2 includes all three levels. I wanted to run an LR test to determine if the classroom level is needed, but the test failed to run, so I'm not sure if the classroom level is significant or should be added to the model.

                      • #12
                        For whatever reason, the variance for children is estimated to be basically zero. That means the two models are not really nested. The more general model is collapsing to the simpler version, which is why it's reporting two degrees-of-freedom in each case. You need to find out why the child-specific variance estimate is zero.
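
                        One possible starting point, sketched under assumptions: check whether the three-level fit actually converged, see whether the zero estimate survives a change in the number of integration points, and compare it with the class-only model it appears to collapse to. The value 15 is arbitrary, and the stored names model2_ip15 and class_only are hypothetical.

                        ```stata
                        * Refit the three-level model with more integration points (15 is an arbitrary choice)
                        melogit c_negative_b || Class_ID: || Child_ID:, intpoints(15)
                        display "converged = " e(converged)
                        estimates store model2_ip15          // hypothetical name

                        * Class-only model: what the three-level fit collapses to if var(Child) is 0
                        melogit c_negative_b || Class_ID:
                        estimates store class_only           // hypothetical name

                        * Compare log likelihoods and information criteria across the fits
                        estimates stats model1 model2 model2_ip15 class_only
                        ```

                        If the class-only log likelihood matches that of the three-level model, that supports the reading that the child-level variance is being estimated at the boundary.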

                        • #13
                          Are students tested multiple times within the same classroom? Or, do you have one test per grade? If it's the latter, the problem is that children are not truly nested within class. If it's the former, then I don't know why that's happening.

                          • #14
                            Thank you, Jeff. I see, so the low variance for children means the time level and the child level are not really nested. It's the former: students were tested multiple times within the same classroom.
