mixed miscounts ID variables in a nested random effects model.

paulvonhippel

Join Date: Apr 2014

Posts: 504
#1

18 Aug 2023, 05:44

Hi all. I'm attaching a dataset which has up to 6 reading test scores (R_THETA) nested in each of 5995 children (CHILDID), who are nested in each of 446 schools (S2_ID). Yet the output from the mixed procedure reports that both CHILDID and S2_ID have 5931 groups, as though every child is in their own school.

The output doesn't make sense in other ways, either. Convergence is not achieved, and the child and school variances are reported as practically equal. (In fact they wouldn't be distinguishable if every child were in their own school.)

What is going on here? Is there a bug in mixed, or am I doing something wrong?

I'm attaching the data and log. Here are the commands I'm running.

Code:

log using mixed_error, replace
use reduced, clear
/* This dataset has up to 6 reading test scores (R_THETA) nested in each of
   5995 children (CHILDID) nested in each of 446 schools (S2_ID) */
ssc install distinct
distinct CHILDID S2_ID
/* CHILDID has 5995 distinct values, S2_ID has 446 */
list if missing(S2_ID) | missing(CHILDID)
/* Neither has any missing values */
mixed R_THETA || CHILDID: || S2_ID:, iter(5)
/* but mixed reports that both CHILDID and S2_ID have 5931 groups,
   as though every child is in their own school */
/* The CHILDID and S2_ID variances are reported as practically equal,
   and convergence was not achieved (even if I let it run for more than 5 iterations).
   In fact the variances would not be distinguishable if every child was in their own school. */
/* Note that I used the iter(5) option because it doesn't converge (not concave). */
log close

Attached Files

mixed_error.smcl (4.6 KB, 1 view)

reduced.dta (662.9 KB, 1 view)

Last edited by paulvonhippel; 18 Aug 2023, 05:56.
Andrew Musau

Join Date: Oct 2014

Posts: 10298
#2

18 Aug 2023, 07:45

Deleted

Last edited by Andrew Musau; 18 Aug 2023, 07:49.

Andrew Musau

Join Date: Oct 2014

Posts: 10298
#3

18 Aug 2023, 09:02

This is just a difficult maximization. Newton-Raphson fails quite often with "badly-behaved" integrands. Below, adaptive quadrature appears to do the trick (after some rescaling of your variable; you could also use the raw values), but you need to be patient.

Convergence is not achieved, and the child and school variances are reported as practically equal.

If convergence is not achieved, do not read anything into the results.

Code:

ssc install gllamm, replace
use "reduced.dta", clear
drop if missing(R_THETA)
replace R_THETA = int(R_THETA*1000)
gllamm R_THETA, i(CHILDID S2_ID) adapt

Results:

Code:

. gllamm R_THETA, i( CHILDID S2_ID ) adapt

Running adaptive quadrature
Iteration 0:    log likelihood = -336208.14
Iteration 1:    log likelihood = -335554.92
Iteration 2:    log likelihood = -335321.64
Iteration 3:    log likelihood = -335141.64
Iteration 4:    log likelihood = -335118.27
Iteration 5:    log likelihood = -335118.07


Adaptive quadrature has converged, running Newton-Raphson
Iteration 0:   log likelihood = -335118.07  (not concave)
Iteration 1:   log likelihood = -335118.07  (not concave)
Iteration 2:   log likelihood = -335030.54  
Iteration 3:   log likelihood = -335026.66  
Iteration 4:   log likelihood = -335026.51  
Iteration 5:   log likelihood = -335026.51  
 
number of level 1 units = 40638
number of level 2 units = 5931
number of level 3 units = 444
 
Condition Number = 3410.5986
 
gllamm model 
 
log likelihood = -335026.51
 
------------------------------------------------------------------------------
     R_THETA | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |    276.155   12.41826    22.24   0.000     251.8157    300.4944
------------------------------------------------------------------------------
 
Variance at level 1
------------------------------------------------------------------------------

  765837.49 (5873.7663)
 
Variances and covariances of random effects
------------------------------------------------------------------------------

 
***level 2 (CHILDID)
 
    var(1): 89764.834 (4262.5619)
 
***level 3 (S2_ID)
 
    var(1): 92618.48 (6900.0865)
------------------------------------------------------------------------------


Clyde Schechter

Join Date: Apr 2014

Posts: 30192
#4

18 Aug 2023, 09:48

Code:

mixed R_THETA || CHILDID: || S2_ID:, iter(5)

is a model that has schools nested within children. O.P. wants it the other way around. It should be:

Code:

mixed R_THETA || S2_ID: || CHILDID:
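
The reason the order matters is that -mixed- reads the random-effects equations from the highest level down, so the first `||` should name the school and the second the child. As a minimal sketch (assuming reduced.dta from #1 is in the working directory, and assuming the intent is strict nesting of children in schools), one could confirm the nesting before fitting:

```stata
* Sketch only: assumes reduced.dta from #1 is in the working directory
use reduced, clear

* Is each child observed in exactly one school? Sort schools within child and
* compare the first and last value; assert stops with an error if any differ.
bysort CHILDID (S2_ID): assert S2_ID[1] == S2_ID[_N]

* Highest level (school) first, then children within schools
mixed R_THETA || S2_ID: || CHILDID:
```

If the assertion fails, some children appear in more than one school, the data are not strictly nested, and a different specification would be needed.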
paulvonhippel

Join Date: Apr 2014

Posts: 504
#5

18 Aug 2023, 10:33

I don't think it's a question of model specification or maximization method. Let me clarify the problem that I meant to highlight. The -mixed- procedure seems to think that the number of CHILDIDs and S2_IDs is equal -- in fact, that the CHILDIDs and S2_IDs are the same. Here is the key part of the output from -mixed-. The statistics for CHILDID and S2_ID are identical. They should not be.

-------------------------------------------------------------
                |     No. of       Observations per Group
 Group Variable |     Groups    Minimum    Average    Maximum
----------------+--------------------------------------------
        CHILDID |      5,931          1        6.9          8
          S2_ID |      5,931          1        6.9          8
-------------------------------------------------------------

If in fact the CHILDID and S2_ID were the same, then it would be impossible to distinguish the child-level and school-level variance. I think that's why the model is not converging.

But in fact, the CHILDID and S2_ID are not the same. For example, if you type distinct CHILDID S2_ID, you get this:

         |        Observations
         |      total   distinct
---------+----------------------
 CHILDID |      47960       5995
   S2_ID |      47960        446

Why isn't mixed getting the count of the S2_ID right? I think this must be the source of the problem.

Last edited by paulvonhippel; 18 Aug 2023, 10:39.
paulvonhippel

Join Date: Apr 2014

Posts: 504
#6

18 Aug 2023, 10:49

Wait, hold it, I think Clyde Schechter has nailed it. I just needed to reverse the order of CHILDID and S2_ID in the -mixed- statement. Then the command converges quickly....
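
For anyone landing here later, a rough sketch of why the two group counts matched in the original command. With `|| CHILDID: || S2_ID:`, the second level is S2_ID within CHILDID, so -mixed- counts child-by-school combinations rather than schools. The variable names first_child, first_combo and first_school below are made up for illustration, and the sketch assumes the estimation sample is simply the rows with non-missing R_THETA:

```stata
* Sketch only: assumes reduced.dta from #1 is in the working directory
use reduced, clear

* mixed drops rows with a missing outcome, so mimic that sample
keep if !missing(R_THETA)

* Flag one row per child, per child-by-school combination, and per school
egen byte first_child  = tag(CHILDID)
egen byte first_combo  = tag(CHILDID S2_ID)
egen byte first_school = tag(S2_ID)

* Number of children in the sample
count if first_child
* Number of child-by-school combinations (the "S2_ID" groups in the original command)
count if first_combo
* Number of schools
count if first_school
```

If every child stays in one school, the first two counts coincide, which would be consistent with the identical CHILDID and S2_ID rows in the -mixed- table above.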