xtmelogit questions-imputed data set, results of groups and diagnostics

Andrew Hudacs

Join Date: Apr 2016

Posts: 4
#1


03 May 2016, 14:44

Hello Stata Forum,

Thank you in advance for reading and contemplating this forum entry.
I am building a cross-classified multilevel logistic model with random effects on a 5000 case data set with 20 imputations to measure college completion for students from different towns and high schools using Stata 14.1.

Here is an example of the syntax or commands:

mi estimate, or: xtmelogit Completion Female FRL SAT FRL_TIITownRural SIISES SIIRigor TIITownVillage TIITownRural || _all: R.School || Town:

Description of priority variables: “Completion” is an indicator of college completion with dichotomous values; “School” and “Town” are identifying codes for the groups of schools and towns for each student; all of the level two variable titles begin with “SII” for school variables and “TII” for town variables; FRL_TIITownRural is an interaction between a student variable (dichotomous and at level 1) and a group variable (dichotomous and at level 2).

When I started building the model, I used the syntax “_all: R.School || _all: R.Town” for the level two groups, but encountered errors such as
“Hessian is not negative semidefinite model did not converge on m=1 r(430)”
or
“could not calculate numerical derivatives -- discontinuous region with missing values encountered
could not calculate numerical derivatives -- discontinuous region with missing values encountered
model did not converge on m=1 r(430)” ,

depending on the level one variables added.

In an effort to be more computationally efficient (following the Stata Multilevel Mixed-Effects Reference Manual, Release 13; Rabe-Hesketh & Skrondal (2012); and entries in the Stata Forum), I changed the syntax to “_all: R.School || Town:”. The level 2 group “School” has about one third as many clusters as “Town”. The syntax yields results on a data set with 2 imputations, but not on a data set with 20 imputations. Depending on the variables added to the model, the analysis may run for dozens of hours without converging.
Questions:
1. Is there something additional I need to specify with the imputed data sets before running the xtmelogit command?

2. When I do receive results, “Schools” are represented as one group.

----------------------------------------------------------------------------
                |   No. of       Observations per Group       Integration
 Group Variable |   Groups    Minimum    Average    Maximum      Points
----------------+-----------------------------------------------------------
           _all |        1      5,000    5,000.0      5,000           1
           Town |      200          1       25.2        147           1
----------------------------------------------------------------------------

Are the random effects parameter estimates still interpretable as crossed effects for both School and Town, or only for one of the groups?

3. As I add more variables to the model, the standard error for Town increases substantially. I am not sure if this is due to estimation problems or an incorrectly specified covariance structure. Is there a way I can use the “covariance(unstructured)” option? I currently get a “not allowed” r(198) error.

Your advice or guidance is appreciated.

Thank you,
Andrew

Rabe-Hesketh, S. & Skrondal, A. (2012). Multilevel and longitudinal modeling using Stata: Volume II: Categorical responses, counts, and survival (3rd ed.). College Station, TX: Stata Press.
StataCorp. (2013). Stata multilevel mixed effects reference manual release 13. College Station, TX: StataCorp LP.

Last edited by Andrew Hudacs; 03 May 2016, 14:47. Reason: Edited to improve formatting.
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#2

03 May 2016, 19:42

Welcome to Statalist, Andrew. Your post is very difficult to read, with a jumble of fonts, font sizes, font faces, and table columns that do not match up. Read FAQ 12, which directs posters to copy all commands, listings, and results between CODE delimiters: opening [C O D E], closing [/C O D E], but with spaces removed. Use just one font and font size for everything else.

Last edited by Steve Samuels; 03 May 2016, 19:48.

Steve Samuels
Statistical Consulting

Stata 14.2
Andrew Hudacs

Join Date: Apr 2016

Posts: 4
#3

04 May 2016, 10:04

Steve, thank you very much for explaining the formatting problem and for your patience with a new forum member.

Here is another attempt at a legible post:
I am building a cross-classified multilevel logistic model with random effects on a 5,000 case data set with 20 imputations to measure college completion for students of different towns and schools using Stata 14.1.
Here is an example of a command:

Code:

mi estimate, or: xtmelogit Completion Female FRL SAT FRL_TIITownRural SIISES SIIRigor TIITownVillage TIITownRural || _all: R.School || Town:

Here is a brief description of priority variables: "Completion" is an indicator of college completion with dichotomous/binary values; "School" and "Town" are identifier variables for the groups of schools and towns for each student; all of the level two variable titles begin with "SII" for school group variables and "TII" for town group variables; FRL_TIITownRural is an interaction between a student/individual variable (FRL, dichotomous or binary at level 1) and a Town group variable (TIITownRural, dichotomous or binary at level 2).
I started the model building process using the level 2 commands,

Code:

_all: R.School || _all: R.Town

but as the model became more complex I encountered errors such as,

Code:

Hessian is not negative semidefinite
model did not converge on m=1
r(430)

or

Code:

could not calculate numerical derivatives -- discontinuous region with missing values encountered
could not calculate numerical derivatives -- discontinuous region with missing values encountered
model did not converge on m=1
r(430)

In an effort to be more computationally efficient (following the Stata Multilevel Mixed-Effects Reference Manual, Release 13; Rabe-Hesketh & Skrondal (2012); and posts on the Stata Forum; references below), I changed the level 2 group specification to

Code:

_all: R.School || Town:

The level 2 group "School" has about one third as many clusters as "Town". This specification yields results on a data set with 2 imputations, but not on a data set with 20 imputations. Depending on the variables added to the model, the analysis may run for hours or days without converging.
Questions:
Is there something additional I need to specify with the imputed data sets before running the xtmelogit command?

When I do receive results using the specification "_all: R.School || Town:", "Schools" are represented as one group. Are the random-effects parameter estimates still interpretable as crossed effects for both "School" and "Town", or only for one of the groups? Here is an example of results:

Code:

Group Variable   No. of Groups   Obs. Minimum   Obs. Average   Obs. Maximum   Integration Points
_all                         1          5,000        5,000.0          5,000                    1
Town                       200              1           25.2            147                    1

As I add more variables to the model, the standard error for "Town" increases substantially. I am not sure if this is due to estimation problems or an incorrectly specified covariance structure. In my efforts to diagnose the issue, I receive the error "not allowed r(198)" when I use the option covariance(unstructured).

Your advice or guidance is appreciated.
Thank you,
Andrew

Rabe-Hesketh, S. & Skrondal, A. (2012). Multilevel and longitudinal modeling using Stata: Volume I: Continuous responses (3rd ed.). College Station, TX: Stata Press.
StataCorp. (2013). Stata multilevel mixed effects reference manual release 13. College Station, TX: StataCorp LP.
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#4

04 May 2016, 15:11

To answer your questions:

1. I don't think that you need anything else but mi estimate: before xtmelogit. However, the Stata 14 Manual lists only meqrlogit as a multilevel logit command taking the mi prefix; I don't know about Stata 13. You will need additional mi prefixes for post-estimation commands like predict and margins.

2. Schools must be nested in towns, yes? (If not, explain.) If so, schools cannot be crossed with towns. According to Example 11 on page 397 of the Stata 14 Manual (link below), where states are nested in regions, the efficient random effects specification is

Code:

|| town: R.school

3. I don't think that unstructured correlations are plausible in a strictly nested model, but exchangeable correlations are.
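
On the convergence failures reported for m=1, a possible way to narrow them down is sketched below. It uses the variable names from post #3 with a shortened covariate list chosen only for illustration, and it has not been tested on these data. The noisily and imputations() options of mi estimate and the mi xeq prefix are standard; the /// line continuation assumes the commands are run from a do-file.

```stata
* Show the estimation output for each imputation used,
* restricted here to imputations 1 and 2 (illustrative choice)
mi estimate, or noisily imputations(1/2): ///
    meqrlogit Completion Female FRL SAT || _all: R.School || Town:

* Fit the same model on imputation 1 alone to inspect its iteration log
mi xeq 1: meqrlogit Completion Female FRL SAT || _all: R.School || Town:
```

If imputation 1 fails on its own, the trouble most likely lies in the model or in that imputed data set rather than in the mi prefix.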

Reference:
www.stata.com/manuals14/me.pdf

Last edited by Steve Samuels; 04 May 2016, 15:23.

Steve Samuels
Statistical Consulting

Stata 14.2
Andrew Hudacs

Join Date: Apr 2016

Posts: 4
#5

05 May 2016, 12:04

Thank you very much for the suggestions and link to the updated manual.
The analysis is cross-classified because students from the same town can attend different schools, but students are only members of one town and one school (no multiple memberships).

I ran comparative models with the commands xtmelogit and meqrlogit and so far have found identical results with similar models using level 1 predictor variables.

Follow up questions:
Are the xtmelogit and meqrlogit commands synonymous (but from different versions), or do they run different analyses?
In the previous post you suggested exchangeable correlations. I couldn't find it in the manual. Did you mean exchangeable covariance? I tried this option and received the following error:

Code:

option cov(exchangeable) not allowed r(198)

Thanks in advance.
Clyde Schechter

Join Date: Apr 2014

Posts: 30119
#6

05 May 2016, 13:06

Just kind of butting in here, but I don't think you've answered Steve's questions and concerns.

OK, students from the same town can attend different schools. But that doesn't make them cross-classified; in particular, it doesn't exclude the possibility that schools are nested in towns. Can students at the same school come from different towns? If not, schools are nested in towns and the top level structure should look like -|| town: || school:-
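
A quick way to check this in the data is sketched below. It assumes School and Town are the non-imputed identifier variables described in post #3; n_towns and n_schools are names made up for the illustration.

```stata
preserve
* Reduce the data to one row per School-Town pair that actually occurs
contract School Town
* Number of different towns sending students to each school
bysort School: generate n_towns = _N
* Number of different schools attended by students of each town
bysort Town: generate n_schools = _N
* A maximum of 1 for n_towns would mean schools are nested in towns
summarize n_towns n_schools
restore
```

If both maxima exceed 1, neither factor is nested in the other and a crossed specification is needed.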

-xtmelogit- is the old name for -meqrlogit-, so those two are the same command. -melogit- is a newer command that uses a different numerical algorithm for estimation (and, when both converge, gives strongly similar but not always exactly identical results)--but it estimates the same models.

You don't show us the full code that led to the " option cov(exchangeable) not allowed" error message; you just say that you tried the option. But perhaps you put it in the wrong place or specified it incorrectly in some way. So it is best to show the complete exact command that generated the error message, copied directly from your Results window or log file and pasted into a code block here without any editing.
Andrew Hudacs

Join Date: Apr 2016

Posts: 4
#7

05 May 2016, 14:28

Hi Clyde,
Thanks for jumping in.

With regard to cross-classification: yes, students at the same school can be from different towns, and students from the same town can attend different schools. Not every town has a school, and some towns have students who choose which school to attend. For the most part, students from the same town attend the same school. However, there is a small percentage of cases where students from the same town do not attend the same school. There are fewer schools than towns, so there is essentially an imperfect hierarchy or nesting of towns within schools. Multiple membership possibilities are eliminated because all students live in only one town and attend only one school.

Thank you for clarifying the meqrlogit question.

Regarding cov(exchangeable), I may have found the proper placement in the command.
Example:

Code:

mi estimate, or: meqrlogit Completion Female FRL SAT || _all: R.School, cov(exchangeable) || Town:

Thanks again.