ICC in mi estimate meqrlogit

Insa Linnea

Join Date: Nov 2018

Posts: 9
#1

30 Nov 2018, 08:38

Hi all,

I have successfully imputed values and run mi estimate: meqrlogit for a multilevel analysis, but now I am having trouble investigating the variance.

With the unimputed data set I used the "estat icc" command to get the intraclass correlations, but this command does not work with the imputed data set. How can I assess the variance?

Can someone help me, please?
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#2

30 Nov 2018, 09:45

I've hand-calculated it in the past. Per the manual, the ICC for a two-level logit model is: (variance of the random effect) / [ (pi^2 / 3) + variance of random effect]
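Written out, with $\sigma_u^2$ denoting the variance of the random intercept (the symbols are my own notation, not necessarily the manual's), that formula reads:

```latex
\rho = \frac{\sigma_u^2}{\sigma_u^2 + \pi^2/3}
```

The $\pi^2/3 \approx 3.29$ term is the variance of the standard logistic distribution, which stands in for the level-1 residual variance on the latent scale.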

Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
Insa Linnea

Join Date: Nov 2018

Posts: 9
#3

30 Nov 2018, 10:53

Thank you! Just to make sure I've got it correctly. I got the variance of the random effect by typing the command "mi estimate, variance" which gave me the following table.

I now calculate: 0.11 / [(pi^2 / 3) + 0.11] = 0.032

Thank you so much for your help.

Weiwen Ng

Join Date: Jun 2015
Posts: 1241
#4

30 Nov 2018, 11:37

Insa, that appears correct. To demonstrate outside of MI with a stock Stata dataset (irrelevant output omitted):

Code:

webuse bangladesh
meqrlogit c_use urban age child* || district:

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
district: Identity           |
                  var(_cons) |   .2156188   .0733234      .1107202    .4199007
------------------------------------------------------------------------------

estat icc

------------------------------------------------------------------------------
                       Level |        ICC   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
                    district |   .0615089   .0196302      .0325591    .1131878
------------------------------------------------------------------------------

di c(pi)
3.1415927

di .2156188 / (c(pi) ^ 2 / 3 + .2156188)

.06150894

Note that c(pi) is a built-in system value that stores (unsurprisingly) pi, and di is short for the display command. The manually calculated ICC matches the one from estat icc to the precision shown; any remaining difference is rounding error.

This is probably not necessary in practice, but if you want to really impress your collaborators, then you can run this code:

Code:

mi estimate, post: meqrlogit ...
matrix list e(b)
matrix b = e(b)

The variance of the random intercept won't be directly visible in e(b), which is the matrix containing all the parameter estimates. Instead, you'll see a column with a strange heading like lns1_1_1:_cons. That is the log of the standard deviation of the random intercept, so exponentiating and squaring it gives the variance. Matrix elements are always addressed by row, then column, so count which column that coefficient is in. In the example below, it's the 7th column. You can then run:

Code:

matrix list e(b)
e(b)[1,7]
           eq1:        eq1:        eq1:        eq1:        eq1:        eq1:   lns1_1_1:
         urban         age      child1      child2      child3       _cons       _cons
y1   .73227642  -.02649815   1.1160015   1.3658951   1.3440312  -1.6892896  -.76712159

matrix b = e(b)

di exp(b[1,7])^2
.21561882

di exp(b[1,7])^2 / (c(pi)^2 / 3 + exp(b[1,7])^2)
.06150895
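Counting columns is easy to get wrong, so it may be safer to look the coefficient up by name. A minimal sketch, assuming the output above comes from the bangladesh example dataset (its variable names match) and that the log-SD equation is named lns1_1_1 as shown in e(b); I would expect the same _b[] lookup to work after mi estimate, post, but check the column names for your own model:

```stata
webuse bangladesh, clear
meqrlogit c_use urban age child* || district:

* Look the coefficient up by equation:name instead of by column position
scalar var_u = exp(_b[lns1_1_1:_cons])^2

* Random-intercept variance, then the ICC on the latent logistic scale
di var_u
di var_u / (c(pi)^2 / 3 + var_u)
```

If the model has more than one random effect, e(b) will contain several lns equations, and this two-term formula no longer applies as is.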


Insa Linnea

Join Date: Nov 2018

Posts: 9
#5

30 Nov 2018, 12:17

Thank you Weiwen, that really helps a lot. One more question. How can I get the confidence interval for the ICC? The manual you mentioned gives me a formula but my results are negative.
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#6

30 Nov 2018, 12:51

Originally posted by Insa Linnea

Thank you Weiwen, that really helps a lot. One more question. How can I get the confidence interval for the ICC? The manual you mentioned gives me a formula but my results are negative.

I haven't seen anybody be that concerned about the confidence interval around the ICC. Nonetheless, the formula in the manual works on the logit scale: logit(ICC) +/- 1.96 * SE(ICC) / [ICC * (1 - ICC)]. The limits are then back-transformed with the inverse logit, so negative values before that step are expected whenever the ICC is below 0.5. You can obtain the variance of the log standard deviation of the random intercept from the matrix e(V). In the example above, it's v[7,7] (you have to save e(V) as a different matrix to access its components). From there, it isn't simple algebra to get the SE. I checked.
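For a model with a single random intercept, there seems to be a shortcut that sidesteps the SE of the ICC itself. Because ICC = var / (var + pi^2/3), logit(ICC) = 2*ln(sd) - ln(pi^2/3), so the SE of logit(ICC) is just twice the SE of the log-SD coefficient. A minimal sketch on the unimputed example, assuming the bangladesh dataset is the one behind the output shown earlier and that the equation is named lns1_1_1:

```stata
webuse bangladesh, clear
meqrlogit c_use urban age child* || district:

* Log SD of the random intercept and its standard error
scalar lns    = _b[lns1_1_1:_cons]
scalar se_lns = _se[lns1_1_1:_cons]

* logit(ICC) is linear in ln(sd), so its SE is 2 * SE(ln sd)
scalar lgt    = 2*lns - ln(c(pi)^2 / 3)
scalar se_lgt = 2*se_lns

* Back-transform the point estimate and the 95% limits to the ICC scale
di "ICC   = " invlogit(lgt)
di "lower = " invlogit(lgt - invnormal(0.975)*se_lgt)
di "upper = " invlogit(lgt + invnormal(0.975)*se_lgt)
```

On that example the limits should closely match the estat icc interval shown earlier in the thread. After mi estimate, post the same lines would run on the combined estimates; whether a normal critical value is appropriate there, rather than a t value with the MI degrees of freedom, is something I have not verified.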

If anybody is that concerned about getting a confidence interval around the ICC, then show them this thread and ask them to do the math themselves.

daniel klein

Join Date: Mar 2014

Posts: 3842
#7

30 Nov 2018, 14:21

I would not be so sure that computing the ICC from the combined results is necessarily the most appropriate way, especially when you are also interested in statistical inference. Perhaps there is a good reason why estat icc does not work with mi. Perhaps it would be more consistent with the mi framework to compute the ICC in each imputed dataset, then combine the results using Rubin's rules? You might then want to apply some sort of transformation first, given that the ICC probably does not follow a normal distribution, which Rubin's rules assume when combining results from multiply imputed data.
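One way to try this within mi, as far as I can tell from the mi estimate documentation, is its transformation syntax, which evaluates an nlcom-style expression in each imputed dataset and pools the results with Rubin's rules. A sketch, assuming a single random intercept whose log-SD equation is named lns1_1_1, and assuming c(pi) is accepted inside the expression; y, x1, x2 and cluster are hypothetical placeholder names:

```stata
* Pool logit(ICC) = 2*ln(sd) - ln(pi^2/3) across the imputed datasets.
* y, x1, x2 and cluster are placeholders for your own variables.
mi estimate (logit_icc: 2*_b[lns1_1_1:_cons] - ln(c(pi)^2 / 3)): meqrlogit y x1 x2 || cluster:
```

The pooled estimate and its confidence limits are then on the logit scale; applying invlogit() to each returns them to the ICC scale.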

Just some thoughts ...

Best
Daniel
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#8

30 Nov 2018, 19:18

Originally posted by daniel klein

I would not be so sure that computing the ICC from the combined results is necessarily the most appropriate way, especially when you are also interested in statistical inference. Perhaps there is a good reason why estat icc does not work with mi. Perhaps it would be more consistent with the mi framework to compute the ICC in each imputed dataset, then combine the results using Rubin's rules? You might then want to apply some sort of transformation first, given that the ICC probably does not follow a normal distribution, which Rubin's rules assume when combining results from multiply imputed data.

Just some thoughts ...

Best
Daniel

Daniel, this is a good point.

My rationale, which I should have stated, is this. Under mi estimate, the variance components of the mixed model are already combined using Rubin's rules. The ICC is calculated directly from the variance components. Hence, I was thinking that one can just calculate the ICC from the MI-based point estimate of the variance components.

What's your opinion on that?

daniel klein

Join Date: Mar 2014

Posts: 3842
#9

01 Dec 2018, 03:07

Weiwen

The way I see it, MI is essentially about estimating population parameters (i.e., point estimates) of interest and obtaining valid standard errors for them for statistical inference. From this perspective, we should ask whether the ICC qualifies as a population parameter. I believe this is the case when the ICC is used, e.g., as an estimate of inter-rater reliability. There might be other examples. If the ICC is considered a point estimate in this sense, I believe it should be estimated just like other point estimates in the MI framework: estimate the ICC in each dataset, optionally apply a transformation to make its distribution more symmetric, optionally estimate its standard error, then finally combine the results according to Rubin's rules.
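In symbols, the per-dataset route would look roughly like this, with $\hat\theta_m$ the (possibly transformed, e.g. logit) ICC from imputed dataset $m$, $U_m$ its squared standard error, and $M$ the number of imputations; the notation is mine:

```latex
\bar\theta = \frac{1}{M}\sum_{m=1}^{M}\hat\theta_m, \qquad
\bar U = \frac{1}{M}\sum_{m=1}^{M} U_m, \qquad
B = \frac{1}{M-1}\sum_{m=1}^{M}\bigl(\hat\theta_m - \bar\theta\bigr)^2, \qquad
T = \bar U + \Bigl(1 + \frac{1}{M}\Bigr) B
```

Here $\bar\theta$ is the pooled estimate and $\sqrt{T}$ its standard error; if a transformation was applied, the pooled estimate and interval limits are back-transformed at the end.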

On the other hand, the ICC could probably be considered a descriptive statistic of the estimated model so that there is no underlying population parameter involved. In this case, you could probably simply estimate ICC from the (combined) variances in the final model. However, I wonder what the standard error or CI of the ICC would mean here.

Sticking with the idea of a descriptive statistic of the model, we could go further and consider the between and within variances separately. You could probably estimate an ICC for both if this seems interesting.

The bottom line is that I do not believe there is a straightforward answer here. The ICC seems to represent different things in different (theoretical) frameworks and the most appropriate way to estimate it might really depend on the underlying research questions that we wish to answer.

Best
Daniel

Last edited by daniel klein; 01 Dec 2018, 03:10.