Assuming ICC

Anika Zaman

Join Date: Aug 2022

Posts: 3
#1

25 Aug 2022, 01:09

Hello everyone, this is my first post here. I need an ICC for a power calculation. My outcome variables are test scores and students' effort to study. Can you point me to any source for an assumed ICC? I have been told to look at a similar study, so I have been checking the data from one. I am trying to find the intraclass correlation for test scores and for student effort separately, not between them. I have the test scores and student IDs, and students are clustered by village ID. I also have 2 treatment groups and 1 control. I have tried the Stata command, but I can't work out who my rater/target is, and I'm not sure whether to pick one-way or two-way. It is a randomised controlled trial. For the power calculation I should be using:

power twomeans 0, cluster k1(75) k2(75) m1(15) m2(15) power(.8 .9) rho()

I need to assume an ICC/rho. Thanks.
Clyde Schechter

Join Date: Apr 2014

Posts: 30358
#2

25 Aug 2022, 12:22

Yes, this is a dilemma that is often faced when designing a trial with multi-level data. I can tell you from long and painful experience with this that, despite doing your best, you likely will end up with a useless power analysis. Going to the literature to find studies using similar variables is, frankly, a fool's errand. First, very few studies actually publish the ICC they achieved. Second, even if you find one that did, unless they are studying the same outcome and explanatory variables you are and modeling them the same way, the ICC they found may have little or nothing to do with the ICC you will attain in your study. Finally, my experience is that even if you can find a published study that is an exact match, ICCs generally seem to vary wildly in replications of the same study, even when based on large samples.

Consequently, I would never take seriously any power/sample-size/effect-size analysis based on a single assumed value of ICC. I think you need to assume a range of values that seems plausible (in light of whatever past studies you can find plus any other information you might know about the constructs involved) and present multiple corresponding power/sample-size/effect-size analyses. So you can end up saying things like "our expected case scenario will provide us with a minimum detectable effect size of xxx, based on an ICC of yyy. Under unfavorable circumstances, with ICC = zzz, the minimum detectable effect size will be www; and under an optimistic scenario with ICC = vvv, the minimum detectable effect size decreases to uuu."

Probably for your expected case scenario you can do something with the data you currently have.* For example:

Code:

mixed test_score i.treatment_group || village_id:
estat icc

and use that result as your base case. Then allow a generous margin around that for optimistic and pessimistic scenarios. Similarly for the effort measurement.

*Note: I'm assuming here that the data you are working with is some kind of preliminary data, not the actual complete study data. If the data you have is the complete data from your study, it is generally understood that it is not appropriate to do a post-hoc power analysis based on that data.
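
To make the scenario approach concrete, here is a minimal sketch that reuses the cluster counts and cluster sizes from #1 and solves for the minimum detectable effect over several values of rho. The three ICC values are placeholders chosen purely for illustration, not estimates from any data. The sketch also relies on the default standard deviation of 1, so the reported effects would be in standard-deviation units.

Code:

* Placeholder ICCs for optimistic, expected, and pessimistic scenarios (assumed values)
power twomeans 0, cluster k1(75) k2(75) m1(15) m2(15) power(.8 .9) rho(0.05 0.10 0.20)

Each row of the resulting table pairs one power level with one assumed ICC, which is the material needed for the "expected / unfavorable / optimistic" statements described above.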
Anika Zaman

Join Date: Aug 2022

Posts: 3
#3

18 Sep 2022, 20:13

Thanks a lot for your kind help. I have noted your suggestions.
Adrien Bouguen

Join Date: Jul 2014

Posts: 88
#4

14 Nov 2025, 12:43

Hi,

I have a similar need to calculate an ICC. I have been using loneway for years, but loneway does not allow conditioning on the stratification variables that are common when designing an experiment. I have tried to use a mixed model as Clyde suggested, but I was surprised to see slightly different results from the mixed and loneway commands.

When I compare results from (using Anika's variables):

Code:

mixed test_score || village_id:
estat icc

and from

Code:

loneway test_score village_id

I get slightly different results. Do you know why and which one is more appropriate for a power calculation? Thanks
Clyde Schechter

Join Date: Apr 2014

Posts: 30358
#5

14 Nov 2025, 14:07

If you write a model equation for both commands, they appear to be the same: test_score = grand_mean + u_{village_id} + e_{village_id, repeat_factor}.

But the equation hides a major difference in approach to the model fit. With -loneway- you are fitting a fixed-effects model. That means that u_{village_id} is estimated by least squares separately for each village id, and the distribution of those u's might be anything at all. With -mixed-, a random effects model is fitted. The distribution of the u_{village_id}'s is assumed to be normal, with zero mean, and maximum likelihood is used to estimate its variance. There is no attempt to actually estimate the individual u's in this model. So the results of -loneway- and -mixed- will usually differ at least a little. In principle, they could even differ greatly if the actual u-distribution is very far from normal.
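
One way to see how much the two estimates diverge is to run both commands on simulated data where the true ICC is known. The sketch below is only an illustration under assumed values: 75 villages of 15 students (borrowed from #1), a village-level SD of 0.5, and a residual SD of 1, giving a true ICC of 0.25 / 1.25 = 0.2. The seed and the helper variable u are arbitrary. The last fit adds the reml option, which may be worth trying to check whether the estimation method accounts for part of the gap.

Code:

* Simulated two-level data; all parameter values are assumptions for illustration
clear
set seed 12345
set obs 75
generate village_id = _n
* village-level effect, SD 0.5
generate u = rnormal(0, 0.5)
* 15 students per village
expand 15
* residual SD 1
generate test_score = u + rnormal(0, 1)

* ICC as reported by loneway
loneway test_score village_id

* ICC from the default maximum likelihood fit
mixed test_score || village_id:
estat icc

* ICC from a restricted maximum likelihood fit
mixed test_score || village_id:, reml
estat icc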

As for power calculation, you should first settle on the analysis that you are going to use for your hypothesis test. Once you have decided on that, the power calculations should be done using the same model as that underlying the chosen analysis.
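
As a rough sketch of what that could look like when the planned analysis is a mixed model that adjusts for stratification: fit the model with the stratification variable as a fixed effect, take the conditional ICC from estat icc, and pass it to power. The variable name strata is hypothetical (it does not appear in this thread), and the cluster counts and sizes are the ones from #1. The resulting value should be treated as a base case around which a range is explored, not as a single definitive input.

Code:

* strata is a hypothetical stratification variable, not a name from the thread
mixed test_score i.strata || village_id:
estat icc

* for a two-level model, estat icc leaves the estimate in r(icc2)
local rho = r(icc2)

* minimum detectable effect at the estimated conditional ICC (default sd of 1)
power twomeans 0, cluster k1(75) k2(75) m1(15) m2(15) power(.8 .9) rho(`rho')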