
  • Calculating an ICC using an ordinal variable, with one measurement on each of multiple subjects

    Hello Statalist users,

    I am trying to calculate the intracluster correlation coefficient (ICC) for my dataset, but I have run into two obstacles: first, whether I can calculate an ICC using an ordinal variable; second, which command to use when I have one measurement on each of multiple subjects rather than multiple measurements of each subject.

    First, some background. My data include variables on the health behaviours of individuals, but no individual ID or class ID variable (for anonymity). What I do have is a school identifier. So, for example, I have about 30-100 children for each school, and there are about 40 schools in my dataset. I am trying to calculate the ICC using one of the available health behaviour variables, which are all ordinal.

    The data were collected by a data collector assigned to each school class, so I have a scenario of one data collector (who can also be thought of as one measurement) collecting data for several children. I should add that there is more than one data collector in our case, but each child faces only one data collector, who collects the data just once. This differs from the scenario that the Stata <icc> command seems to assume, namely multiple measurements (often called judges in the online Stata examples) of each individual (often called targets in the examples), with the model then selected according to how the judges are drawn.

    Question 1: Can I use the <icc> command when I have only one judge but multiple targets?

    Note: I did try calculating the ICC using the <loneway> command, which seemed to work well with a continuous variable. That brings me to Question 2: does this command also work correctly with an ordinal-scale variable?

    Question 2: I read online that calculating an ICC with an ordinal variable biases the ICC downward, toward zero, and a recent simulation paper suggested that it is possible (https://arxiv.org/pdf/2211.01170v2.pdf). So I was wondering: does anyone know whether, in practice, it is possible to calculate an ICC for an ordinal-scale variable and, if so, what Stata command or code would do it?

    Thank you very much in advance for any suggestions or insights you may have.

    Zoey

    P.S. This is my first Statalist post, so please let me know if there is any information I should have provided but did not. Thanks.

  • #2
    Some additional details that might explain my situation better: my team and I are designing an experiment (RCT) with children in schools. We have a small dataset of nutritional outcomes for school children from another location and time, but from the same population. We are trying to use this dataset to get a sense of how correlated nutritional outcomes (e.g. fruit and vegetable intake) are among children in the same school, so that we can better judge the sample size we will need for sufficient power.



    • #3
      From your description, I assume that your cluster is school. Is that what you used for loneway?

      Anyway, you can get an intraclass correlation coefficient (ICC) on ordered-categorical responses probably most easily by fitting a random effects ordered-probit regression model to them. I show an example below of the basic approach, assuming that school is your clustering variable of interest. Begin at the "Begin here" comment; the top part is just to create a toy dataset for illustration. In the example, pupils are not identified, and the school ID is sid.
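
      As a sketch of what such a model assumes (the symbols are notation introduced here for illustration and do not appear in the Stata output), a random-intercept ordered-probit model for pupil i in school j with K ordered categories can be written as:

      ```latex
      \begin{aligned}
      y^{*}_{ij} &= u_j + \varepsilon_{ij}, \qquad
        u_j \sim N(0, \sigma_u^{2}), \qquad
        \varepsilon_{ij} \sim N(0, 1), \\
      y_{ij} &= k \quad \text{if } \kappa_{k-1} < y^{*}_{ij} \le \kappa_{k},
        \qquad k = 1, \dots, K,
      \end{aligned}
      \qquad \kappa_0 = -\infty, \; \kappa_K = +\infty .
      ```

      Under this reading, the cut-points \kappa_1, ..., \kappa_{K-1} correspond to the /cut parameters in the output, and \sigma_u^2 corresponds to the school-level var(_cons).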

      .
      . version 17.0

      .
      . clear *

      .
      . // seedem
      . set seed 535089684

      .
      . // Schools
      . quietly set obs 40

      . generate byte sid = _n

      . generate double sid_u = rnormal()

      .
      . // Pupils (no IDs)
      . quietly expand runiformint(30, 100)

      .
      . // Ordered-categorical outcome measurements
      . generate double lat = rnormal(sid_u, 1) // ICC = 0.5

      . egen byte out = cut(lat), group(5) // Five levels

      .
      . *
      . * Begin here
      . *
      . meoprobit out || sid: , nolog

      Mixed-effects oprobit regression                Number of obs     =      2,546
      Group variable: sid                             Number of groups  =         40

                                                      Obs per group:
                                                                    min =         31
                                                                    avg =       63.6
                                                                    max =         92

      Integration method: mvaghermite                 Integration pts.  =          7

                                                       chi2()           =          .
      Log likelihood = -3355.0166                     Prob > chi2       =          .
      ------------------------------------------------------------------------------
               out | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
      -------------+----------------------------------------------------------------
             /cut1 |  -1.202417   .1719711                     -1.539474   -.8653594
             /cut2 |  -.3099973    .170832                     -.6448219    .0248274
             /cut3 |   .4443522   .1708686                      .1094559    .7792485
             /cut4 |   1.273486   .1718917                      .9365845    1.610388
      -------------+----------------------------------------------------------------
      sid          |
         var(_cons)|   1.129716   .2603299                      .7191507    1.774675
      ------------------------------------------------------------------------------
      LR test vs. oprobit model: chibar2(01) = 1485.22      Prob >= chibar2 = 0.0000

      .
      . // ICC is given below
      . estat icc

      Residual intraclass correlation

      ------------------------------------------------------------------------------
                             Level |        ICC   Std. err.     [95% conf. interval]
      -----------------------------+------------------------------------------------
                               sid |   .5304539   .0573959      .4183174    .6395974
      ------------------------------------------------------------------------------

      .
      . exit

      end of do-file


      The estimated ICC (0.53, against the 0.5 built into the toy data) doesn't seem particularly biased low, but if you have a concern, then you can probably get a handle on how large any bias is under your circumstances via simulation.
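
      For reference, the reported ICC can be reproduced by hand from the variance component in the output. This is a sketch that assumes the usual ordered-probit convention of fixing the latent residual variance at 1; the symbols \rho and \sigma_u^2 are notation introduced here, with \sigma_u^2 standing for var(_cons):

      ```latex
      \rho \;=\; \frac{\sigma_u^{2}}{\sigma_u^{2} + 1}
           \;=\; \frac{1.129716}{1.129716 + 1}
           \;\approx\; 0.5305,
      \qquad
      \rho_{\text{toy data}} \;=\; \frac{1}{1 + 1} \;=\; 0.5 .
      ```

      The first value agrees with the .5304539 shown by estat icc. The second is the latent-scale ICC built into the toy dataset, where both sid_u and the pupil-level noise in lat have unit variance. Note that this ICC is on the latent-response scale, not the scale of the observed categories.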
