Hi all,
I’m looking for advice on how to quantify discordance between a continuous variable and its categorical classification.
For example, I have a continuous variable "height" and a categorical variable "heightclass" with three groups: SHA (shorter than average), AHA (average), and THA (taller than average). The categories are derived from an index taking other factors like age into account. The categories are meant to reflect ordinal differences (THA > AHA > SHA).
While the groups differ in height as expected, I want to focus on the exceptions. For instance, someone who is 163 cm tall may be classified as SHA, while someone shorter, at 158 cm, is classified as AHA. I’d like to systematically measure how often this happens.
Is there a way in Stata to quantify these misclassifications/overlaps? For example, by comparing the observed ranking of height with the assigned heightclass and calculating the proportion of discordant cases?
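To make concrete what I mean by "proportion of discordant cases", here is a minimal sketch of the calculation I have in mind. It is written in Python rather than Stata, purely to illustrate the logic. The function name, the numeric coding of the classes, and the data are all made up for this example. It compares every pair of people from different classes and counts the pairs where the person in the lower class is taller; pairs tied on class or on height are skipped, which is my own assumption about how ties should be handled.

```python
from itertools import combinations

# Hypothetical numeric coding of the ordinal classes: SHA < AHA < THA
CLASS_RANK = {"SHA": 0, "AHA": 1, "THA": 2}

def discordant_share(heights, classes):
    """Share of cross-class pairs in which the person in the lower class is taller."""
    discordant = 0
    comparable = 0
    for (h1, c1), (h2, c2) in combinations(zip(heights, classes), 2):
        r1, r2 = CLASS_RANK[c1], CLASS_RANK[c2]
        if r1 == r2 or h1 == h2:
            continue  # assumption: skip pairs tied on class or on height
        comparable += 1
        # Opposite signs mean height and class order the pair differently
        if (h1 - h2) * (r1 - r2) < 0:
            discordant += 1
    return discordant / comparable if comparable else float("nan")

# Made-up illustrative data, not real measurements
heights = [163, 158, 170, 175, 160]
classes = ["SHA", "AHA", "AHA", "THA", "SHA"]

share = discordant_share(heights, classes)
print(share)  # 2 discordant out of 8 comparable pairs -> 0.25
```

If I understand correctly, with ties excluded this way the share equals (1 - gamma) / 2, where gamma is Goodman and Kruskal's gamma, so a Stata command that reports gamma or a related rank statistic might already give me what I need.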
I hope my question is clear. Please feel free to ask for clarification.