  • Kappa values in STATA

    Dear Statalist community,

    I've been a long-time reader and for the first time saw the need to create a profile.

    I'm relatively new to the Kappa statistic in STATA. I've read the relevant documentation and practiced with it, and I hope you can help me with the following.
    Please note: I've simplified the variable names and values to keep the explanation of the problem as simple as possible. I use the car metaphor since this is a commonly used dataset in STATA:

    For a study we asked 101 participants whether characteristics of a car would make it less likely (score = -1) or more likely (score = +1) for them to buy the car.
    If a characteristic had no influence on their decision to buy the car, it was scored '0'.

    The data looks like this:

    v1 = observer 1 and this continues all the way to v101.
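
    To make the layout a bit clearer, here is a rough, made-up sketch of the structure (not my real data; the real file has 101 observer columns, and the characteristic names are just examples):

    Code:
    characteristic    v1   v2   ...  v101
    red                1    0   ...    -1
    truck             -1    0   ...     1
    blue               0    1   ...     0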

    Now when I run kap v1-v101, the output I receive is shown in the attached screenshot.


    My first question:
    The way I interpret this is that these 101 observers have 1.3% agreement on which factors they consider to make it less likely to buy a car, 23% agreement on factors that they don't care about when purchasing, and 27% on factors that they think make a car more attractive to buy (all low agreement).
    However, looking at all the factors together, they seem to agree in 24% of the cases.
    Am I interpreting this correctly? I don't know why, but this just seems confusing to me.

    My second question:
    To see whether I understood this analysis, I dropped all variables and just looked at the car being red.
    When I run the kap v1-v101 analysis, the output I get is again shown in a screenshot.

    However, when I tried three other consecutive variables, I got the exact same values.
    How can that be?

    Thank you in advance for your time and thanks for being such a great community.

    Best,
    Levent

  • #2
    Levent, please note that examples are better shown using dataex (from SSC) and CODE delimiters (as below) than by using screenshots. Review the FAQ again, as it gives lots of helpful advice for posting good questions.

    Concerning your first question, it is not entirely clear from your description whether your understanding is correct. The individual kappa values are obtained by comparing one category against all others. Here is a demonstration:

    Code:
    . webuse p615b
    
    . kap rater1-rater5
    
    There are 5 raters per subject:
    
             Outcome |    Kappa          Z     Prob>Z
    -----------------+-------------------------------
                   1 |    0.2917       2.92    0.0018
                   2 |    0.6711       6.71    0.0000
                   3 |    0.3490       3.49    0.0002
    -----------------+-------------------------------
            combined |    0.4179       5.83    0.0000
    
    . 
    . // replicate first individual kappa
    . recode rater1-rater5 (2 3 = 0) , prefix(a_)
    (2 differences between rater1 and a_rater1)
    (5 differences between rater2 and a_rater2)
    (6 differences between rater3 and a_rater3)
    (8 differences between rater4 and a_rater4)
    (9 differences between rater5 and a_rater5)
    
    . kap a_rater1-a_rater5
    
    There are 5 raters per subject:
    
    Two-outcomes, multiple raters:
    
             Kappa        Z        Prob>Z
            -----------------------------
            0.2917       2.92      0.0018

    Your second question is hard to understand. You say you dropped all variables, but the variables (columns) represent the observers' ratings, so if you dropped them all there would be no ratings left to base the calculations on. What you probably meant to say is that you dropped all observations (rows) that indicate something other than a red car. In that case the kappa value is based on only one subject.

    While it is mathematically possible to compute this, it probably makes little sense from a theoretical point of view. Given only one subject, the observers cannot demonstrate their ability to differentiate between subjects. We can therefore not know whether the observers agree (or disagree) because of something related to this one subject or because of some unmeasured factor associated with the observers. One might even argue that kap should not give any output in this situation.
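
    If you want to see this degenerate case for yourself, you could keep a single subject in the example dataset above and rerun kap. A minimal sketch (I have not shown the output, which is of questionable value anyway):

    Code:
    . webuse p615b , clear

    . keep in 1

    . kap rater1-rater5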

    I cannot judge whether kappa is an appropriate statistic for answering your research question based on your description. If it is, you may want to have a look at kappaetc (SSC) for alternative/additional related statistics.
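
    As a minimal sketch with the example data from above (assuming kappaetc has been installed from SSC; output not shown here):

    Code:
    . ssc install kappaetc

    . webuse p615b , clear

    . kappaetc rater1-rater5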

    Best
    Daniel



    • #3
      Hi Daniel,

      Thank you for your reply.

      I'll make sure to use the appropriate format in the future instead of screenshots; I must admit that I probably haven't spent enough time in the FAQ section here.

      I realize that the categories are compared to each other, for each category individually. My apologies for the misunderstanding; I did mean that I dropped the observations, not the variables.
      Based on your answer, though, I'm starting to question whether I'm getting this right.

      Why would the statistic not be able to provide a value for this one subject? In practice, can 101 people not have a percentage of agreement on whether the car's color contributes to their decision to buy it?
      If not, then what does the kappa mean in my example with all the rows intact? Does that mean that there's 24% agreement on the influence of the red color, the car being a station wagon, the car being a truck, or the car being blue, on the likelihood of purchasing this car?

      I'll look into the alternative/additional statistics. All I wanted was to report a percentage of agreement for the influence of each characteristic (red, truck, etc.) on the decision to buy the car or not.
      I imagined a table with each of these characteristics, the kappa, and the p-value, but perhaps it's not the right test after all.

      Thanks again,
      Levent



      • #4
        Why would the statistic not be able to provide a value for this one subject? In practice, can 101 people not have a percentage of agreement on whether the car's color contributes to their decision to buy it?
        They could. But you would have no empirical evidence to support the claim that the agreement actually has something to do with the car's color (or anything else related to the car). Maybe the people would answer the same way regardless of which car or color or whatever they are asked to judge. Since you did not observe their ratings of at least one other (and different) car, you cannot say anything about this. This is reflected by the low kappa value. Note that this is only one perspective, not necessarily the one correct answer.
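
        For intuition: kappa-type statistics correct the observed agreement for the agreement expected by chance alone, roughly

        \kappa = \frac{p_o - p_e}{1 - p_e}

        where p_o is the observed proportion of agreement and p_e the proportion expected by chance. Observers can therefore agree on many ratings in absolute terms and still end up with a low kappa.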

        [...] what does the kappa mean in my example with all the rows intact? Does that mean that there's 24% agreement on the influence of the red color, the car being a station wagon, the car being a truck, or the car being blue, on the likelihood of purchasing this car?
        Usually the observers rate the subjects with respect to attributes of these subjects. In your situation the raters instead make a statement about their preferences, not necessarily about the subjects - at least not about a theoretically objective aspect of the subjects. I guess you could still use kappa or related statistics to assess agreement, but you need to think carefully about what agreement means. Your interpretation is probably in the right direction.

        Best
        Daniel
        Last edited by daniel klein; 01 Feb 2017, 14:58. Reason: deleted a misleading example



        • #5
          Thank you again for your thoughts Daniel.

          I'll try to read more about it over the weekend, experiment with it next week, and I hope to come back either with an answer so others can learn from this, or with a more refined question.

          Thank you,
          Levent
