  • P-value for difference in rates/counts

    Dear all,

    I have a dataset with the following categorical/binary variables: doctor (1, 2 or 3), disease (0/1), doctor_evaluation (0/1), type of disease (category 1-4) and time of disease (category 1-3).
    I want to compare how many diseases the different doctors evaluate as diseases, and also compare on the subgroups, i.e. type of disease and time of disease. Doctor1 is my reference, i.e. I want to compare doctor2 and doctor3 to doctor1.
    I have made the following descriptive statistics:
                          Doc1   Doc2   Doc3
    Disease (n=1000)       700    700    650
    Disease subtype
    - 1 (n=200)            100    150     50
    - 2 (n=500)            400    200    300
    - 3 (n=300)            200    350    300
    Time of disease
    - 0 (n=100)             50      0     50
    - 1 (n=900)            650    700    600

    My question is: how do I test whether there is a statistically significant difference between doctor1 and doctor2, and between doctor1 and doctor3, respectively?

    I started with McNemar's test, but got p<0.001 for the difference between doctor1 disease (n=700) and doctor2 disease (n=700). This is because their distributions are very different, i.e. it is not the same 700 disease cases they have found, and they also have very different distributions of evaluation=non-disease.
    What I would expect is a p-value close to 1, since there is obviously no difference between the numbers 700 and 700.

    So I know that what I need is not a test for a difference in proportions, but a test for a difference between the actual numbers/rates.



    I hope you can help,

    Thank you for your time.

  • #2
    Ah, I see that McNemar's test can indeed be used. In my case, for disease as evaluated by doctor1 and doctor2, I get the following contingency table:
    500 200
    200 100

    So, with H0: the two doctors identify diseases correctly at the same rate (the contingency table is symmetric), I would get p=1.0 and therefore not reject H0, since the contingency table is symmetric.

    So it is indeed the proportions I am interested in, and testing, but in my case the proportions are the same: doctor1 and doctor2 identify the same number of diseases (and thus misidentify the same number), i.e. their proportions are the same.
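
    As a quick check of the table above, a minimal sketch using Stata's immediate command -mcci- might look like this. It assumes the four cells are entered row by row (500 200 200 100); because the two discordant cells are equal, the cell order does not affect the result here.

    ```stata
    * McNemar's test from the 2x2 table of paired evaluations (doctor1 vs doctor2)
    * Cells entered row by row: 500 200 / 200 100
    mcci 500 200 200 100
    ```

    With discordant cells of 200 and 200, the McNemar statistic (200-200)^2/(200+200) is 0, so the test gives no evidence against H0.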



    • #3
      Sara:
      I'd say that you're interested in the proportion of disease/non-disease each physician (correctly) identified as such (and the like):
      Code:
      logit disease i.subtype i.doctor i.time
      I would check if -logit- (as per the aforementioned code) and related postestimation tests are what you're actually looking for.
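
      A hedged sketch of what such postestimation tests could look like follows. It assumes the data are in memory in long format (one row per patient-by-doctor evaluation) with the variable names used in the code above, and that doctor1 is the lowest-coded level and therefore the base category.

      ```stata
      * Same model as above; doctor 1 is the base level by default
      logit disease i.subtype i.doctor i.time

      * Joint test that all doctor coefficients are zero
      testparm i.doctor

      * Reference-category contrasts: doctor 2 vs doctor 1, doctor 3 vs doctor 1
      contrast r.doctor

      * Average predicted probability of the outcome for each doctor
      margins doctor
      ```

      The -contrast r.doctor- line is the one that maps onto the doctor2-vs-doctor1 and doctor3-vs-doctor1 comparisons asked about in the first post.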
      Kind regards,
      Carlo
      (Stata 19.0)
