P-value for difference in rates/counts

Sara Hansen

Join Date: Apr 2022
Posts: 30

30 Jan 2023, 03:39

Dear all,

I have a dataset with the following categorical/binary variables: doctor (1, 2 or 3), disease (0/1), doctor_evaluation (0/1), type of disease (category 1-4) and time of disease (category 1-3).
I want to compare how many cases each doctor evaluates as disease, and to make the same comparison within the subgroups, i.e. type of disease and time of disease. Doctor1 is my reference, i.e. I want to compare doctor2 and doctor3 to doctor1.
I have made the following descriptive statistics:

	Doc1	Doc2	Doc3
Disease (n=1000)	700	700	650
Disease subtype
- 1 (n=200)	100	150	50
- 2 (n=500)	400	200	300
- 3 (n=300)	200	350	300
Time of disease
- 0 (n=100)	50	0	50
- 1 (n=900)	650	700	600

My question is: how do I calculate if there is a statistical difference between doctor1 and doctor2, and doctor1 and doctor3, respectively?

I started with McNemar’s test, but got p<0.001 for the difference between doctor1 disease (n=700) and doctor2 disease (n=700). This is because their distributions are very different, i.e. it is not the same 700 disease cases they have found, and they also have very different distributions of evaluation=non-disease.
What I would expect is a p-value close to 1, since there obviously is no difference between the numbers 700 and 700.

So, I know that it is not a test for difference in proportions I need, but instead a difference between the actual numbers/rates.

I hope you can help,

Thank you for your time.


Sara Hansen

Join Date: Apr 2022

Posts: 30
#2

30 Jan 2023, 04:20

Ah, I see that McNemar's test can indeed be used. In my case, for disease as evaluated by doctor1 and doctor2, I get the following contingency table:

500 200
200 100

So, with H0: the two doctors' answers identify diseases correctly at the same rate (i.e. the contingency table is symmetric), I get p=1.0 and therefore cannot reject H0, since the contingency table is perfectly symmetric.

So it is indeed the proportions I am interested in, and testing, but in my case the proportions are the same; doctor1 and doctor2 identify the same number of diseases (and thus misidentify the same number), i.e. their proportions are the same.
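
For readers who want to reproduce the p=1.0 result, here is a minimal sketch using Stata's immediate command -mcci-, which takes the four cell counts directly. The row/column layout (rows = doctor1, columns = doctor2, disease first) is an assumption, since the post does not label the table; -mcci- labels its output in case-control terms (exposed/unexposed), which can be ignored here.

```stata
* Sketch only: cell counts a b c d entered row by row.
* Assumed layout: rows    = doctor1 (disease, no disease)
*                 columns = doctor2 (disease, no disease)
mcci 500 200 200 100

* McNemar's chi2 uses only the discordant cells b and c:
* (b - c)^2 / (b + c) = (200 - 200)^2 / (200 + 200) = 0
display (200 - 200)^2 / (200 + 200)
```

Because the two discordant cells are equal (200 and 200), the test statistic is 0, which corresponds to p=1.0.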
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17714
#3

30 Jan 2023, 05:08

Sara:
I'd say that you're interested in the proportion of disease/non-disease each physician (correctly) identified as such (and the like):

Code:

logit disease i.subtype i.doctor i.time

I would check if -logit- (as per the aforementioned code) and related postestimation tests are what you're actually looking for.
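
As a rough sketch of what such postestimation tests could look like (not necessarily the ones intended above), the commands below compare doctor2 and doctor3 with the reference doctor1 after the same model. It assumes the data are in long format with the variable names used in the code above. The clustering variable patient_id in the last comment is hypothetical and would only apply if the same patients were evaluated by all three doctors.

```stata
* Sketch only: same model as above, doctor 1 is the base level of i.doctor
logit disease i.subtype i.doctor i.time

* Joint Wald test that the doctor 2 and doctor 3 coefficients are both zero
testparm i.doctor

* Each doctor against the reference, on the log-odds scale
lincom 2.doctor
lincom 3.doctor

* Adjusted predicted probabilities per doctor
margins doctor

* Contrasts of those probabilities with the reference doctor
margins r.doctor

* Assumption: if the same patients are rated by every doctor, one option is
* cluster-robust standard errors on a (hypothetical) patient identifier:
* logit disease i.subtype i.doctor i.time, vce(cluster patient_id)
```

Running the same commands with an -if- restriction on subtype or time would be one way to repeat the comparison within subgroups.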

Kind regards,
Carlo
(Stata 19.0)