Dear Daniel,
Many thanks for the swift and helpful reply.
Sorry for the inconvenience.
Best regards,
/Christian
kap newph1_nausea_d newph1_nausea, wgt(w)
kappaetc newph1_nausea_d newph1_nausea, wgt(w)
// We need example data
*clear // <- uncommenting this line will clear the data in memory
webuse rate2 // <- example data
// -- Run -kappaetc-
kappaetc rada radb , wgt(power 8)
// Above, we use ridiculous weights to get high coefficients
// -- Now the benchmarks
kappaetc , benchmark largesample
/*
Above, we specify -largesample- so -kappaetc- uses the standard normal.
This is what K. Gwet suggests to do. I found it a bit inconsistent to
use the t-distribution for confidence intervals (with default
standard errors) but then switch to the standard normal for
benchmarking. Therefore, -kappaetc- would normally use the
t-distribution for benchmarking. Anyway, I have implemented the
workaround code below in terms of the standard normal, so we are
using it here.
*/
// -- Calculate rescaled benchmark intervals
mata {
b = st_matrix("r(b)")
se = st_matrix("r(se)")
trunc = normal((b+(0, J(1, 5, 1))):/se)-normal((b-(1, J(1, 5, 1))):/se)
st_matrix("trunc", trunc)
st_matrixcolstripe("trunc", st_matrixcolstripe("r(b)"))
st_matrix("p_cum_trunc", st_matrix("r(p_cum)"):/trunc)
st_matrixcolstripe("p_cum_trunc", st_matrixcolstripe("r(b)"))
}
// Below are the original IMPs and cumulative IMPs
// Note: r1 is the highest benchmark level, r6 the lowest
matlist r(imp)
matlist r(p_cum)
/*
See where the problem comes from? All coefficients have a 90 percent
chance (or higher) of falling into the highest interval. The
probability of each coefficient falling below the highest interval is
near 0. This is because there is (mathematically) a probability of
~ 10 percent that the coefficients exceed 1.
OK, let's fix this.
Below are the truncated cumulative probabilities for the interval
[-1; 1]. These probabilities represent the coefficient-specific
upper bounds. Because percent agreement cannot be below 0, we define
the respective interval as [0; 1]. To be honest, I do not know
whether benchmarking the percent agreement makes sense; I
doubt it but I include the interval for consistency.
*/
matlist trunc
/*
Finally, we have the rescaled cumulative probabilities below.
Theoretically, these sum to 1 (and they do). The reason for the
probability associated with the lowest interval (r6) being off
is a technical flaw in the workaround: As you can see from the
original IMPs, -kappaetc- fixes the lower bounds at 1 before
summing from the top. Because the workaround does not recalculate
the sum, the lowest interval now has probability 1/tCMP, with
tCMP := truncated cumulative probability. Had we used the actual
sum, the lowest category would also be 1.
*/
matlist p_cum_trunc // <- rescaled cumulative probabilities
/*
You will have to select the interval yourself. Pick the first one,
starting from r1, that exceeds the threshold (usually 95%).
If you cannot remember them, you can see the (upper) interval
limits in r(benchmarks).
*/
matlist r(benchmarks)
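If you would rather not pick the interval by eye, the selection rule described in the last comment could be automated along the following lines. This is only a rough sketch, not part of -kappaetc-: it assumes the matrix p_cum_trunc created above is still in memory, with benchmark levels in rows (r1 = highest) and coefficients in columns. The names `pick` and `threshold` are made up for illustration.

```stata
// Sketch: for each coefficient, find the first benchmark row (starting
// from r1) whose rescaled cumulative probability exceeds the threshold.
mata:
P = st_matrix("p_cum_trunc")   // rows: benchmark levels, cols: coefficients
threshold = 0.95               // hypothetical name; usual 95% threshold
pick = J(1, cols(P), .)        // hypothetical name; row index per coefficient
for (j = 1; j <= cols(P); j++) {
    for (i = 1; i <= rows(P); i++) {
        // skip missing values; they compare as larger than any number
        if (P[i, j] < . & P[i, j] > threshold) {
            pick[j] = i
            break
        }
    }
}
st_matrix("pick", pick)
st_matrixcolstripe("pick", st_matrixcolstripe("p_cum_trunc"))
end
matlist pick   // row number of the selected interval, per coefficient
```

An entry of 1 in pick means the highest benchmark level, matching the r1 to r6 ordering above; a missing entry would mean that no row exceeded the threshold.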
. which kappaetc
c:\ado\plus\k\kappaetc.ado
*! version 2.1.0 11aug2022 daniel klein