After reading this blog post by Thom Baguley (see also this R code), I tidied up my code from #13 a bit and added the method due to Fieller et al. (1957). I'm posting it in case someone finds it useful.
Code:
* File: CI_for_Spearman_rho.do * Date: 5-Jan-2023 * Name: Bruce Weaver, [email protected] // Modification of Maarten Buis's code in #4 in the Statalist // thread shown in the References below. // Open and prepare some example data sysuse auto, clear keep if !missing(price, weight) egen iprice = rank(price) egen iweight = rank(weight) // For Pearson's r, it is customary to use sqrt1/(n - 3) as the SE // when computing a CI for Zr. But (at least) two other suggestions // have been made when one is computing Spearman's rho. // Fieller et al. (1957) suggested that for |Spearman rho| <= .8, // one should use sqrt(1.06/(n - 3)) as the SE. // Bonett and Wright (2000) say that for absolute values of Spearman // correlations < .95, one should use sqrt((1+r^2/2)/(n-3)) as the SE // of atanh(rs), not sqrt(1/(n-3)), as one does for Pearson r. // Let's compute the 95% CI for Spearman's rho "by hand" using // all 3 of those SEs for Zr. // On row 1, use SEzr = sqrt(1/(n-3)) to match the results from -corrci-. // On row 2, use SEzr = sqrt(1.06/(n - 3)), as suggested by Fieller (1957). // On row 3, use SEzr = sqrt((1+rs^2/2)/(n-3)), as suggested by B&W (2000). quietly { pwcorr iprice iweight // Pearson r on the ranks = Spearman rho generate double zcrit = invnormal(0.975) in 1/3 generate n = r(N) in 1/3 generate double r = r(rho) in 1/3 generate double zr = atanh(r) in 1/3 // Make no adjustment to SEzr in row 1 generate double SEzr = sqrt(1/(n-3)) in 1 // Use Fieller's (1957) correction in row 2 replace SEzr = sqrt(1.06/(n - 3)) in 2 // Use the Bonnet & Write (2000) correction in row 3 replace SEzr = sqrt((1+r^2/2)/(n-3)) in 3 generate double lb = tanh(zr - (zcrit*SEzr)) in 1/3 generate double ub = tanh(zr + (zcrit*SEzr)) in 1/3 generate str25 SE = "sqrt(1/(n-3))" in 1 replace SE = "sqrt(1.06/(n - 3))" in 2 replace SE = "sqrt((1+r^2/2)/(n-3))" in 3 generate str25 Correction = "none" replace Correction = "Fieller (1957)" in 2 replace Correction = "B&W (2000)" in 3 } format r-ub %9.6f list zcrit-Correction in 1/3 // Compare to -corrci- using ranked data corrci iprice iweight matrix list r(lb) matrix list r(ub) // Notice that the CI I computed in row 1 matches the CI obtained via // -corrci-. This confirms that my "hand" calcuations are correct. // check CIs against the bootstrap set seed 1234 bootstrap r(rho), reps(20000) nodots bca : spearman price weight estat bootstrap, bca list zcrit-Correction in 1/3 // Notice that the Bonett & Wright (2000) CI in row 3 // is very similar to the BCa interval. /* References Bonett, D. G., & Wright, T. A. (2000). Sample size requirements for estimating Pearson, Kendall and Spearman correlations. Psychometrika, 65(1), 23-28. Fieller, E. C., Hartley, H. O., & Pearson, E. S. (1957). Tests for rank correlation coefficients: I. Biometrika, 44, 470–481. https://www.youtube.com/watch?v=gKHgCq5E864 https://www.statalist.org/forums/forum/general-stata-discussion/general/1395923-spearman-correlation-95-confidence-interval https://seriousstats.wordpress.com/2020/05/18/cis-for-spearmans-rho-and-kendalls-tau/ https://rpubs.com/seriousstats/616206 */
Comment