Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • #16
    After reading this blog post by Thom Baguley (see also this R code), I tidied up my code from #13 a bit and added the method due to Fieller et al. (1957). I'm posting it in case someone finds it useful.

    Code:
    * File:  CI_for_Spearman_rho.do
    * Date:  5-Jan-2023
    * Name:  Bruce Weaver, [email protected]
    
    // Modification of Maarten Buis's code in #4 in the Statalist
    // thread shown in the References below.
    // Open and prepare some example data
    sysuse auto, clear
    keep if !missing(price, weight)
    egen iprice = rank(price)
    egen iweight = rank(weight)
    
    // For Pearson's r, it is customary to use sqrt1/(n - 3) as the SE
    // when computing a CI for Zr.  But (at least) two other suggestions
    // have been made when one is computing Spearman's rho.
    
    // Fieller et al. (1957) suggested that for |Spearman rho| <= .8,
    // one should use sqrt(1.06/(n - 3)) as the SE.
    
    // Bonett and Wright (2000) say that for absolute values of Spearman
    // correlations < .95, one should use sqrt((1+r^2/2)/(n-3)) as the SE
    // of atanh(rs), not sqrt(1/(n-3)), as one does for Pearson r.
    
    // Let's compute the 95% CI for Spearman's rho "by hand" using
    // all 3 of those SEs for Zr.
    // On row 1, use SEzr = sqrt(1/(n-3)) to match the results from -corrci-.
    // On row 2, use SEzr = sqrt(1.06/(n - 3)), as suggested by Fieller (1957).
    // On row 3, use SEzr = sqrt((1+rs^2/2)/(n-3)), as suggested by B&W (2000).
    
    quietly {
    pwcorr iprice iweight // Pearson r on the ranks = Spearman rho
    generate double zcrit = invnormal(0.975) in 1/3
    generate n = r(N) in 1/3
    generate double r = r(rho) in 1/3
    generate double zr = atanh(r) in 1/3
    // Make no adjustment to SEzr in row 1
    generate double SEzr = sqrt(1/(n-3)) in 1
    // Use Fieller's (1957) correction in row 2
    replace SEzr = sqrt(1.06/(n - 3)) in 2
    // Use the Bonnet & Write (2000) correction in row 3
    replace SEzr = sqrt((1+r^2/2)/(n-3)) in 3
    generate double lb = tanh(zr - (zcrit*SEzr)) in 1/3
    generate double ub = tanh(zr + (zcrit*SEzr)) in 1/3
    generate str25 SE = "sqrt(1/(n-3))" in 1
    replace SE = "sqrt(1.06/(n - 3))" in 2
    replace SE = "sqrt((1+r^2/2)/(n-3))" in 3
    generate str25 Correction = "none"
    replace Correction = "Fieller (1957)" in 2
    replace Correction = "B&W (2000)" in 3
    }
    
    format r-ub %9.6f
    list zcrit-Correction in 1/3
     
    // Compare to -corrci- using ranked data
    corrci iprice iweight
    matrix list r(lb)
    matrix list r(ub)
    // Notice that the CI I computed in row 1 matches the CI obtained via
    // -corrci-.  This confirms that my "hand" calcuations are correct.
    
    // check CIs against the bootstrap
    set seed 1234
    bootstrap r(rho), reps(20000) nodots bca : spearman price weight
    estat bootstrap, bca
    list zcrit-Correction in 1/3
    // Notice that the Bonett & Wright (2000) CI in row 3
    // is very similar to the BCa interval.
    
    /*
    References
    
    Bonett, D. G., & Wright, T. A. (2000). Sample size requirements
      for estimating Pearson, Kendall and Spearman correlations.
      Psychometrika, 65(1), 23-28.
     
    Fieller, E. C., Hartley, H. O., & Pearson, E. S. (1957). Tests for
      rank correlation coefficients: I. Biometrika, 44, 470–481.
     
    https://www.youtube.com/watch?v=gKHgCq5E864
    
    https://www.statalist.org/forums/forum/general-stata-discussion/general/1395923-spearman-correlation-95-confidence-interval
    
    https://seriousstats.wordpress.com/2020/05/18/cis-for-spearmans-rho-and-kendalls-tau/
    
    https://rpubs.com/seriousstats/616206
    
    */
    --
    Bruce Weaver
    Email: [email protected]
    Version: Stata/MP 18.5 (Windows)

    Comment

    Working...
    X