95% Confidence Intervals after Canonical Correlation Analysis

Gianfranco Di Gennaro

25 Mar 2026, 16:34

Dear all,

I am performing a canonical correlation analysis in Stata using two sets of variables:

Code:

canon (Lacunes Fazekasscale PerivascularspacesMB PerivascularspacesBG ///
       PerivascularSpacesSC Microbleeds CorticalSiderosis Atrophy) ///
      (TMTav MassetermuclevolumeAVERAGEc Massetermuclefatinfiltration ///
       Tonguevolumecm TonguefatinfiltrationMercuri Subcutaneousfatthicknessmm)

The first canonical correlation is 0.5112 (see the output below).

Code:

Canonical correlation analysis                      Number of obs =         71

Raw coefficients for the first variable set

                 |        1         2         3         4         5         6 
    -------------+------------------------------------------------------------
         Lacunes |  -1.3263    1.4417   -1.9829    1.0260   -0.3856   -0.6729 
    Fazekasscale |  -0.1531   -0.8536    1.0580    0.7955    0.2851    0.6343 
    Perivascul~B |  -0.5809    0.5796    1.2162    0.3419    1.2116   -0.8161 
    Perivascul~G |   0.8922    0.3323   -0.1177    0.6821   -0.5203   -0.1034 
    Perivascul~C |   0.6635   -0.4927   -0.3623   -0.1896    0.5910   -0.0053 
     Microbleeds |   0.4629    1.9109    1.0308   -0.8495   -0.7975    1.0392 
    CorticalSi~s |  -1.4955   -0.5429   -0.8916   -0.3480    0.9793    1.2079 
         Atrophy |  -0.4463   -1.4652    0.5596   -1.0068   -0.6635   -0.6175 
    --------------------------------------------------------------------------

Raw coefficients for the second variable set

                 |        1         2         3         4         5         6 
    -------------+------------------------------------------------------------
           TMTav |  -0.0874   -0.1150   -0.1388   -0.1092    0.4314    0.5491 
    Massetermu~c |   0.0301    0.1296    0.0213    0.1625   -0.0412   -0.0336 
    Massetermu~n |   1.5585    0.3037   -0.9663   -0.2374    0.4310    0.0807 
    Tonguevolu~m |  -0.0260    0.0414   -0.0075   -0.0743   -0.0466    0.0390 
    Tonguefati~i |   0.6934   -0.3198    1.3798   -0.6041   -0.3779    1.2642 
    Subcutaneo~m |   0.1019    0.1839    0.1802   -0.0967    0.2952   -0.2100 
    --------------------------------------------------------------------------

----------------------------------------------------------------------------
Canonical correlations:
  0.5112  0.4158  0.2832  0.2130  0.1291  0.0820

----------------------------------------------------------------------------
Tests of significance of all canonical correlations

                         Statistic      df1      df2            F     Prob>F
         Wilks' lambda     .524015       48  284.526       0.8320     0.7768 a
        Pillai's trace     .583129       48      372       0.8343     0.7760 a
Lawley-Hotelling trace     .721154       48      332       0.8313     0.7795 a
    Roy's largest root     .353697        8       62       2.7412     0.0117 u
----------------------------------------------------------------------------
                            e = exact, a = approximate, u = upper bound on F

I would like to compute a 95% confidence interval for this canonical correlation.

However:
- the command `canon` does not seem to store the canonical correlations in r(), and
- I have not found a built-in way in Stata to obtain confidence intervals for canonical correlations.

I tried to implement a bootstrap procedure, but I was not able to correctly extract the canonical correlations within a program.

My questions are:
1. Is there a recommended way in Stata to obtain confidence intervals for canonical correlations?
2. Is bootstrap the preferred approach, and if so, how can the canonical correlations be correctly extracted within a bootstrap program?

Any suggestion or example code would be greatly appreciated.

Best regards,
Gianfranco

Tags: bootstrapping, canonical correlation, confidence intervals

Felix Bittmann

26 Mar 2026, 01:05

I cannot tell you whether bootstrapping is the best or most valid approach, but you can obtain the CIs as follows:

Code:

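* Wrapper that runs canon and returns the first canonical correlation in r()
cap program drop ccorr
program define ccorr, rclass
    canon (length weight headroom trunk) (displ mpg gear_ratio turn)
    * canon is e-class: the correlations are stored in the matrix e(ccorr)
    return scalar c1 = e(ccorr)[1,1]
end
sysuse auto, clear
* Resample the data, rerun ccorr on each draw, and collect r(c1)
bootstrap c1=r(c1), seed(123): ccorr
* Report the bias-corrected (BC) confidence interval
estat bootstrap, bc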
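
A possible extension, offered as a sketch rather than a definitive recipe: because canon is an e-class command, its results sit in e() rather than r(), and a wrapper can return every canonical correlation instead of only the first. The program name ccorr_all, the reps(1000) setting, and the auto variables are illustrative assumptions, and the sketch assumes that e(ccorr) holds one entry per canonical correlation.

```stata
* Sketch only: ccorr_all is a hypothetical helper name and the auto
* variables are stand-ins for the real variable sets.
sysuse auto, clear

* canon is e-class, so the correlations are stored in e(), not r()
quietly canon (length weight headroom trunk) (displ mpg gear_ratio turn)
matrix list e(ccorr)

capture program drop ccorr_all
program define ccorr_all, rclass
    quietly canon (length weight headroom trunk) (displ mpg gear_ratio turn)
    tempname C
    matrix `C' = e(ccorr)
    matrix `C' = vec(`C')'          // force a 1 x k row vector (assumption-proofing)
    local k = colsof(`C')
    forvalues j = 1/`k' {
        return scalar c`j' = `C'[1, `j']   // r(c1), r(c2), ...
    }
end

* Run once on the full sample and check what is returned
ccorr_all
return list

* Four variables per set here, hence four canonical correlations
bootstrap c1=r(c1) c2=r(c2) c3=r(c3) c4=r(c4), reps(1000) seed(123): ccorr_all
estat bootstrap, percentile bc
```

With other data, replace the two variable lists and supply one cJ=r(cJ) expression per canonical correlation (six in the output of the original post). Sample canonical correlations tend to be biased upward when the sample is small relative to the number of variables, so the resulting intervals are worth reading with some caution.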

Best wishes

Stata 18.0 MP | ORCID | Google Scholar
