
  • PCA - low variable loadings on first component

    Dear All,
    I ran a PCA on 40 variables and kept all components with an eigenvalue of at least 1 (five components in total). Stata gives me the following output:
    Code:
    >pca *all variables*, mineigen(1)
    
    
           Component |   Eigenvalue   Difference         Proportion   Cumulative
        -------------+------------------------------------------------------------
               Comp1 |      29.0991      26.3849             0.7275       0.7275
               Comp2 |      2.71421      .672651             0.0679       0.7953
               Comp3 |      2.04156      .803292             0.0510       0.8464
               Comp4 |      1.23826      .201987             0.0310       0.8773
               Comp5 |      1.03628      .170083             0.0259       0.9032
    
    
    Principal components (eigenvectors) 
    
    
     -----------------------------------------------------------------------------
         Variable |    Comp1     Comp2     Comp3     Comp4     Comp5 | Unexplained 
     -------------+--------------------------------------------------+------------
     R_wt1_GLR~13 |   0.0036    0.2762   -0.1522    0.4151    0.3403 |       .4118 
       R_wt7_FO14 |   0.0689    0.0353   -0.2479   -0.2684    0.3559 |       .5124 
        std_GL~24 |   0.1825   -0.0001    0.0000   -0.0959   -0.0131 |       .0191 
        std_GLR~6 |   0.1712    0.0201    0.1849   -0.1215   -0.0321 |      .05681 
        std_GLR~8 |   0.1822    0.0119   -0.0235   -0.0713   -0.0102 |      .02624 
        wt1_GL~13 |   0.0263    0.4621    0.3717    0.0753    0.1531 |      .08684 
        wt1_GL~24 |   0.1819    0.0148    0.0268   -0.0981   -0.0225 |       .0222 
        wt1_GLR~6 |   0.1615    0.0390    0.2516   -0.1209   -0.0765 |      .08364 
        wt1_GLR~8 |   0.1815    0.0138    0.0119   -0.0902   -0.0269 |      .02935 
        wt1_GLR~9 |   0.0192    0.4457    0.3554    0.1194    0.1716 |       .1442 
        wt2_GL~24 |   0.1832   -0.0186   -0.0229    0.0034   -0.0486 |      .01839 
        wt2_GLR~3 |   0.0133    0.4138   -0.1428    0.0898   -0.4108 |       .3035 
        wt2_GLR~5 |   0.1828    0.0111    0.0114   -0.0629   -0.0511 |      .01971 
        wt2_GLR~6 |   0.1758    0.0030    0.1206   -0.0245   -0.0877 |      .06209 
        wt2_GLR~8 |   0.1826   -0.0273   -0.0410    0.0064   -0.0405 |      .02255 
        wt3_GLR~5 |   0.1809    0.0139    0.0323   -0.0957   -0.0601 |      .02963 
        wt3_GLR~6 |   0.1601    0.0576    0.2214   -0.1942   -0.1267 |      .08216 
        wt4_GL~24 |   0.1801    0.0136    0.0055   -0.1630    0.0246 |      .02207 
        wt4_GLR~5 |   0.1823    0.0107    0.0143   -0.0683   -0.0444 |      .02444 
        wt4_GLR~6 |   0.1620    0.0340    0.1740   -0.1918   -0.0349 |       .1246 
        wt5_GL~24 |   0.1793   -0.0337    0.0051    0.0841   -0.0665 |      .04766 
        wt5_GLR~8 |   0.1783   -0.0447    0.0179    0.1118   -0.0840 |      .04559 
        wt7_GL~24 |   0.1823   -0.0213    0.0006   -0.0537    0.0023 |      .02847 
        wt7_GLR~5 |   0.1835   -0.0185   -0.0012   -0.0256   -0.0322 |      .01707 
        wt7_GLR~8 |   0.1815   -0.0169   -0.0189   -0.0773    0.0129 |      .03221 
          wt8_FO9 |   0.0530    0.2385   -0.4929   -0.0310   -0.2132 |       .2196 
        wt8_GLR~8 |   0.1813   -0.0263    0.0131    0.0054   -0.0326 |      .03966 
         wt1_FO10 |  -0.0476   -0.4309    0.1862    0.0542    0.0003 |       .3557 
          wt2_FO4 |   0.1184    0.2212   -0.3360   -0.2154   -0.0222 |       .1705 
         wt2_LM24 |   0.1768   -0.0569   -0.0708    0.1434    0.1202 |      .03092 
          wt2_LM8 |   0.1760   -0.0574   -0.0765    0.1076    0.0800 |      .05662 
         wt4_LM24 |   0.1802   -0.0441   -0.0376    0.1390    0.0486 |      .02063 
          wt4_LM6 |   0.1576   -0.0866    0.0472    0.3475    0.0284 |       .1015 
          wt4_LM8 |   0.1803   -0.0384   -0.0418    0.0807    0.0302 |      .03795 
         wt5_LM24 |   0.1805   -0.0226   -0.0490    0.0956    0.0570 |      .03072 
         wt6_LM24 |   0.1773   -0.0542   -0.0484    0.2013    0.0248 |        .022 
          wt6_LM6 |   0.1606   -0.0935    0.0132    0.3480    0.0376 |      .07385 
         wt7_LM24 |   0.1768   -0.0383   -0.0915    0.1524    0.0798 |      .03353 
          wt7_LM8 |   0.1769   -0.0521   -0.0889    0.1051    0.0278 |      .05139 
         wt8_LM10 |   0.0656    0.0088   -0.0737   -0.2891    0.6266 |       .3532 
     -----------------------------------------------------------------------------
    I was wondering why the loadings on Comp1 are so low for all variables (all below 0.2), whereas Comp2, Comp3 and Comp5 each have 4 variables, and Comp4 has 3 variables, with loadings above 0.3. Is there a good explanation for that?

    Thanks for your help!

    PS: I've cross-posted this question on http://stackoverflow.com/questions/3...irst-component



    Last edited by Michael Tricodur; 11 Sep 2015, 04:58.

  • #2
    Michael, did you try the postestimation command rotate after the factor/PCA analysis? It is more common to interpret the rotated solution than the unrotated one. I expect the rotated solution will meet your expectation of higher loadings (>.3) on the first extracted factor.
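
    A minimal sketch of that suggestion, using the auto data shipped with Stata rather than Michael's variables (the variable list and the choice of two retained components are assumptions made only for illustration). Whether rotation raises the loadings on the first component depends on the data, since an orthogonal rotation keeps each column of loadings at unit length.

    ```stata
    * Illustrative data only: five car-size variables from the auto dataset
    sysuse auto, clear

    * Keep two components (an arbitrary choice for this sketch)
    pca trunk headroom weight length displacement, components(2)

    * Orthogonal varimax rotation of the retained components;
    * rotate redisplays the rotated loadings
    rotate, varimax
    ```

    Comparing the table shown by rotate with the unrotated one from pca shows how the loadings are redistributed across the retained components.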



    • #3
      If many variables are highly correlated with each other then PC1 is roughly an average of those and loadings may well be low. Use -pcacoefsave- (SSC) to look at correlations as well as loadings.
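
      To put rough numbers on this, here is a sketch assuming the PCA is on the correlation matrix (Stata's default for pca). The symbols are introduced here only for illustration: p is the number of variables, a_{j1} is the loading of variable j on PC1, λ_1 is the first eigenvalue, and r_{j1} is the correlation between variable j and PC1. Each eigenvector is scaled to unit length, so loadings must shrink as more variables share a component, even when each variable is strongly correlated with it.

      ```latex
      \begin{align*}
      \sum_{j=1}^{p} a_{j1}^{2} = 1
        \quad &\Longrightarrow \quad
        a_{j1} \approx \frac{1}{\sqrt{p}} = \frac{1}{\sqrt{40}} \approx 0.158
        \quad \text{(if all 40 variables loaded equally)} \\
      r_{j1} = a_{j1}\sqrt{\lambda_1}
        \quad &\Longrightarrow \quad
        r_{j1} \approx 0.18 \times \sqrt{29.0991} \approx 0.18 \times 5.394 \approx 0.97
      \end{align*}
      ```

      So a loading of about 0.18 on Comp1 in #1 corresponds to a correlation of roughly 0.97 with that component.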



      • #4
        Here is an example. By deliberate choice all variables are, loosely, measures of car size. Their relatively high correlations with PC1 aren't matched by loadings that look so high. This is a common phenomenon but should not seem puzzling. I used pcacoefsave (SSC). The effect can easily be more marked with more variables.

        Code:
        . sysuse auto
        (1978 Automobile Data)
        
        . pca trunk headroom weight length displacement
        
        Principal components/correlation                 Number of obs    =         74
                                                         Number of comp.  =          5
                                                         Trace            =          5
            Rotation: (unrotated = principal)            Rho              =     1.0000
        
            --------------------------------------------------------------------------
               Component |   Eigenvalue   Difference         Proportion   Cumulative
            -------------+------------------------------------------------------------
                   Comp1 |      3.76201        3.026             0.7524       0.7524
                   Comp2 |      .736006      .427915             0.1472       0.8996
                   Comp3 |      .308091      .155465             0.0616       0.9612
                   Comp4 |      .152627      .111357             0.0305       0.9917
                   Comp5 |     .0412693            .             0.0083       1.0000
            --------------------------------------------------------------------------
        
        Principal components (eigenvectors) 
        
            ----------------------------------------------------------------
                Variable |    Comp1     Comp2     Comp3     Comp4     Comp5 
            -------------+--------------------------------------------------
                   trunk |   0.4334    0.3665   -0.7676    0.2914    0.0612 
                headroom |   0.3587    0.7640    0.5224   -0.1209    0.0130 
                  weight |   0.4842   -0.3329    0.0737   -0.2669    0.7603 
                  length |   0.4863   -0.2372   -0.1050   -0.5745   -0.6051 
            displacement |   0.4610   -0.3390    0.3484    0.7065   -0.2279 
            ----------------------------------------------------------------
        
            ---------------------------
                Variable | Unexplained 
            -------------+-------------
                   trunk |           0 
                headroom |           0 
                  weight |           0 
                  length |           0 
            displacement |           0 
            ---------------------------
        
        . pcacoefsave using example
        file example.dta saved
        
        . u example
        
        . format corr %4.3f
        
        .  list PC varname corr
        
             +----------------------------+
             | PC        varname     corr |
             |----------------------------|
          1. |  1         length    0.943 |
          2. |  1         weight    0.939 |
          3. |  1   displacement    0.894 |
          4. |  1          trunk    0.841 |
          5. |  1       headroom    0.696 |
             |----------------------------|
          6. |  2   displacement   -0.291 |
          7. |  2         weight   -0.286 |
          8. |  2          trunk    0.314 |
          9. |  2       headroom    0.655 |
         10. |  2         length   -0.204 |
             |----------------------------|
         11. |  3         weight    0.041 |
         12. |  3          trunk   -0.426 |
         13. |  3         length   -0.058 |
         14. |  3   displacement    0.193 |
         15. |  3       headroom    0.290 |
             |----------------------------|
         16. |  4         length   -0.224 |
         17. |  4       headroom   -0.047 |
         18. |  4         weight   -0.104 |
         19. |  4   displacement    0.276 |
         20. |  4          trunk    0.114 |
             |----------------------------|
         21. |  5         weight    0.154 |
         22. |  5   displacement   -0.046 |
         23. |  5         length   -0.123 |
         24. |  5       headroom    0.003 |
         25. |  5          trunk    0.012 |
             +----------------------------+
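
        As a hedged alternative that needs no extra install, the same correlations can be sketched with official postestimation tools. This assumes a correlation-matrix PCA (the default), in which case loadings rescaled so that each column's sum of squares equals its eigenvalue are the variable-component correlations. The row index 4 below relies on length being the fourth variable in the varlist.

        ```stata
        sysuse auto, clear
        pca trunk headroom weight length displacement

        * Loadings with columns normed to the eigenvalues instead of to 1
        estat loadings, cnorm(eigen)

        * The same quantity by hand for length on PC1:
        * loading times the square root of the first eigenvalue
        matrix L  = e(L)
        matrix Ev = e(Ev)
        display %5.3f L[4,1]*sqrt(Ev[1,1])
        ```

        The displayed value should agree with the 0.943 listed for length on PC 1 above.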



        • #5
          Thank you for your helpful suggestions. Indeed, many variables were highly correlated with each other!



          • #6
            See now also -cpcorr- (SSC) for a complementary command.



            • #7
              Hello, I am a new user of this forum, and I am not sure whether I have posted my question in the appropriate place; sorry about that.
              I am running a PCA to determine gentrification scores for census tracts. Specifically, I am trying to score how much each census tract gentrified between two time points, e.g. how much census tract A gentrified between 1990 and 2000, or between 2000 and 2010. I have 17 variables that are theoretically related to gentrification, such as changes in total population, changes in the percentage of professional jobs, or changes in median home rent value. When I run the PCA on these changes for 1990-2000, 2000-2010, or other periods, I get low loadings, all of them below 0.5. I have attached an example of my PCA results. Could you help me understand why the loadings are low and how I can address this? By the way, Mr. Cox, I applied your earlier suggestions and have also attached the correlation results. Thanks in advance.



              • #8
                If interested in #7 please follow the thread at https://www.statalist.org/forums/for...mponent-of-pca

