Hello dear Stata users,
I am struggling to understand how to generate an index score using PCA. First, I rescaled my variables to run from 0 (worst) to 1 (best) because the original variables have wildly different scales. Then I ran PCA on the rescaled variables.
Please see the PCA output below.
. pca sc_vul_m sc_vul_ex sc_n5y sc_od

Principal components/correlation                 Number of obs    =         27
                                                 Number of comp.  =          4
                                                 Trace            =          4
    Rotation: (unrotated = principal)            Rho              =     1.0000

    --------------------------------------------------------------------------
       Component |   Eigenvalue   Difference         Proportion   Cumulative
    -------------+------------------------------------------------------------
           Comp1 |      2.34104      1.33824             0.5853       0.5853
           Comp2 |       1.0028      .380757             0.2507       0.8360
           Comp3 |      .622046      .587939             0.1555       0.9915
           Comp4 |     .0341067            .             0.0085       1.0000
    --------------------------------------------------------------------------

Principal components (eigenvectors)

    --------------------------------------------------------------------
        Variable |    Comp1     Comp2     Comp3     Comp4 | Unexplained
    -------------+----------------------------------------+-------------
        sc_vul_m |   0.6293   -0.0023   -0.2984   -0.7176 |           0
       sc_vul_ex |   0.6205    0.0164   -0.3629    0.6950 |           0
          sc_n5y |   0.0812    0.9813    0.1747   -0.0045 |           0
           sc_od |   0.4609   -0.1920    0.8653    0.0450 |           0
    --------------------------------------------------------------------
Based on my reading, after PCA I need to rotate the components and then predict PC1, using that score as an "index" score, basically with the rotate and predict commands.
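For concreteness, this is roughly the workflow I have in mind. It is only a sketch of my understanding: the new variable name index_pc1 is my own placeholder, and I am assuming the default (varimax) rotation.

```stata
* Sketch of the rotate-then-predict workflow described above.
* index_pc1 is a placeholder name, not something from my real do-file.
pca sc_vul_m sc_vul_ex sc_n5y sc_od

* Rotate the retained components (varimax is the default rotation).
rotate, varimax

* Show the rotated and unrotated loadings side by side.
estat rotatecompare

* Score on the first (rotated) component, to be used as the index.
predict index_pc1, score
summarize index_pc1
```

As far as I understand, predict after rotate uses the rotated components unless the norotated option is given.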
My first question: as you can see, sc_n5y loads on PC2, whereas the remaining variables load on PC1. So when I rotate and predict the score, sc_n5y will contribute almost nothing to it, since the score is based on PC1, right? Is there any way to account for both PC1 and PC2 and generate a single index score from them?
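To make clear what I mean by accounting for both components, one option I can imagine is a weighted combination of the two scores, with the eigenvalues as weights. This is only a sketch of the idea. I do not know whether it is a sound way to build an index, and the names pc1, pc2, ev and index2 are my own placeholders.

```stata
* Sketch: combine the PC1 and PC2 scores using their eigenvalues as weights.
* Assumption: eigenvalue weighting is acceptable here (I am not sure it is).
pca sc_vul_m sc_vul_ex sc_n5y sc_od, components(2)

* Unrotated scores for the two retained components.
predict pc1 pc2, score

* e(Ev) holds the eigenvalues from the pca run.
matrix ev = e(Ev)

* Weighted average of the two scores.
generate index2 = (ev[1,1]*pc1 + ev[1,2]*pc2) / (ev[1,1] + ev[1,2])
summarize index2
```

With the output above, the weights would be about 2.34 / (2.34 + 1.00) for PC1 and 1.00 / (2.34 + 1.00) for PC2.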
My second question is about the rotate command. When I type only rotate, the rotation is applied to all PCs; however, I don't need PC3 and PC4, since they don't provide useful information. If I instead run rotate comp1 comp2, my results differ from those of plain rotate. Which one should I choose to obtain an index score from this PCA? I see this as a problem caused by the lack of correlation between sc_n5y and the other variables. Is there any way to include all four variables in one index through PCA?
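To show the two variants I am comparing, here is a sketch that uses the components() option of rotate to restrict the rotation to the first two components. I am assuming this option matches what I intended with rotate comp1 comp2; the score names r4_pc1 and r2_pc1 are placeholders.

```stata
* Sketch: compare PC1 scores after rotating all components vs. only two.
pca sc_vul_m sc_vul_ex sc_n5y sc_od

* Variant 1: rotate all four components.
rotate, varimax
predict r4_pc1, score

* Variant 2: rotate only the first two components.
rotate, varimax components(2)
predict r2_pc1, score

* How closely do the two candidate index scores agree?
correlate r4_pc1 r2_pc1
```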
Many thanks!
Gizem