  • Principal component analysis with the scale of original data

    Hi,

    I conducted a survey in which I asked 9 questions, each answered on a scale from 1 to 7.
    I then ran a PCA in Stata in order to reduce these 9 questions to 3 components:

    pca Vraag4contr Vraag5contr Vraag6contr Vraag10comp Vraag11comp Vraag12comp Vraag4DTT Vraag5DTT Vraag6DTT

    Principal components/correlation Number of obs = 294
    Number of comp. = 9
    Trace = 9
    Rotation: (unrotated = principal) Rho = 1.0000

    --------------------------------------------------------------------------
    Component | Eigenvalue Difference Proportion Cumulative
    -------------+------------------------------------------------------------
    Comp1 | 3.03713 .969488 0.3375 0.3375
    Comp2 | 2.06764 .958044 0.2297 0.5672
    Comp3 | 1.1096 .473579 0.1233 0.6905
    Comp4 | .636017 .0811268 0.0707 0.7612
    Comp5 | .55489 .0484921 0.0617 0.8228
    Comp6 | .506398 .127643 0.0563 0.8791
    Comp7 | .378756 .00147694 0.0421 0.9212
    Comp8 | .377279 .0449838 0.0419 0.9631
    Comp9 | .332295 . 0.0369 1.0000
    --------------------------------------------------------------------------

    Principal components (eigenvectors)

    ----------------------------------------------------------------------------------------------------------------------
    Variable | Comp1 Comp2 Comp3 Comp4 Comp5 Comp6 Comp7 Comp8 Comp9 | Unexplained
    -------------+------------------------------------------------------------------------------------------+-------------
    Vraag4contr | 0.3808 -0.0640 -0.5154 -0.3803 0.1380 0.1632 0.2388 0.3757 0.4436 | 0
    Vraag5contr | 0.4395 -0.0282 -0.3784 -0.1669 0.0883 0.1366 -0.1332 -0.6362 -0.4314 | 0
    Vraag6contr | 0.4135 -0.0745 -0.1991 0.5100 0.0114 -0.6957 -0.1486 0.0940 0.0930 | 0
    Vraag10comp | 0.3706 0.0149 0.5657 -0.2592 -0.0387 -0.2465 0.4714 -0.3485 0.2632 | 0
    Vraag11comp | 0.4113 -0.0417 0.4413 -0.2787 0.2342 0.0481 -0.4903 0.4285 -0.2747 | 0
    Vraag12comp | 0.4136 -0.0570 0.1517 0.5575 -0.3276 0.5961 0.1430 0.0916 0.0189 | 0
    Vraag4DTT | 0.0729 0.5911 -0.0083 -0.1059 -0.3631 0.0445 -0.5153 -0.1808 0.4494 | 0
    Vraag5DTT | 0.0871 0.5857 -0.1057 -0.1110 -0.3246 -0.1729 0.3728 0.3090 -0.5060 | 0
    Vraag6DTT | 0.0161 0.5402 0.0498 0.2972 0.7564 0.1427 0.1247 -0.0607 0.0733 | 0
    ----------------------------------------------------------------------------------------------------------------------

    . pca Vraag4contr Vraag5contr Vraag6contr Vraag10comp Vraag11comp Vraag12comp Vraag4DTT Vraag5DTT Vraag6DTT, mineigen(1)

    Principal components/correlation Number of obs = 294
    Number of comp. = 3
    Trace = 9
    Rotation: (unrotated = principal) Rho = 0.6905

    --------------------------------------------------------------------------
    Component | Eigenvalue Difference Proportion Cumulative
    -------------+------------------------------------------------------------
    Comp1 | 3.03713 .969488 0.3375 0.3375
    Comp2 | 2.06764 .958044 0.2297 0.5672
    Comp3 | 1.1096 .473579 0.1233 0.6905
    Comp4 | .636017 .0811268 0.0707 0.7612
    Comp5 | .55489 .0484921 0.0617 0.8228
    Comp6 | .506398 .127643 0.0563 0.8791
    Comp7 | .378756 .00147694 0.0421 0.9212
    Comp8 | .377279 .0449838 0.0419 0.9631
    Comp9 | .332295 . 0.0369 1.0000
    --------------------------------------------------------------------------

    Principal components (eigenvectors)

    ----------------------------------------------------------
    Variable | Comp1 Comp2 Comp3 | Unexplained
    -------------+------------------------------+-------------
    Vraag4contr | 0.3808 -0.0640 -0.5154 | .2563
    Vraag5contr | 0.4395 -0.0282 -0.3784 | .2527
    Vraag6contr | 0.4135 -0.0745 -0.1991 | .4252
    Vraag10comp | 0.3706 0.0149 0.5657 | .2273
    Vraag11comp | 0.4113 -0.0417 0.4413 | .2664
    Vraag12comp | 0.4136 -0.0570 0.1517 | .4482
    Vraag4DTT | 0.0729 0.5911 -0.0083 | .2613
    Vraag5DTT | 0.0871 0.5857 -0.1057 | .2552
    Vraag6DTT | 0.0161 0.5402 0.0498 | .393
    ----------------------------------------------------------

    . pca Vraag4contr Vraag5contr Vraag6contr Vraag10comp Vraag11comp Vraag12comp Vraag4DTT Vraag5DTT Vraag6DTT, mineigen(1) blanks(0.3)

    Principal components/correlation Number of obs = 294
    Number of comp. = 3
    Trace = 9
    Rotation: (unrotated = principal) Rho = 0.6905

    --------------------------------------------------------------------------
    Component | Eigenvalue Difference Proportion Cumulative
    -------------+------------------------------------------------------------
    Comp1 | 3.03713 .969488 0.3375 0.3375
    Comp2 | 2.06764 .958044 0.2297 0.5672
    Comp3 | 1.1096 .473579 0.1233 0.6905
    Comp4 | .636017 .0811268 0.0707 0.7612
    Comp5 | .55489 .0484921 0.0617 0.8228
    Comp6 | .506398 .127643 0.0563 0.8791
    Comp7 | .378756 .00147694 0.0421 0.9212
    Comp8 | .377279 .0449838 0.0419 0.9631
    Comp9 | .332295 . 0.0369 1.0000
    --------------------------------------------------------------------------

    Principal components (eigenvectors) (blanks are abs(loading)<.3)

    ----------------------------------------------------------
        Variable |    Comp1     Comp2     Comp3 | Unexplained
    -------------+------------------------------+-------------
     Vraag4contr |   0.3808             -0.5154 |       .2563
     Vraag5contr |   0.4395             -0.3784 |       .2527
     Vraag6contr |   0.4135                     |       .4252
     Vraag10comp |   0.3706              0.5657 |       .2273
     Vraag11comp |   0.4113              0.4413 |       .2664
     Vraag12comp |   0.4136                     |       .4482
       Vraag4DTT |             0.5911           |       .2613
       Vraag5DTT |             0.5857           |       .2552
       Vraag6DTT |             0.5402           |        .393
    ----------------------------------------------------------

    . rotate, promax(3) oblique

    Principal components/correlation Number of obs = 294
    Number of comp. = 3
    Trace = 9
    Rotation: oblique promax (Kaiser off) Rho = 0.6905

    --------------------------------------------------------------------------
    Component | Variance Proportion Rotated comp. are correlated
    -------------+------------------------------------------------------------
    Comp1 | 2.09808 0.2331
    Comp2 | 2.0778 0.2309
    Comp3 | 1.90081 0.2112
    --------------------------------------------------------------------------

    Rotated components

    ----------------------------------------------------------
    Variable | Comp1 Comp2 Comp3 | Unexplained
    -------------+------------------------------+-------------
    Vraag4contr | 0.6396 0.0031 -0.1564 | .2563
    Vraag5contr | 0.5813 0.0383 -0.0135 | .2527
    Vraag6contr | 0.4407 -0.0193 0.1103 | .4252
    Vraag10comp | -0.1375 0.0272 0.6749 | .2273
    Vraag11comp | -0.0156 -0.0184 0.6055 | .2664
    Vraag12comp | 0.1917 -0.0191 0.3824 | .4482
    Vraag4DTT | 0.0031 0.5954 0.0263 | .2613
    Vraag5DTT | 0.0823 0.5963 -0.0402 | .2552
    Vraag6DTT | -0.0735 0.5358 0.0365 | .393
    ----------------------------------------------------------

    Component rotation matrix

    --------------------------------------------
    | Comp1 Comp2 Comp3
    -------------+------------------------------
    Comp1 | 0.7094 0.1086 0.6365
    Comp2 | -0.0922 0.9931 -0.0231
    Comp3 | -0.7053 -0.0491 0.7767
    --------------------------------------------

    . rotate, promax(3) oblique blanks(0.3)

    Principal components/correlation Number of obs = 294
    Number of comp. = 3
    Trace = 9
    Rotation: oblique promax (Kaiser off) Rho = 0.6905

    --------------------------------------------------------------------------
    Component | Variance Proportion Rotated comp. are correlated
    -------------+------------------------------------------------------------
    Comp1 | 2.09808 0.2331
    Comp2 | 2.0778 0.2309
    Comp3 | 1.90081 0.2112
    --------------------------------------------------------------------------

    Rotated components (blanks are abs(loading)<.3)

    ----------------------------------------------------------
        Variable |    Comp1     Comp2     Comp3 | Unexplained
    -------------+------------------------------+-------------
     Vraag4contr |   0.6396                     |       .2563
     Vraag5contr |   0.5813                     |       .2527
     Vraag6contr |   0.4407                     |       .4252
     Vraag10comp |                       0.6749 |       .2273
     Vraag11comp |                       0.6055 |       .2664
     Vraag12comp |                       0.3824 |       .4482
       Vraag4DTT |             0.5954           |       .2613
       Vraag5DTT |             0.5963           |       .2552
       Vraag6DTT |             0.5358           |        .393
    ----------------------------------------------------------

    Component rotation matrix

    --------------------------------------------
    | Comp1 Comp2 Comp3
    -------------+------------------------------
    Comp1 | 0.7094 0.1086 0.6365
    Comp2 | -0.0922 0.9931 -0.0231
    Comp3 | -0.7053 -0.0491 0.7767
    --------------------------------------------

    . estat loadings

    Principal component loadings
    component normalization: sum of squares(column) = 1

    --------------------------------------------
    | Comp1 Comp2 Comp3
    -------------+------------------------------
    Vraag4contr | .3808 -.06403 -.5154
    Vraag5contr | .4395 -.02825 -.3784
    Vraag6contr | .4135 -.07454 -.1991
    Vraag10comp | .3706 .01487 .5657
    Vraag11comp | .4113 -.04167 .4413
    Vraag12comp | .4136 -.05698 .1517
    Vraag4DTT | .07295 .5911 -.008317
    Vraag5DTT | .08708 .5857 -.1057
    Vraag6DTT | .0161 .5402 .04983
    --------------------------------------------

    . predict pc1 pc2 pc3, score

    Scoring coefficients for oblique promax(3) rotation
    sum of squares(column-loading) = 1

    --------------------------------------------
    Variable | Comp1 Comp2 Comp3
    -------------+------------------------------
    Vraag4contr | 0.6249 -0.0087 -0.0967
    Vraag5contr | 0.5792 0.0262 0.0405
    Vraag6contr | 0.4514 -0.0297 0.1517
    Vraag10comp | -0.0750 0.0234 0.6618
    Vraag11comp | 0.0414 -0.0241 0.6043
    Vraag12comp | 0.2278 -0.0269 0.4005
    Vraag4DTT | -0.0068 0.5951 0.0207
    Vraag5DTT | 0.0661 0.5950 -0.0384
    Vraag6DTT | -0.0813 0.5370 0.0243
    --------------------------------------------

    . estat kmo

    Kaiser-Meyer-Olkin measure of sampling adequacy

    -----------------------
    Variable | kmo
    -------------+---------
    Vraag4contr | 0.7312
    Vraag5contr | 0.7639
    Vraag6contr | 0.8521
    Vraag10comp | 0.7179
    Vraag11comp | 0.7477
    Vraag12comp | 0.8600
    Vraag4DTT | 0.6505
    Vraag5DTT | 0.6439
    Vraag6DTT | 0.7584
    -------------+---------
    Overall | 0.7475
    -----------------------

    As you can see, I reduced the 9 questions I had to 3 components.
    However, Stata has standardized my data, whereas I want the component scores to retain the original scale from 1 to 7.
    What do I have to do to keep the scale of my original questions?


  • #2
    You aren't reducing them to 3 components; you are producing 9 components and selecting 3, which is not the same.

    Scaling to mean zero is what PCA does as a matter of convention. Nothing stops you rescaling PCs to any mean and SD you want, but that is outside the PCA. It is hard to see any real advantage in it.
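A minimal sketch of such a rescaling, for illustration only. The data are simulated stand-ins (the item names contr1-dtt3, the seed, and the target mean 4 and SD 1.5 are all hypothetical and not taken from the survey above). The scores from predict are centered at zero, and a linear transformation then gives them whatever mean and SD you choose. The result is still not the original 1-to-7 response scale; it is only a relabelling of the same scores.

```stata
* Simulated stand-in data: 9 hypothetical 7-point items, 294 respondents
clear
set seed 2016
set obs 294
generate f1 = rnormal()
generate f2 = rnormal()
generate f3 = 0.3*f1 + sqrt(1 - 0.3^2)*rnormal()
forvalues j = 1/3 {
    generate contr`j' = min(7, max(1, round(4 + 1.2*f1 + 0.8*rnormal())))
    generate comp`j'  = min(7, max(1, round(4 + 1.2*f3 + 0.8*rnormal())))
    generate dtt`j'   = min(7, max(1, round(4 + 1.2*f2 + 0.8*rnormal())))
}

* PCA of the correlation matrix, keeping the first 3 components
pca contr* comp* dtt*, components(3)
predict pc1 pc2 pc3, score

* Rescale each score to an arbitrary (hypothetical) mean of 4 and SD of 1.5
foreach v of varlist pc1 pc2 pc3 {
    quietly summarize `v'
    generate `v'_rs = 4 + 1.5 * (`v' - r(mean)) / r(sd)
}
summarize pc1 pc2 pc3 pc1_rs pc2_rs pc3_rs
```

The rescaled variables correlate perfectly with the original scores, so nothing about the analysis changes; only the units of the summary statistics do.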

    Note that there might be a point in using the covariance option here, as all variables are on the same scale. That won't by itself do what you want either.
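A sketch of what the covariance option changes, again on simulated stand-in items (all variable names and the seed are hypothetical). With covariance, the items are no longer standardized, so the trace equals the sum of the item variances rather than the number of items. The scores remain weighted combinations of the items and are not confined to the 1-to-7 range.

```stata
* Simulated stand-in data: 9 hypothetical 7-point items, 294 respondents
clear
set seed 2016
set obs 294
generate f1 = rnormal()
generate f2 = rnormal()
generate f3 = 0.3*f1 + sqrt(1 - 0.3^2)*rnormal()
forvalues j = 1/3 {
    generate contr`j' = min(7, max(1, round(4 + 1.2*f1 + 0.8*rnormal())))
    generate comp`j'  = min(7, max(1, round(4 + 1.2*f3 + 0.8*rnormal())))
    generate dtt`j'   = min(7, max(1, round(4 + 1.2*f2 + 0.8*rnormal())))
}

* PCA of the covariance matrix instead of the correlation matrix
pca contr* comp* dtt*, covariance components(3)
scalar tr_cov = e(trace)

* Sum of the item variances, for comparison with the trace
scalar var_sum = 0
foreach v of varlist contr* comp* dtt* {
    quietly summarize `v'
    scalar var_sum = var_sum + r(Var)
}
display "trace = " tr_cov "   sum of item variances = " var_sum

* Scores from the covariance-based PCA; inspect their range
predict cpc1 cpc2 cpc3, score
summarize cpc1 cpc2 cpc3
```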
    Last edited by Nick Cox; 25 Mar 2016, 04:35.



    • #3
      Your question suggests that you may not be aware that PCA (like factor analysis) assumes quasi-continuous variables, whereas you seem to have a set of 7-point Likert items. This in itself need not be a problem (though I expect that W. Buchanan will dissent; see, for example, http://www.statalist.org/forums/foru...the-scale-mean ).

      It may well be that you really want to "reduce the set of variables to three components" (but see Nick's comment); in that case using PCA is OK. I do wonder, though, why you chose mineigen(1) when components(3) would have been the more appropriate option, and why you want to allow for correlated components by choosing an oblique rotation.
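To make the two choices concrete, here is a sketch on simulated stand-in items (variable names and seed are hypothetical). mineigen(1) lets the eigenvalues decide how many components are kept, whereas components(3) fixes the number in advance. The rotation step then contrasts an orthogonal (varimax) with an oblique (promax) rotation of the same three components.

```stata
* Simulated stand-in data: 9 hypothetical 7-point items, 294 respondents
clear
set seed 2016
set obs 294
generate f1 = rnormal()
generate f2 = rnormal()
generate f3 = 0.3*f1 + sqrt(1 - 0.3^2)*rnormal()
forvalues j = 1/3 {
    generate contr`j' = min(7, max(1, round(4 + 1.2*f1 + 0.8*rnormal())))
    generate comp`j'  = min(7, max(1, round(4 + 1.2*f3 + 0.8*rnormal())))
    generate dtt`j'   = min(7, max(1, round(4 + 1.2*f2 + 0.8*rnormal())))
}

* Data-driven rule: keep every component with eigenvalue of at least 1
quietly pca contr* comp* dtt*, mineigen(1)
scalar k_rule = e(f)

* Number fixed in advance: keep exactly 3 components
pca contr* comp* dtt*, components(3)
scalar k_fixed = e(f)
display "kept by mineigen(1): " k_rule "   kept by components(3): " k_fixed

* Orthogonal rotation of the retained components
rotate, varimax blanks(0.3)

* Oblique rotation of the same components
rotate, promax(3) blanks(0.3)
```

Comparing the two rotated loading tables shows whether allowing an oblique solution changes the interpretation at all.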

      In many cases, however, PCA is used although the researcher actually wants to employ exploratory factor analysis (EFA) as a method to identify unobservable latent variables. In that case I recommend reading the article by Preacher and MacCallum (2003) (reference below). They recommend
      1. using factor analysis (e.g. factor with the option ipf) instead of a PCA;
      2. determining the number of factors with the parallel analysis criterion instead of a minimum eigenvalue of 1 (in your case a two-factor solution would perhaps be more appropriate);
      3. following this with an oblique rotation method (e.g. promax), which allows for correlated factors, instead of an orthogonal rotation (e.g. varimax).
      If you consider using fapara (see findit fapara) to determine the number of factors, I recommend running a PCA first and applying fapara to it to determine the number of factors, followed by a "true" factor analysis to determine factor loadings (and factor scores).
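The three steps can be sketched as follows. This is an illustration under stated assumptions, not the fapara implementation: the parallel analysis is hand-rolled (mean eigenvalues from PCAs of uncorrelated random normal data of the same size, 50 replications), and the items are simulated stand-ins with hypothetical names and seed. The number of factors it finds for these simulated data says nothing about the survey above.

```stata
* Simulated stand-in data: 9 hypothetical 7-point items, 294 respondents
clear
set seed 2016
set obs 294
generate f1 = rnormal()
generate f2 = rnormal()
generate f3 = 0.3*f1 + sqrt(1 - 0.3^2)*rnormal()
forvalues j = 1/3 {
    generate contr`j' = min(7, max(1, round(4 + 1.2*f1 + 0.8*rnormal())))
    generate comp`j'  = min(7, max(1, round(4 + 1.2*f3 + 0.8*rnormal())))
    generate dtt`j'   = min(7, max(1, round(4 + 1.2*f2 + 0.8*rnormal())))
}

* Step 1: observed eigenvalues from a PCA of the correlation matrix
quietly pca contr* comp* dtt*
matrix obs_ev = e(Ev)
local p = colsof(obs_ev)

* Step 2: mean eigenvalues from PCAs of random data (hand-rolled parallel analysis)
local reps 50
matrix sum_ev = 0 * obs_ev
forvalues r = 1/`reps' {
    preserve
    foreach v of varlist contr* comp* dtt* {
        quietly replace `v' = rnormal()
    }
    quietly pca contr* comp* dtt*
    matrix ev_r = e(Ev)
    matrix sum_ev = sum_ev + ev_r
    restore
}
matrix mean_ev = sum_ev / `reps'
matrix list obs_ev
matrix list mean_ev

* Step 3: count leading eigenvalues that exceed their random-data counterpart
scalar n_keep = 0
local stop 0
forvalues j = 1/`p' {
    if !`stop' & obs_ev[1, `j'] > mean_ev[1, `j'] {
        scalar n_keep = n_keep + 1
    }
    else {
        local stop 1
    }
}
display "factors suggested by parallel analysis: " n_keep

* Step 4: iterated principal factors with that many factors, promax rotation, scores
local k = n_keep
factor contr* comp* dtt*, ipf factors(`k')
rotate, promax(3)
predict fs*
```

The factor scores fs1, fs2, ... are based on the rotated solution and are standardized-type scores, so they are no more on the 1-to-7 scale than the PCA scores are.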

      Reference:
      Preacher, K. J. & MacCallum, R. C. (2003). Repairing Tom Swift's electric factor analysis machine. Understanding Statistics, 2, 13-43.
      Last edited by Dirk Enzmann; 25 Mar 2016, 11:46.
