Using predicted variables from principal component analysis (PCA) for an exploratory factor analysis (EFA)

Aaron Nagel

Join Date: Sep 2019

Posts: 11
#1


28 Sep 2019, 14:25

Hello everybody,

As described in the title, I want to use predicted variables from a PCA for an EFA.

What I have done so far and what I plan to do:
I split my dataset into 4 sets.

I used PCA on the first set to trim down my ~180 variables (I now have 48 items and 9 components describing them).

Now I want to take these 9 components and use them in my second set for an EFA.

After finding the underlying structure, I want to apply those findings to a third set to fit a probit model.

The last set is for running the probit model.

Now:
I am somewhat stuck on how to move my results from one set to another.
I used "predict pc1 pc2 pc3 pc4 pc5 pc6 pc7 pc8 pc9, score" to get the new variables for the components I found via the PCA. However, since the second dataset will contain completely new data, I am not sure how to apply the same scoring there.

The data consist of "name of bank" and various indicators such as profitability or employee count.
Therefore, my pc1-pc9 can't simply be copied into the new set (the banks and their values on the variables are completely different).

So how can I proceed?

Thank you in advance!

Aaron
Tags: data, EFA, pca, predict
Clyde Schechter

Join Date: Apr 2014

Posts: 30083
#2

28 Sep 2019, 14:48

After you -predict- the pc1-pc9 variables in the first data set, -predict- leaves behind a matrix of scoring coefficients in r(scoef). Grab that matrix, and then you can use it to apply the same scoring to the new data in the second data set just by matrix multiplication.
Aaron Nagel

Join Date: Sep 2019

Posts: 11
#3

28 Sep 2019, 14:52

Hello there,

So I would use "factormat" on that matrix of scoring coefficients in my second set, instead of "factor"?
Clyde Schechter

Join Date: Apr 2014

Posts: 30083
#4

28 Sep 2019, 15:00

Not if I understand correctly what you want to do. -factormat- would be applied to a covariance matrix of the data in data set 2, not to r(scoef). But you don't want to factor the data in data set 2. You want to first calculate the 9 components and then factor those 9 variables, right? That's what I understood from #1.

So, if I have that right, you would multiply the data in data set 2 by the matrix in r(scoef), thereby calculating values for the 9 components from the data in data set 2. Then you would just -factor- the 9 component variables from data set 2.
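
In matrix terms, that step could be written roughly as below. This is only a sketch: the symbols are introduced here for illustration, and it assumes the PCA in data set 1 was run on the correlation matrix (Stata's default), so that the scoring coefficients apply to standardized rather than raw variables. Taking the means and SDs from data set 1 is one possible choice, not something settled in this thread.

```latex
% A         : p x 9 matrix of scoring coefficients from r(scoef), estimated on data set 1
% X^{(2)}   : n_2 x p matrix of the same p PCA input variables, observed in data set 2
% \bar{x}_j, s_j : mean and SD of input variable j (assumed here to come from data set 1)
\[
  z^{(2)}_{ij} = \frac{x^{(2)}_{ij} - \bar{x}_j}{s_j},
  \qquad
  S^{(2)} = Z^{(2)} A
\]
% S^{(2)}   : n_2 x 9 matrix; its columns are the component scores that go into -factor-
```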
Aaron Nagel

Join Date: Sep 2019

Posts: 11
#5

28 Sep 2019, 17:23

Thank you a lot, I think I understand the procedure.
Aaron Nagel

Join Date: Sep 2019

Posts: 11
#6

28 Sep 2019, 18:31

Ok so I did the following

. matrix define pca = r(scoef)

. matrix list pca

and yes, the matrix was stored correctly.

Now how do I multiply it with my new data?
I have already opened my new data, but I am unsure how to multiply the two inputs (data and matrix).

Can you help me with this operation?
Clyde Schechter

Join Date: Apr 2014

Posts: 30083
#7

28 Sep 2019, 18:38

See -help mkmat-. At the end you will also need -help svmat-. -mkmat- will enable you to create a Stata matrix out of the variables that are inputs to the principal components analysis. You will then right multiply that matrix by the matrix pca that you created in #6. The result will be a new matrix containing the data for the 9 components. Then, with -svmat-, you can put those results back into the data set.
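
A minimal sketch of that sequence follows. Everything specific in it is a stand-in: auto.dta, the four input variables, the two components, the split at observation 37, and the names part, z_*, Z, S and comp are hypothetical, not the poster's bank data. It also assumes the PCA was run on the correlation matrix (the default), so the inputs are standardized before multiplying; using the data set 1 means and SDs for that is one possible choice.

```stata
* Sketch only: auto.dta and these four variables stand in for the bank data.
sysuse auto, clear
local xvars price mpg weight length        // hypothetical PCA inputs

* Mimic two separate data sets by splitting on observation number.
generate byte part = cond(_n <= 37, 1, 2)

* Data set 1: run the PCA, score it, and grab the scoring coefficients.
pca `xvars' if part == 1, components(2)
predict pc1 pc2 if part == 1, score
matrix pca = r(scoef)                      // rows = variables, columns = components

* Standardize the inputs with the data set 1 means and SDs
* (assumption: correlation-based PCA, so scores use standardized variables).
foreach v of local xvars {
    quietly summarize `v' if part == 1
    generate double z_`v' = (`v' - r(mean)) / r(sd)
}

* Data set 2: data matrix times scoring coefficients, then back to variables.
keep if part == 2
mkmat z_price z_mpg z_weight z_length, matrix(Z)   // same order as the rows of pca
matrix S = Z * pca
svmat double S, names(comp)                // creates comp1 and comp2
```

The comp* variables are then the ones to hand to -factor-. With two files rather than one split file, the matrix pca stays in memory when the second file is opened with -use-, but the means and SDs would have to be carried over as well, for example in scalars or a matrix.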
Aaron Nagel

Join Date: Sep 2019

Posts: 11
#8

28 Sep 2019, 21:06

Thank you a lot, I implemented everything.
