  • Using predicted variables from principal component analysis (PCA) for an exploratory factor analysis (EFA)

    Hello everybody,

    As described in the title, I want to use the variables predicted from a PCA in an EFA.

    What I did so far and what is planned:
    1. I split my dataset into 4 sets.
    2. I used PCA on the first set to trim down my ~180 variables (I now have 48 items and 9 components describing them).
    3. Now I want to take these 9 components and carry them over to my second set for an EFA.
    4. After finding the underlying structure, I want to apply these findings to a third set to fit a probit model.
    5. The last set is for running the probit model.
    Now:
    I am somewhat stuck on how to move my results from one set to another.
    I used "predict pc1 pc2 pc3 pc4 pc5 pc6 pc7 pc8 pc9, score" to get the new variables from the components I found via the PCA. However, since the second dataset will contain completely new data, I am not sure what the correct way to apply the predictors is.

    The data consist of "name of bank" and various indicators of each bank's profitability, employee count, etc.
    Therefore, my pc1-pc9 can't simply be copied onto the new set (the "names of banks" and their respective values on the variables are completely different).


    So how can I proceed?



    Thank you in advance!

    Aaron

  • #2
    After you -predict- the pc1-pc9 variables in the first data set, -predict- leaves behind a matrix of scoring coefficients in r(scoef). Grab that matrix, and then you can use it to apply the same scoring to the new data in the second data set just by matrix multiplication.

    • #3
      Hello there,

      so I would use "factormat" on that matrix of scoring coefficients in my second set instead of "factor"?

      • #4
        Not if I understand correctly what you want to do. -factormat- would be applied to a covariance matrix of the data in data set 2, not to r(scoef). But you don't want to factor the data in data set 2. You want to first calculate the 9 components and then factor those 9 variables, right? That's what I understood from #1.

        So, if I have that right, you would multiply the data in data set 2 by the matrix in r(scoef), thereby calculating values for the 9 components from the data in data set 2. Then you would just -factor- the 9 component variables from data set 2.

        • #5
          Thank you a lot, I think I understand the procedure.

          • #6
            OK, so I did the following:

            . matrix define pca = r(scoef)

            . matrix list pca

            And yes, the matrix was stored correctly.

            Now, how do I multiply it with my new data?
            I have already opened my new data, but I am unsure how to multiply the two inputs (data and matrix) properly.

            Can you help me with this operation?

            • #7
              See -help mkmat-, and at the end you will also need -help svmat-. -mkmat- will enable you to create a Stata matrix out of the variables that are inputs to the principal components analysis. You will then right-multiply that matrix by the matrix pca that you created in #6. The result will be a new matrix containing the data for the 9 components. Then, with -svmat-, you can put those results back into the data set.
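
              Put together, the steps described in this thread might look roughly like the sketch below. The file names set1 and set2 and the variable names x1-x48 are made up for illustration; substitute the real files and the 48 PCA input variables, listed in the same order as in the original -pca- command so that the columns of the data matrix line up with the rows of the scoring matrix.

              ```stata
              * Sketch only. set1, set2 and x1-x48 are hypothetical stand-ins
              * for the real file names and the 48 PCA input variables.
              use set1, clear
              pca x1-x48, components(9)
              predict pc1 pc2 pc3 pc4 pc5 pc6 pc7 pc8 pc9, score
              matrix pca = r(scoef)            // scoring coefficients, as in #6

              use set2, clear                  // the matrix pca stays in memory
              mkmat x1-x48, matrix(X)          // N x 48 matrix of the PCA inputs
              matrix S = X * pca               // N x 9 matrix of component values
              svmat double S, names(pc)        // creates pc1-pc9 in data set 2
              factor pc1-pc9                   // EFA on the 9 component variables
              ```

              Two things are worth checking before relying on this. First, missing values in the input variables are likely to make the matrix multiplication fail, and dropping those rows with the nomissing option of -mkmat- would mean the rows written back by -svmat- no longer line up with the observations. Second, -pca- analyzes the correlation matrix by default, so the scores from -predict- are, as far as I know, based on standardized variables; a product of the raw data with the scoring matrix may therefore not match them unless the inputs are standardized the same way first. Repeating the multiplication in data set 1 and comparing the result with the pc1-pc9 that -predict- created there is a simple way to find out.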

              • #8
                Thank you a lot, I implemented everything.
