Dear Stata users (esp. Stas Kolenikov),
This is my first time posting, so please excuse any errors in my submission.
I am using Windows StataSE 13.
I have a series of ordinal variables that I want to run an exploratory factor analysis (EFA) on and then use the results to predict factor scores. Obtaining the factor scores is very important.
I have calculated the polychoric correlation matrix and used the factormat command to run the EFA. I was able to calculate factor scores using the predict command; however, their mean and standard deviation were not 0 and 1, respectively.
On closer reading of the Stata manual, I came across this passage in "mv.pdf" on page 303:
"sds(matname2) specifies a k × 1 or 1 × k matrix with the standard deviations of the variables. The row or column names should match the variable names, unless the names() option is specified. sds() may be specified only if matname is a correlation matrix. Specify sds() if you have variables in your dataset and want to use predict after factormat. sds() does not affect the computations of factormat but provides information so that predict does not assume that the standard deviations are one."
The manual makes a similar statement about means().
Following this advice, a simplified version of my code is as follows:
* Calculate the polychoric correlation matrix and save it as a matrix
polychoric varlist [aweight = var1] if var2 == 1
global N = r(N)
matrix r = r(R)
matrix list r
* Run EFA using the polychoric correlation matrix
factormat r, n($N) factors(1) blanks(.32) pf sds(sdev) means(ms)
predict score if var2 == 1, regression
summarize score [aweight = var1] if var2 == 1
sdev and ms are the standard deviation and mean vectors, created by running the tabstat command with the same weights and if condition as above.
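For reference, here is a minimal sketch of how those two vectors can be built. The variable names x1-x5 are hypothetical placeholders for my ordinal items, and the matrix name stats is only for illustration; var1 and var2 are the weight and selection variables from the code above. It assumes tabstat is run with the save option so that the results are returned in r(StatTotal).

```stata
* Hypothetical sketch: x1-x5 stand in for the actual ordinal items
tabstat x1-x5 [aweight = var1] if var2 == 1, statistics(mean sd) save
* r(StatTotal) is 2 x k: row 1 holds the means, row 2 the standard deviations
matrix stats = r(StatTotal)
matrix ms    = stats[1, 1...]
matrix sdev  = stats[2, 1...]
* Column names are the variable names, which is what sds() and means() match on
matrix list ms
matrix list sdev
```

Both results are 1 × k row vectors, which is one of the two shapes the manual allows for sds() and means().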
My question is whether factor scores can be created this way from a polychoric correlation matrix. In other words, is this correct, or am I missing something in the mathematics of estimating the correlation coefficients and then predicting factor scores?
Any advice would be appreciated. I can't find any articles, blogs, or forum posts on this as it relates to Stata.