
  • Factor analysis using “predict” and rescaling factor scores

    Hello,
    I have a question re: rescaling of factor scores.

    I have a set of question items on political efficacy. After running factor and PCA and generating the factor index with the predict command, I noticed that the index ranges from a negative non-integer value to a positive non-integer value. I understand that factor scores are measured in units of standard deviations from their means. I would like to rescale the index so that it takes only positive integer values. I would normally leave it as is, but when calculating with margins or marginsplot I have also run into the message "factor variables may not contain noninteger values". Is it advisable to rescale factor scores? Any suggestions on how to do so would be appreciated.

  • #2
    As you have noticed, scores generated by -predict- after -factor- or -pca- are standardized to a mean of zero and standard deviation of one. You can linearly transform them any way you like for purposes of presentation to suit your tastes or those of your readers/audience/etc. For purposes of most analyses it usually will make no difference whether or how you transform them or leave them alone. (Some likelihood maximizations will be more numerically stable and converge more easily if all the variables involved are of comparable magnitudes, so if you plan to use these in, say, a model where other predictors have magnitudes very different from the scale of your factor scores, you would be better off transforming them, but that's about all that comes to mind.)
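
    A linear transformation of the kind described above might look like the following. This is only a minimal sketch: the bundled auto dataset, the chosen variables, and the names f1, f1_0to100 and f1_t are assumptions for illustration, not part of the original question.

    ```stata
    * Illustrative data only: substitute your own efficacy items
    sysuse auto, clear
    factor weight length displacement, pcf factors(1)
    predict f1

    * Option 1: min-max rescaling onto the 0-100 range
    summarize f1, meanonly
    generate f1_0to100 = 100 * (f1 - r(min)) / (r(max) - r(min))

    * Option 2: T-score style rescaling (mean 50, SD 10)
    summarize f1
    generate f1_t = 50 + 10 * (f1 - r(mean)) / r(sd)

    * Positive linear transformations leave the correlations at 1
    correlate f1 f1_0to100 f1_t
    ```

    Both rescaled versions remain continuous; neither produces integers, and both carry exactly the same information as f1.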

    It is, by the way, highly unlikely that any linear transformation of your factor scores will transform them all into integer values: they rarely are spread out that discretely and evenly. (In fact, if I did a factor analysis and got results that looked like that, I would seriously question the underlying data.) To get integer-valued transforms almost inevitably requires applying cut-points and creating a categorical variable, a process which discards information and is usually strongly discouraged.

    As for "factor variables may not contain noninteger values," that reference to "factor variables" has nothing to do with the -factor- or -pca- commands or their output. Rather, it refers to the automated representation of non-negative integer-valued variables as a series of binary indicator variables using the i.varname notation. See [U] 11.4.3 or type -help fvvarlist- for more information.
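
    To illustrate the distinction, here is a hedged sketch of how a continuous score can be used with -margins- without triggering that message. The dataset, the model, and the name f1 are assumptions made for the example.

    ```stata
    * Illustrative data only: substitute your own items and outcome
    sysuse auto, clear
    factor weight length displacement, pcf factors(1)
    predict f1

    * c. marks f1 as continuous; i. is for integer-valued categories,
    * so i.f1 is what would raise the "noninteger values" error
    regress price c.f1 i.foreign

    * Predictive margins at chosen values of the continuous score
    margins, at(f1 = (-1 0 1))
    marginsplot
    ```

    With the c. prefix the score can stay on its original scale, so no rescaling to integers is needed for -margins- or -marginsplot-.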



    • #3
      Originally posted by Clyde Schechter
      As you have noticed, scores generated by -predict- after -factor- or -pca- are standardized to a mean of zero and standard deviation of one.
      Clyde, do you know exactly what calculations are behind the predict command? I believe it demeans the data, but the scores do not seem to be fully standardized. I checked the standard deviation of an output series from the predict command and found it to be ~1.6. I tried to manually match the output using the coefficients produced by the pca command and was unable to do so with any data transformation I could come up with.
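
      One possible way to attempt the manual match is sketched below. It rests on an assumption about -pca- defaults, as I understand them from the Stata documentation: the analysis uses the correlation matrix, e(L) holds unit-length eigenvectors, and the component scores are therefore weighted sums of the standardized inputs with variance equal to the eigenvalue rather than one. The dataset and variable names are illustrative.

      ```stata
      * Illustrative data only: substitute your own variables
      sysuse auto, clear
      pca weight length displacement, components(1)
      predict pc1

      * Standardize each input, since the correlation matrix is analyzed
      egen z_weight = std(weight)
      egen z_length = std(length)
      egen z_disp   = std(displacement)

      * e(L): eigenvectors (rows = variables, columns = components)
      matrix L = e(L)
      generate pc1_manual = L[1,1]*z_weight + L[2,1]*z_length + L[3,1]*z_disp

      * If the assumption holds, both series share the same SD,
      * which should equal the square root of the first eigenvalue
      matrix Ev = e(Ev)
      display sqrt(Ev[1,1])
      summarize pc1 pc1_manual
      correlate pc1 pc1_manual
      ```

      Under that assumption, a standard deviation of about 1.6 would correspond to a first eigenvalue of roughly 2.6.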
