  • Factor analysis - Using factors in regression as covariates

    Dear all, I would like to know your opinion about the following issue: I have a set of variables theoretically linked to two latent factors.
    I would like to run a factor analysis to test whether (and to what extent) they are actually linked to these factors, and then calculate scores for these two factors to use in a regression model.

    I had a look at the literature, but it is not so clear to me how to proceed.

    Do you have any suggestions (or references) in this regard?

    Thanks in advance, best, G.

  • #2
    If the theory is that the variables are linked to two latent factors, a more faithful model of that theory would be structural equation modeling, with two latent variables and appropriate linkages from them to the observed variables.

    Doing a factor analysis and using the factors as regressors is something of an approximation to that, but does not account for measurement error in the same way.

    • #3
      To add to Clyde's correct interpretation, sometimes folks do an exploratory factor analysis (i.e., not SEM/GSEM), and then generate factor scores to use in regression - see factor postestimation in the documentation. As Clyde points out, this has problems with measurement error in the scores.
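
      A minimal sketch of that exploratory route, assuming eight hypothetical indicators x1-x8 and a hypothetical outcome y (none of these names come from the original question). The predicted scores are treated as error-free regressors, which is exactly the approximation described above.

      ```stata
      * Hypothetical variable names: x1-x8 are the indicators, y is the outcome.
      * Exploratory factor analysis retaining two factors (principal factor method by default)
      factor x1 x2 x3 x4 x5 x6 x7 x8, factors(2)

      * Oblique rotation, so the two factors are allowed to correlate
      rotate, promax

      * Factor scores from the rotated solution (regression scoring by default)
      predict f1 f2

      * Use the scores as covariates; their measurement error is ignored here
      regress y f1 f2
      ```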

      • #4
        To add some specifics, one does confirmatory analysis with structural equation modeling. Because your theory says that you have two latent factors, I think this is probably where you want to start.

        The SEM command is pretty complex. I don't know if you would learn better by reading some of the introductory sections of the manual and then going to the worked examples, or if you'd learn better by examples first.

        Whatever the case, SEM example 3 is how you'd fit a linear SEM on two correlated latent variables (Affective and Cognitive in the example), each with 5 indicators. Chances are your indicators are Likert items, and my impression is that the asymptotic distribution free estimator is better suited to handle ordinal variables (i.e., I'd add , method(adf) to the command). Example 4 goes through goodness of fit statistics.
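
        As a rough sketch of what such a two-factor measurement model looks like, assuming the indicators are named a1-a5 and c1-c5 (names chosen here for illustration; substitute your own items):

        ```stata
        * Assumed indicator names: a1-a5 load on Affective, c1-c5 load on Cognitive.
        * Latent variable names start with a capital letter by default in sem.
        sem (Affective -> a1 a2 a3 a4 a5) ///
            (Cognitive -> c1 c2 c3 c4 c5), method(adf)

        * Goodness-of-fit statistics for the fitted model
        estat gof, stats(all)
        ```

        Dropping method(adf) gives the default maximum likelihood fit, which is a useful comparison when the ADF estimates look unstable in a small sample.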

        As for using the latent factors in a regression model, I think example 9 would be appropriate to read. That example does estimate multiple latent factors and regresses them on each other. Chances are Giorgio's application will be simpler.
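
        A simpler version of that idea, sketched under assumptions: an observed outcome y regressed on two latent factors F1 and F2, each measured by four hypothetical indicators x1-x8. None of these names come from the original question.

        ```stata
        * Hypothetical names: F1 and F2 are the latent factors, x1-x8 their indicators,
        * y is the observed outcome.
        * Measurement and structural parts are estimated jointly, so the
        * measurement error in the indicators is modeled rather than ignored.
        sem (F1 -> x1 x2 x3 x4) ///
            (F2 -> x5 x6 x7 x8) ///
            (y <- F1 F2)

        * Standardized coefficients can make the paths easier to read
        sem, standardized
        ```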
        Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

        When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
