  • Factor Analysis

    Hello, I am using Stata 14.2 to attempt a factor analysis. I used the survey data to conduct a mediation analysis and the results were unexpected, so I decided to explore the survey instrument itself to better understand those results.

    I have read the manual and Acock's section in the Stata text on factor analysis. I am examining a statewide survey given by our state department of education to teachers that asks questions about their working conditions.

    Would folks be kind enough to look at my attached output and provide some feedback? I ran a factor analysis using ipf, as pcf was not at all appropriate and ml yielded a Heywood case warning. From the first table of the factor analysis, I think the first three factors account for 100% of the variance. I decided to display only loadings greater than .3, and you can see the results in the subsequent table. I then rotated the solution using the default rotation (I assumed this was best given the assumed correlation among the items) with blanks(.4) to ease interpretation. The results seem to show 3 distinct factors that group all but one question. The uniqueness value for that question is high, so maybe the factors don't explain the variance of this item?

    In brief: does what I did make sense?

    Thanks for your time and if more is needed let me know.

    Ted

  • #2
    You'll increase your chances of a helpful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output and sample data using dataex. Many of us will not open files due to virus concerns. Also, pictures are not very helpful since we can't put them in Stata.

    You don't talk about whether you're using exploratory or confirmatory factor analysis, and don't tell us exactly what you have run. If you're doing mediation you probably need confirmatory, but your discussion of rotation suggests exploratory. Getting three factors explaining 100% of the variance would worry me. So, you need to start with what exactly you're trying to do - exploratory or confirmatory work.
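
    For the confirmatory route, a minimal sketch of a one-factor CFA with sem might look like the following. The data are simulated purely for illustration, and the names latent, item1-item4 and Leadership are hypothetical; you would substitute your own survey items.

    ```stata
    * Simulated data for illustration only (not the survey data)
    clear
    set seed 2017
    set obs 1000
    generate latent = rnormal()
    forvalues i = 1/4 {
        generate item`i' = 0.8*latent + rnormal()
    }

    * One-factor CFA: by default, latent variable names start with a capital letter
    sem (Leadership -> item1 item2 item3 item4)

    * Overall goodness-of-fit statistics for the fitted model
    estat gof, stats(all)
    ```

    An exploratory analysis with factor and rotate, by contrast, does not require you to specify in advance which items belong to which factor.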


    • #3
      I will just add that your output can and should be posted in code delimiters. Many readers aren't willing to open Word, Excel, or PDF attachments due to concerns about viruses. It's not you, it's the times we live in. Put your results in a post using code delimiters (# button on the formatting toolbar, between the double quote and the <> buttons) and you'll get a better response.
      Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

      When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
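
      As a rough illustration of what a dataex call looks like, here is a sketch using the auto dataset that ships with Stata. The choice of dataset, variables and observation range is only an example; in your case you would list your own survey items.

      ```stata
      * Example dataset shipped with Stata, used here only for illustration
      * (if dataex is not found in your installation, try: ssc install dataex)
      sysuse auto, clear

      * Write out the first 5 observations of three variables as pasteable input code
      dataex make price mpg in 1/5
      ```

      The output can be pasted between code delimiters, and other readers can then run it to recreate the same small dataset.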


      • #4
        Thanks, I will try again with a new post.


        • #5
          Don't need to make a new post, just reply to this one!

          • #6
            Thanks again for the help, and apologies for not reading the FAQ.

            Background - Every two years the state of North Carolina administers a teacher working conditions survey to all schools. The survey has been given since 2002 and modified with each administration. Currently there are 78 questions grouped into 8 categories or themes. I ran a mediation analysis using three of the categories and the results were not what I expected. I am not implying that they were incorrect, but I wanted to explore the survey further to see whether I could better understand them. So I decided to run an exploratory factor analysis on the groups of questions I used in the mediation analysis. I will include the results from the first analysis I attempted. The following items are from the group called "Administrative Leadership".

            Code:
            clear
            input byte(Trust Concern Support Prostds Usedata Tchperf Feedback EvalConsis SIPeffective RewrdRecog)
            4 4 4 4 4 4 4 4 4 3
            4 4 4 4 4 4 4 4 4 4
            4 4 4 4 4 4 4 4 4 4
            3 3 4 3 4 4 3 3 3 2
            3 2 2 4 4 3 3 3 3 3
            3 3 3 3 3 3 3 3 3 3
            3 3 3 3 3 3 3 3 3 5
            2 3 3 3 3 3 3 3 3 3
            3 3 3 3 3 3 3 3 3 3
            3 3 3 3 3 3 3 3 3 3
            end
            Code:
            . factor Trust Concern Support Prostds Usedata Tchperf Feedback EvalConsis SIPeffective RewrdRecog if NC16_dma050position == 1
            (obs=83,140)
            
            Factor analysis/correlation                      Number of obs    =     83,140
                Method: principal factors                    Retained factors =          4
                Rotation: (unrotated)                        Number of params =         34
            
                --------------------------------------------------------------------------
                     Factor  |   Eigenvalue   Difference        Proportion   Cumulative
                -------------+------------------------------------------------------------
                    Factor1  |      5.13396      4.65904            0.9761       0.9761
                    Factor2  |      0.47492      0.24543            0.0903       1.0664
                    Factor3  |      0.22949      0.20018            0.0436       1.1100
                    Factor4  |      0.02931      0.07905            0.0056       1.1156
                    Factor5  |     -0.04974      0.02505           -0.0095       1.1061
                    Factor6  |     -0.07480      0.01348           -0.0142       1.0919
                    Factor7  |     -0.08827      0.02368           -0.0168       1.0751
                    Factor8  |     -0.11195      0.01135           -0.0213       1.0538
                    Factor9  |     -0.12330      0.03645           -0.0234       1.0304
                   Factor10  |     -0.15975            .           -0.0304       1.0000
                --------------------------------------------------------------------------
                LR test: independent vs. saturated:  chi2(45) = 4.6e+05 Prob>chi2 = 0.0000
            
            Factor loadings (pattern matrix) and unique variances
            
                ---------------------------------------------------------------------
                    Variable |  Factor1   Factor2   Factor3   Factor4 |   Uniqueness 
                -------------+----------------------------------------+--------------
                       Trust |   0.7706   -0.3361   -0.0001   -0.0402 |      0.2916  
                     Concern |   0.7722   -0.3504   -0.0379   -0.0249 |      0.2788  
                     Support |   0.7863   -0.2063    0.0180    0.0048 |      0.3388  
                     Prostds |   0.6260    0.1464    0.2671   -0.0508 |      0.5128  
                     Usedata |   0.6259    0.2075    0.2566   -0.0242 |      0.4988  
                     Tchperf |   0.7456    0.2072   -0.1764   -0.0400 |      0.3684  
                    Feedback |   0.7662    0.1855   -0.1015   -0.0062 |      0.3682  
                  EvalConsis |   0.7347    0.2222   -0.2138   -0.0114 |      0.3651  
                SIPeffective |   0.6182    0.0616    0.0416    0.1140 |      0.5993  
                  RewrdRecog |   0.6906   -0.0400    0.0407    0.0955 |      0.5106  
                ---------------------------------------------------------------------
            
            . rotate, blanks(.4)
            
            Factor analysis/correlation                      Number of obs    =     83,140
                Method: principal factors                    Retained factors =          4
                Rotation: orthogonal varimax (Kaiser off)    Number of params =         34
            
                --------------------------------------------------------------------------
                     Factor  |     Variance   Difference        Proportion   Cumulative
                -------------+------------------------------------------------------------
                    Factor1  |      2.67942      0.55783            0.5094       0.5094
                    Factor2  |      2.12159      1.11350            0.4034       0.9128
                    Factor3  |      1.00808      0.94949            0.1917       1.1044
                    Factor4  |      0.05859            .            0.0111       1.1156
                --------------------------------------------------------------------------
                LR test: independent vs. saturated:  chi2(45) = 4.6e+05 Prob>chi2 = 0.0000
            
            Rotated factor loadings (pattern matrix) and unique variances
            
                ---------------------------------------------------------------------
                    Variable |  Factor1   Factor2   Factor3   Factor4 |   Uniqueness 
                -------------+----------------------------------------+--------------
                       Trust |   0.7752                               |      0.2916  
                     Concern |   0.7866                               |      0.2788  
                     Support |   0.6901                               |      0.3388  
                     Prostds |                       0.5264           |      0.5128  
                     Usedata |                       0.5376           |      0.4988  
                     Tchperf |             0.6699                     |      0.3684  
                    Feedback |             0.6285                     |      0.3682  
                  EvalConsis |             0.6911                     |      0.3651  
                SIPeffective |                                        |      0.5993  
                  RewrdRecog |   0.5001                               |      0.5106  
                ---------------------------------------------------------------------
                (blanks represent abs(loading)<.4)
            
            Factor rotation matrix
            
                --------------------------------------------------
                             | Factor1  Factor2  Factor3  Factor4 
                -------------+------------------------------------
                     Factor1 |  0.6880   0.6071   0.3904   0.0750 
                     Factor2 | -0.7247   0.5864   0.3614   0.0189 
                     Factor3 | -0.0113  -0.5353   0.8433   0.0463 
                     Factor4 | -0.0375  -0.0320  -0.0754   0.9959 
                --------------------------------------------------
            So what I think I see is that the first three questions in the section account for nearly all the variance. The rotation seems to me to confirm this when I blank out loadings below .4. I did not use pcf since the uniqueness was not zero. I did run this with ipf and the results were nearly identical. I realize I could run a CFA using SEM, but that would not naturally have given me the output above.

            Again thanks for the help and support.

            Ted


            • #7
              Ted,

              You're not wrong to check this. Just because someone wrote up a survey and thought it was unidimensional, that doesn't mean it always will be (or that it was to begin with).

              However, your factor analysis is actually consistent with one dominant factor. Going back to your factor extraction, your first factor accounts for 97.6% of the variance. The next 3 each account for single-digit percentages. When you asked Stata to rotate the factors, it rotated all 4 retained factors, which are the ones with positive eigenvalues. That's fine, but again, 3 of those factors accounted for very little variance. Also, in the unrotated solution, you can see that all questions have acceptable loadings on factor 1. There appears to be some cross loading (the first 3 items load between about -0.2 and -0.35 on factor 2), but this is OK.
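
              To see where the 97.6% comes from, here is a small check using the eigenvalues copied from the unrotated output in #6. With principal factors, the proportions in that table work out to each eigenvalue divided by the sum of all ten eigenvalues, negative ones included, which is why the cumulative column climbs above 1 before ending at 1.0000. The names ev, ev2 and sum_ev are just for illustration.

              ```stata
              * Eigenvalues copied from the unrotated factor output in #6
              matrix ev  = (5.13396, 0.47492, 0.22949, 0.02931, -0.04974)
              matrix ev2 = (-0.07480, -0.08827, -0.11195, -0.12330, -0.15975)
              matrix ev  = ev, ev2

              * Sum all ten eigenvalues, negative ones included
              scalar sum_ev = 0
              forvalues j = 1/10 {
                  scalar sum_ev = sum_ev + ev[1,`j']
              }

              * Reproduce the Proportion and Cumulative columns for the retained factors
              display "Sum of eigenvalues:     " %8.5f sum_ev
              display "Factor1 proportion:     " %6.4f ev[1,1]/sum_ev
              display "Factors 1-4 cumulative: " %6.4f (ev[1,1]+ev[1,2]+ev[1,3]+ev[1,4])/sum_ev
              ```

              The first ratio matches the 0.9761 reported for Factor1, and the last matches the 1.1156 cumulative value for the four retained factors.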

              If you want to improve your factor analysis interpretation skills, note that I've usually used oblique rotation, not orthogonal, because many domains of a survey tend to be substantially correlated. Also, most survey questions are Likert items, whereas factor analysis assumes that the variables are multivariate normally distributed. Try reading this page for an alternative. You'll have to download a user-written program, but that is simple to do.
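
              As a rough sketch of what an oblique rotation looks like in practice, the following uses simulated data with two correlated factors. Everything here (the variables f1, f2, a1-a3, b1-b3, the seed and the sample size) is made up for illustration and is not your survey data; promax is just one of several oblique rotations that rotate offers.

              ```stata
              * Simulated data for illustration only: two correlated latent factors
              clear
              set seed 12345
              set obs 2000
              generate f1 = rnormal()
              generate f2 = 0.5*f1 + rnormal()
              forvalues i = 1/3 {
                  generate a`i' = f1 + rnormal()
                  generate b`i' = f2 + rnormal()
              }

              * Principal factors, keeping two factors
              factor a1-a3 b1-b3, pf factors(2)

              * Oblique (promax) rotation, hiding loadings below .4 in absolute value
              rotate, promax blanks(.4)

              * Correlation matrix of the rotated common factors
              estat common
              ```

              If estat common shows a sizeable correlation between the rotated factors, that is a sign an orthogonal rotation such as varimax may be forcing an unrealistic structure.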

              • #8
                Wiewen,

                Thanks for the help and insight. I will use the polychoric ado. I did see that the first factor accounted for 97.6% of the variance, but that seemed high to me and I wanted to dive a little deeper into factor analysis. Again, I really appreciate your time.
