CFA with non-normality variables

Hao Tao

Join Date: Apr 2015

Posts: 3
#1

14 May 2015, 15:56

Dear Statalist members,

I am doing a CFA with 36 variables: 31 are 4-point Likert-scale items and the other 5 are 5-point Likert-scale items. These variables violate the univariate, bivariate and multivariate normality assumptions (specifically, the kurtosis ranges from 2 to 5.8, centring around 3). I am using Stata 13.1. How should I perform the CFA? By the way, if I do an EFA, can I use extraction methods other than ipf? I really appreciate your kind help!

Haotao
Joseph Coveney

Join Date: Apr 2014

Posts: 4440
#2

14 May 2015, 17:46

You might want to look into the Stata command gsem. It has the ability to fit confirmatory factor analysis (CFA) models with ordered categorical indicator variables.
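
As a minimal sketch of what such a model could look like, an ordered-logit CFA in gsem might be specified as below. The item names q1-q8 and the two-factor layout are placeholders for illustration, not the actual variables or structure from the question.

```stata
* Hypothetical items q1-q8 loading on two latent factors.
* Latent variable names must begin with a capital letter.
* The ologit option treats each indicator as ordered categorical.
* The /// line continuation works in a do-file, not at the command line.
gsem (F1 -> q1 q2 q3 q4, ologit) ///
     (F2 -> q5 q6 q7 q8, ologit)
```

oprobit can be substituted for ologit if a probit link is preferred.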

I'm not sure why you would want to perform exploratory factor analysis (EFA) if you've already got a factor structure that you're ready to assess with CFA. Anyway, although I've never tried it, I believe that you could also perform an EFA using gsem by saturating your model (loading all of the indicators on an ascending tally of factors, arranged all at the first level).

In the past (in the days before gsem), I've tried doing EFA on a long variable list such as yours (k = 36) that was a mix of continuous and ordered categorical indicator variables. I first formed a polychoric correlation matrix using the user-written command polychoric (type search polychoric at Stata's command line) and then forwarded the returned matrix to official Stata's factormat. I recall having real difficulties with nonpositive-definite correlation matrices. You've got a similarly long list of ordered-categorical (nonnormal) indicator variables, so it's possible that you will run into the same problem. Does prior knowledge or theory help you whittle that list down a little?
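
The two-step approach described above could be sketched roughly as follows. This assumes the 36 items are stored contiguously under the placeholder names q1 through q36, and the choice of three factors is arbitrary and only for illustration.

```stata
* polychoric is user-written; install it first (search polychoric).
* q1-q36 are placeholder names for the 36 items.
polychoric q1-q36
matrix R = r(R)              // save the matrix before r() is overwritten

* factormat needs a sample size. count gives the total number of
* observations, which is only an approximation if items have missing values.
quietly count
local N = r(N)

* ipf = iterated principal factors; pf, pcf and ml are the other
* extraction methods that factormat accepts.
factormat R, n(`N') factors(3) ipf
```

If the polychoric matrix turns out not to be positive definite, the factormat step is where the trouble is likely to show up.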

Last edited by Joseph Coveney; 14 May 2015, 17:56.
wbuchanan

Join Date: Mar 2014

Posts: 1362
#3

15 May 2015, 04:07

If you're working with Stata 13 and your data are measured on an ordinal scale, you should be able to use the gsem command to model the data more consistently. You'll have to specify the family and link functions, but I would approach it with gsem; if you have a sufficiently large set of observations, you could also fit the model with sem using the asymptotic distribution-free (ADF) estimator.
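
A rough sketch of the two options, using a single hypothetical factor F1 with placeholder items q1-q4 (the real model would list all indicators and factors):

```stata
* Option 1: gsem with an explicit family and link (ordinal logit here).
gsem (F1 -> q1 q2 q3 q4, family(ordinal) link(logit))

* Option 2: sem with the asymptotic distribution-free estimator.
* Note that sem treats the indicators as continuous.
sem (F1 -> q1 q2 q3 q4), method(adf)
```

link(probit) is the other common choice for the ordinal family in the first option.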
Hao Tao

Join Date: Apr 2015

Posts: 3
#4

15 May 2015, 10:15

Stata folks, thank you for your kind help! I have a few more questions and would welcome your comments and suggestions.

I have only 635 observations. In this case, can I use the asymptotic distribution-free estimator? If not, what is the alternative?

I also want to examine differences across groups. It seems gsem does not allow such a comparison. Is there any solution?

Another question: when I run estat mindices, Stata does not report any results; each column is empty except the degrees of freedom (all 1s). The EFA result indicates there should be 6 latent variables. (I was told to run an EFA first to get the factor structure and then do a CFA to confirm it. According to my theory, there should be 3 dimensions/factors, but the EFA tells me there should be 6 instead: 2 factors for the 1st dimension, 1 for the 2nd dimension, and 3 for the 3rd dimension.) However, if I reduce the number of factors to three, at least, Stata does give me some results. Why would this happen? Is it because my model does not fit the data well, or something else? Is it very common?

I highly appreciate your kind help.

Haotao