Non-normality of a variable in factor analysis

Giorgio Piccitto

Join Date: Oct 2016

Posts: 238
#1

16 Jun 2017, 06:18

Dear Stata users,

I have a question about how to handle one variable.

I need to use a continuous variable (ranging from 0 to 100) in a factor analysis. The problem is that it is far from normally distributed: more than half of the observations take the maximum value (100), while the remaining observations vary below that. Is there a way to transform this variable so that it is better suited to a factor analysis? Do you have any suggestions?

Thanks a lot, best, G,
Bruce Weaver

Join Date: May 2014

Posts: 1139
#2

16 Jun 2017, 06:35

Hello Giorgio. Based on your description, this might be a rare exceptional case where it makes some sense to carve your continuous variable into 2 or more categories (e.g., 100 vs < 100). If there is a reasonable and defensible way to do that, then you could follow the advice on the UCLA page linked below on how to carry out the factor analysis. HTH.
https://stats.idre.ucla.edu/stata/fa...ous-variables/

--
Bruce Weaver
Email: [email protected]
Version: Stata/MP 19.5 (Windows)
Comment
Giorgio Piccitto

Join Date: Oct 2016

Posts: 238
#3

16 Jun 2017, 06:46

Dear Bruce,
thanks a lot for your response. Yes, what you propose should work, but I would prefer not to lose all the information that would be discarded if I dichotomized my variable.
Do you think there is some other way to deal with this problem without a substantial loss of information?

Thanks a lot, G,
Comment
Bruce Weaver

Join Date: May 2014

Posts: 1139
#4

16 Jun 2017, 07:07

Hello Giorgio. I too am loath to categorize a continuous variable under most circumstances. But for the situation you describe, I can't really think of any other option. Depending on sample size and how the cases with < 100 are distributed, you could have more than 2 categories, which would help a bit in preventing loss of information. HTH.
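
To make the idea of "more than 2 categories" concrete, here is a minimal sketch in Python rather than Stata. It is not taken from the UCLA page; the function name `categorize`, the cut points 50 and 100, and the sample scores are all hypothetical and would need to be chosen defensibly for the real data.

```python
from bisect import bisect_right
from collections import Counter


def categorize(score, cuts=(50, 100)):
    """Return an ordered category code for a 0-100 score.

    With the hypothetical cuts (50, 100):
    0 = below 50, 1 = 50 up to (but not including) 100, 2 = exactly 100.
    With cuts=(100,) this reduces to the dichotomy "< 100" vs "100".
    """
    return bisect_right(cuts, score)


# Made-up scores: more than half sit at the ceiling of 100.
scores = [100, 100, 100, 100, 100, 100, 40, 60, 70, 85, 95]

three_way = [categorize(s) for s in scores]
two_way = [categorize(s, cuts=(100,)) for s in scores]

# Tabulate how many observations land in each category.
print(Counter(three_way))
print(Counter(two_way))
```

The resulting codes are ordinal, so they would be analysed with methods meant for categorical indicators rather than treated as a continuous input.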

--
Bruce Weaver
Email: [email protected]
Version: Stata/MP 19.5 (Windows)
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35758
#5

16 Jun 2017, 07:11

I don't think it is especially important that marginal distributions of inputs into factor analysis be normal.

On one hand, transformation can only map a spike in a distribution to the same spike relabelled.

On the other hand, a transformation like squaring or cubing (outcome/100) will reduce left skewness here.
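
Both points can be checked on made-up numbers. The sketch below is in Python rather than Stata and assumes a hypothetical sample of 11 scores with 6 at the ceiling; `skewness` is the simple moment-based coefficient (third central moment over the 1.5 power of the second), and `share_at_max` is an illustrative helper, not a library function.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def share_at_max(values):
    """Fraction of observations equal to the largest value (the spike)."""
    top = max(values)
    return sum(1 for v in values if v == top) / len(values)


# Hypothetical data: 6 of 11 observations at the maximum of 100.
scores = [100] * 6 + [40, 60, 70, 85, 95]

# Cube the rescaled variable, as suggested above: (outcome/100)^3.
cubed = [(s / 100) ** 3 for s in scores]

# Skewness is negative in both cases but closer to zero after cubing.
print("skewness raw:  ", round(skewness(scores), 3))
print("skewness cubed:", round(skewness(cubed), 3))

# The spike is only relabelled (100 becomes 1); its share does not change.
print("share at max raw:  ", share_at_max(scores))
print("share at max cubed:", share_at_max(cubed))
```

On this sample the cubed variable is less left-skewed, while the proportion of observations at the maximum is exactly what it was before. Other data could behave differently in degree, but no monotone transformation can spread out the spike.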
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17728
#6

16 Jun 2017, 07:15

Giorgio:
the metric of your variable seems to have a ceiling effect (https://en.wikipedia.org/wiki/Ceilin...ct_(statistics)).
You can try to divide it into more categories (as per Bruce's suggestion, even though we all agree that categorizing a continuous variable is not, in general, a good approach), or admit that your sample is peculiar in some respects, or else conclude that your variable needs re-validation.

PS: my reply crossed in cyberspace with Bruce's and Nick's interesting contributions.

Last edited by Carlo Lazzaro; 16 Jun 2017, 07:18.

Kind regards,
Carlo
(Stata 19.0)
Comment
Giorgio Piccitto

Join Date: Oct 2016

Posts: 238
#7

16 Jun 2017, 07:27

Dear Bruce,
thanks a lot. I will think about it and let you know.

For now, many thanks.
Best, G.
Comment
Giorgio Piccitto

Join Date: Oct 2016

Posts: 238
#8

16 Jun 2017, 08:06

Many thanks to all of you for your nice suggestions.

I will think of the most suitable solution for my problem.

Thanks again!
Comment
