  • Non-normality of a variable in factor analysis

    Dear Stata users,

    I am unsure how to handle one of my variables.

    I need to use a continuous variable (ranging from 0 to 100) in a factor analysis. The problem is that this variable is not at all normally distributed: more than half of the observations have the maximum value (100), while the remaining observations vary below it. Is there any way to transform this variable so that it can be used more appropriately in a factor analysis? Do you have any suggestions?

    Thanks a lot. Best, G.

  • #2
    Hello Giorgio. Based on your description, this might be one of the rare cases where it makes some sense to carve your continuous variable into 2 or more categories (e.g., 100 vs < 100). If there is a reasonable and defensible way to do that, then you could follow the advice on the UCLA page linked below on how to carry out the factor analysis. HTH.
    --
    Bruce Weaver
    Email: [email protected]
    Version: Stata/MP 19.5 (Windows)

    • #3
      Dear Bruce,
      thanks a lot for your response. Yes, what you propose should work, but I would prefer not to lose all the information that would be discarded if I dichotomized my variable.
      Do you think there is some other way to deal with this problem without incurring a substantial loss of information?

      Thanks a lot, G.

      • #4
        Hello Giorgio. I too am loath to categorize a continuous variable under most circumstances. But for the situation you describe, I can't really think of any other option. Depending on sample size and how the cases with < 100 are distributed, you could have more than 2 categories, which would help a bit in preventing loss of information. HTH.
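
As a rough illustration of the binning idea (in Python, purely to show the logic), the sketch below recodes a 0-100 score into 2 or 3 ordered categories and counts the cases in each. The scores and the cut points 75 and 100 are made up for the example and are not taken from Giorgio's data; `categorize` is a hypothetical helper.

```python
from collections import Counter

def categorize(score, cutpoints):
    """Return an ordinal category: the number of cut points the score reaches."""
    return sum(score >= c for c in cutpoints)

# Made-up scores: six cases at the ceiling (100), four below it.
scores = [100, 100, 100, 100, 100, 100, 35, 62, 88, 97]

# Two categories: 0 = below 100, 1 = exactly 100.
two_level = [categorize(s, [100]) for s in scores]

# Three categories (hypothetical cut point at 75): 0 = below 75, 1 = 75 to 99, 2 = 100.
three_level = [categorize(s, [75, 100]) for s in scores]

# Category counts, to check that no category ends up too sparse.
counts_two = Counter(two_level)
counts_three = Counter(three_level)
print(counts_two, counts_three)
```

Looking at the category counts before settling on the cut points is one way to judge whether the sample size supports more than 2 categories.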
        --
        Bruce Weaver
        Email: [email protected]
        Version: Stata/MP 19.5 (Windows)

        • #5
          I don't think it is especially important that marginal distributions of inputs into factor analysis be normal.

          On one hand, transformation can only map a spike in a distribution to the same spike relabelled.

          On the other hand, a transformation like squaring or cubing (outcome/100) will reduce left skewness here.
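
Both points can be checked on a small made-up sample (a Python sketch, used only to verify the arithmetic). It assumes 60 cases at the ceiling of 100 and 40 cases spread evenly between 0 and 97.5; the data and the helper names `skewness` and `share_at_max` are illustrative, and results will differ for other distributions of the sub-100 values.

```python
def skewness(values):
    """Moment-based skewness: third central moment over variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def share_at_max(values):
    """Proportion of cases sitting exactly at the largest observed value."""
    top = max(values)
    return sum(1 for v in values if v == top) / len(values)

# Made-up sample: 60 cases at 100, 40 cases at 0, 2.5, ..., 97.5.
outcome = [100.0] * 60 + [2.5 * i for i in range(40)]

# The transformation mentioned above: cube of (outcome / 100).
cubed = [(v / 100) ** 3 for v in outcome]

skew_raw = skewness(outcome)
skew_cubed = skewness(cubed)
spike_raw = share_at_max(outcome)
spike_cubed = share_at_max(cubed)

print(skew_raw, skew_cubed)    # both negative; the cubed value is closer to zero
print(spike_raw, spike_cubed)  # the spike keeps the same share of cases
```

In this sample the cubing pulls the skewness toward zero, while the share of cases at the maximum is unchanged: the spike is only relabelled from 100 to 1.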

          • #6
            Giorgio:
            the metric of your variable seems to have a ceiling effect (https://en.wikipedia.org/wiki/Ceilin...ct_(statistics)).
            You can try to divide it into more categories (as per Bruce's suggestion, even though we all agree that categorizing a continuous variable is not, in general, a good approach), admit that your sample is peculiar in some respects, or conclude that your variable needs re-validation.

            PS: my post crossed in cyberspace with Bruce's and Nick's interesting contributions.
            Last edited by Carlo Lazzaro; 16 Jun 2017, 07:18.
            Kind regards,
            Carlo
            (Stata 19.0)

            • #7
              Dear Bruce,
              thanks a lot, I will think about it and I will let you know.

              So far, many many thanks.
              Best, G.

              • #8
                Many thanks to all of you for your nice suggestions.

                I will think of the most suitable solution for my problem.

                Thanks again!
