
  • Using PCA to make an index for a new variable

    Hello,

    My supervisor told me to make a PCA index for Hofstede's six cultural dimensions (pdi, idv, mas, uai, ltowvs and ivr). In Stata I typed:
    pca pdi idv mas uai ltowvs ivr

    and got this as the result (see picture). But now I do not know whether I have to use all six components or just one when using the predict command. Can somebody help me?





  • #2
    I am not responsible to your supervisor, so I can opine that this is a dubious strategy. Having got 6 dimensions you can interpret, you now want to mush them together. That's a bit like taking what you have in your refrigerator and putting it all in a blender.

    Sharp prejudices aside,

    Code:
    predict wanted
    will calculate values (scores) for the first PC. Under the terms of the game, the first PC is by definition the best one to serve as a summary. You can't improve on it by mushing in bits of other PCs.
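
    As a minimal sketch of the full sequence (the name pc1 is hypothetical, and this assumes the six variables from #1 are in memory): after pca, predict computes scores by default, and supplying a single new variable name gives scores for the first component only.

    ```stata
    * PCA of the six dimensions; by default pca works from the correlation matrix
    pca pdi idv mas uai ltowvs ivr

    * One new variable name => scores for the first component only
    * (score is the default statistic; it is spelled out here for clarity)
    predict pc1, score

    * Quick look at the resulting index
    summarize pc1
    ```

    Supplying six new variable names to predict would instead return scores for all six components, which is not what an index calls for.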

    That aside, the loadings suggest that

    (pdi + uai + ltowvs) - (idv + mas + ivr)

    might serve just as well as a summary: three variables load positively on PC1 and three negatively and the loadings are roughly of equal size. So a simple average respecting signs is implied. (Making it an average would just be arbitrary scaling.)
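
    One way to check that suggestion against the data, sketched under assumptions: the names z_*, simple_index and pc1_check are hypothetical, and the variables are standardized first only because pca works from the correlation matrix by default. If the six dimensions really are on similar scales, the raw variables could be used instead.

    ```stata
    * Standardize each variable (mean 0, sd 1); the z_* names are hypothetical
    foreach v of varlist pdi idv mas uai ltowvs ivr {
        egen z_`v' = std(`v')
    }

    * Signed sum following the pattern of loadings described above
    generate simple_index = (z_pdi + z_uai + z_ltowvs) - (z_idv + z_mas + z_ivr)

    * First-component scores for comparison
    pca pdi idv mas uai ltowvs ivr
    predict pc1_check

    * How closely does the simple index track the first component?
    correlate pc1_check simple_index
    ```

    A correlation close to 1 in absolute value would suggest that the simple index carries much the same information as the first PC. The sign of a PC is arbitrary, so a strongly negative correlation would say the same thing.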

    Your question seems to imply that you haven't really read up on PCA, a technique I find interesting but rarely useful. I have never found a book I like on it as all authors of PCA texts are uncritically positive about it (you would hardly write one otherwise).

    But https://www.cambridge.org/core/books...303427732D6ABD is tough but rewarding and strikes an appropriately critical tone.

    (EDIT: I am assuming that the 6 dimensions are measured on similar scales.)
    Last edited by Nick Cox; 17 May 2017, 03:37.



    • #3
      I'll add a bit to what Nick said.

      In some cases, mushing it all together may not be the craziest thing to do. Health-related quality of life (HRQoL) is measured on multiple dimensions, e.g. physical health, mental health, social roles impaired due to either. You can imagine that there's one overarching construct, though. If you had an instrument with a few questions per domain, and you took a sum score to be the person's overall HRQoL, that's not crazy.

      For Hofstede's 6 dimensions, what's the overarching construct? I can't imagine one. Did your supervisor tell you what this underlying construct is? I've been asked before to use a statistical technique where I wasn't quite sure what I was doing, but at least my professor knew what she was doing. Problem is, too many people don't quite understand what a technique means or does.
      Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

      When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
