  • Principal Component Analysis selecting country and questions under conditions

    Hello dear Statalisters,

    I am working with the World Values Survey, which has several questions for different countries.

    What I need to do is perform a principal component analysis (PCA) separately for each country: first combining all respondents for one country, say Australia, then for another country such as Austria, and so on. Each PCA should use only three questions, V86, V87 and V88, and should include only observations with an answer of "1" on at least one of the three questions (the scale runs from -2 to 4). I then need to count the number of activities that respondents had actually done from among the three questions. One person could answer 1 on one, two or all three questions.

    The code that I was testing is:
    Code:
    pca V86 V87 V88 if country==36 | V86==1 | V87==1 | V88==1, components(1)
    but when I look at the results, I see that the analysis used observations from all the countries, not only the specific country that I need.

    I would really appreciate any help with the code.

    Best regards,

    Alejandro

  • #2
    You're mixing up & and | operators.
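
    The difference between the two operators can be checked on a small example. The sketch below is only an illustration: the toy data are made up, and it assumes that country code 36 is the country of interest, as in the code in #1. With | throughout, an observation qualifies if it is from country 36 or has a 1 on any question, whatever its country. Joining the country condition with & and putting the three alternatives in parentheses restricts the sample to country 36.

```stata
* Toy data standing in for the survey extract (values are made up)
clear
input country V86 V87 V88
36 1 2 3
36 2 2 3
36 1 1 2
36 3 1 1
36 2 3 1
36 1 3 2
40 1 2 2
40 3 3 3
end

* Condition from #1: country 36 OR a 1 on any question
count if country==36 | V86==1 | V87==1 | V88==1

* Intended condition: country 36 AND a 1 on at least one question
count if country==36 & (V86==1 | V87==1 | V88==1)

* PCA restricted to the intended sample
pca V86 V87 V88 if country==36 & (V86==1 | V87==1 | V88==1), components(1)
```

    Comparing the two counts shows how many observations from other countries the first condition lets in.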

    The condition that at least one answer was given won't be needed as PCA needs non-missing values on all variables used. Conversely, you'll need imputation first for that strategy to make sense.

    Further, if V86 V87 V88 contain grades -2 to 4 then 1 is a particular grade and not whether a question was answered.

    The benefits of PCA here seem negative: different PCAs for each country and based on just 3 variables? Why not just use the data as they come? If you want to summarize answers for three questions, use the mean or median or something else.
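
    A minimal sketch of the mean or median summary, on made-up data. The variable names mean_act and med_act are hypothetical, and the sketch assumes the codes can be read as numbers, which holds only once any special codes have been dealt with.

```stata
* Toy data (values are made up)
clear
input country V86 V87 V88
36 1 2 3
36 2 2 3
40 1 1 2
end

* Row-wise summaries across the three questions
egen mean_act = rowmean(V86 V87 V88)
egen med_act  = rowmedian(V86 V87 V88)

* Average of each summary within country
tabstat mean_act med_act, by(country) statistics(mean)
```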


    • #3
      Dear Nick,
      Thank you very much for your answer.
      About the conditions: I am combining the operators because I am trying to replicate Dalton, R., Van Sickle, A., & Weldon, S. (2010). The individual–institutional nexus of protest behaviour. British Journal of Political Science, 40(1), 51-73. They used PCA and calculated a mean for each country. I proposed working with a mean, but my supervisor asked me to replicate the paper. The paper says that "We counted the number of activities that respondents had actually done", and the code for "actually done" is 1. The negative codes are "No answer" or "Don't know"; the other codes are "Would never do" and "Might do".

      Those negative codes are giving me problems too: when I run the code I sometimes get negative means, so I am really lost. I need to isolate the country, keep only observations where at least one answer was 1, and fix the negative means too.

      So, Nick, if you have any further suggestions, they will be sincerely appreciated.
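
      One possible way to handle both points, sketched on made-up data: set the negative codes to missing so they cannot pull a mean below zero, then count the answers of 1 per respondent. The variable name n_done is hypothetical, and treating every negative code as missing is an assumption based on the description above, not something taken from the paper.

```stata
* Toy data (values are made up); negative codes stand for
* "No answer" / "Don't know"
clear
input country V86 V87 V88
36  1  2  3
36 -1  1  1
36  2  3 -2
36  1  1  1
40  1 -1  2
end

* Assumption: every negative code is a non-response, so set it to missing
foreach v of varlist V86 V87 V88 {
    replace `v' = . if `v' < 0
}

* Number of the three questions answered 1 ("actually done")
egen n_done = anycount(V86 V87 V88), values(1)

* Respondents in country 36 with at least one activity done
tabulate n_done if country==36 & n_done >= 1

* Mean count per country
tabstat n_done, by(country) statistics(mean n)
```

      Because n_done only counts answers of 1, it can never be negative, and neither can its mean.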

      Best regards,
      Alejandro.

      Last edited by Alejandro Torres; 05 Aug 2019, 10:44.


      • #4
        I'd have the same reaction to the paper authors if they posted here about the use of PCA. If your criteria are being in a particular country and that at least one question has an answer of 1 then PCA is still going to treat any other answers as if they were integers to be taken literally.


        • #5
          Thank you very much again Nick, let's see what is going to happen.
          Best,
          Alejandro
