Dear Stata users,
I have an unbalanced panel data set on six World Bank governance indicators. Their theoretical range is from 0 to 100.
I would like to create a "Governance quality" index from these six indicators, for visualization purposes, using factor analysis. The pairwise correlations between the indicators range from 0.66 to 0.94.
This is the code I use to perform the analysis and to create the index.
And here is what I would like some advice on. Based on my understanding of the process and the results, creating one index from the six indicators, rather than two or more indices, makes sense in this case. Only one factor has an eigenvalue above 1, it explains 98.7 percent of the variance (Proportion = 0.9866 in the output), and according to the factor loadings it is highly correlated with each of the six indicators. Is this logic reasonable? Are there any other checks I need to perform to be reasonably confident that one factor is sufficient with these data?
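A note on reading the 98.7 percent figure: the Proportion column in the factor output is each eigenvalue divided by the sum of all six eigenvalues, and with the principal factor method some of those eigenvalues are negative, which is why the Cumulative column passes 1 before returning to 1.0000. The sketch below only redoes that arithmetic with the eigenvalues copied from the output in this post, and for comparison divides the first eigenvalue by 6, the total variance of six standardized indicators. The scalar names are hypothetical and nothing here touches the data.

```stata
* Eigenvalues copied by hand from the factor output in this post
scalar e1   = 4.95288
scalar esum = 4.95288 + 0.16927 + 0.02333 - 0.03072 - 0.03549 - 0.05902

* Proportion column: first eigenvalue over the sum of all eigenvalues
display "share of summed eigenvalues: " %6.4f e1/esum

* For comparison: first eigenvalue over the number of indicators (6)
display "share of total variance:     " %6.4f e1/6
```

The first line should match the reported 0.9866; the second comes out near 0.83, which equals the sum of the squared loadings divided by 6.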
I am using Stata/SE 13.1 on Windows 10 x64.
Thank you!
Code:
. su

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
     country |         0
        year |      4066    2007.684    5.992421       1996       2017
       voice |      3936    50.01535    29.02587          0        100
   stability |      3901    50.03778    29.05658          0        100
      effect |      3889    50.02948    29.02575          0        100
-------------+--------------------------------------------------------
        regq |      3889      50.025    29.03567          0        100
         law |      3961    50.03459    29.04241          0        100
  corruption |      3903    50.04943    29.06459          0        100
Code:
. corr voice stability effect regq law corruption
(obs=3828)

             |    voice stabil~y   effect     regq      law corrup~n
-------------+------------------------------------------------------
       voice |   1.0000
   stability |   0.7117   1.0000
      effect |   0.7639   0.7078   1.0000
        regq |   0.7724   0.6572   0.9307   1.0000
         law |   0.8344   0.7971   0.9249   0.8952   1.0000
  corruption |   0.7934   0.7776   0.9105   0.8563   0.9379   1.0000
Code:
. factor voice stability effect regq law corruption, factors(1)
(obs=3828)

Factor analysis/correlation                    Number of obs    =     3828
    Method: principal factors                  Retained factors =        1
    Rotation: (unrotated)                      Number of params =        6

    --------------------------------------------------------------------------
         Factor  |   Eigenvalue   Difference        Proportion   Cumulative
    -------------+------------------------------------------------------------
        Factor1  |      4.95288      4.78361            0.9866       0.9866
        Factor2  |      0.16927      0.14594            0.0337       1.0203
        Factor3  |      0.02333      0.05405            0.0046       1.0249
        Factor4  |     -0.03072      0.00477           -0.0061       1.0188
        Factor5  |     -0.03549      0.02353           -0.0071       1.0118
        Factor6  |     -0.05902            .           -0.0118       1.0000
    --------------------------------------------------------------------------
    LR test: independent vs. saturated:  chi2(15) = 3.3e+04 Prob>chi2 = 0.0000

Factor loadings (pattern matrix) and unique variances

    ---------------------------------------
        Variable |  Factor1 |   Uniqueness
    -------------+----------+--------------
           voice |   0.8428 |      0.2896
       stability |   0.7933 |      0.3706
          effect |   0.9506 |      0.0964
            regq |   0.9208 |      0.1521
             law |   0.9782 |      0.0432
      corruption |   0.9512 |      0.0952
    ---------------------------------------

. rotate

Factor analysis/correlation                    Number of obs    =     3828
    Method: principal factors                  Retained factors =        1
    Rotation: orthogonal varimax (Kaiser off)  Number of params =        6

    --------------------------------------------------------------------------
         Factor  |     Variance   Difference        Proportion   Cumulative
    -------------+------------------------------------------------------------
        Factor1  |      4.95288            .            0.9866       0.9866
    --------------------------------------------------------------------------
    LR test: independent vs. saturated:  chi2(15) = 3.3e+04 Prob>chi2 = 0.0000

Rotated factor loadings (pattern matrix) and unique variances

    ---------------------------------------
        Variable |  Factor1 |   Uniqueness
    -------------+----------+--------------
           voice |   0.8428 |      0.2896
       stability |   0.7933 |      0.3706
          effect |   0.9506 |      0.0964
            regq |   0.9208 |      0.1521
             law |   0.9782 |      0.0432
      corruption |   0.9512 |      0.0952
    ---------------------------------------

Factor rotation matrix

    -----------------------
                 | Factor1
    -------------+---------
         Factor1 |  1.0000
    -----------------------

. predict index
(regression scoring assumed)

Scoring coefficients (method = regression; based on varimax rotated factors)

    ------------------------
        Variable |  Factor1
    -------------+----------
           voice |  0.06952
       stability |  0.04774
          effect |  0.19615
            regq |  0.11562
             law |  0.43183
      corruption |  0.17631
    ------------------------

. su index

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       index |      3828    1.44e-10    .9896982  -1.763607   1.790659
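For anyone following along, here is a minimal sketch of checks that are often run alongside the eigenvalue-above-1 rule when asking whether one factor is enough. It assumes the same six variable names as above and treats the panel as pooled observations, as the analysis in this post does. None of these commands were run for this post, so no output is shown, and the list is a starting point, not a complete procedure.

```stata
* Sketch only: variable names are taken from the post; nothing here was run on the data.

* 1. Refit by principal factors without fixing the number of factors,
*    then plot the eigenvalues to look for an elbow after the first one
factor voice stability effect regq law corruption, pf
screeplot

* 2. Kaiser-Meyer-Olkin measure of sampling adequacy for the last factor fit
estat kmo

* 3. Maximum-likelihood factoring with one factor; the output includes
*    an LR test of the one-factor model against the saturated model
factor voice stability effect regq law corruption, ml factors(1)

* 4. Cronbach's alpha, with item-rest correlations for each indicator
alpha voice stability effect regq law corruption, item
```

Note that the `rotate` step in the analysis above cannot change anything when only one factor is retained, which the rotation matrix of 1.0000 confirms.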