Dear Stata users,
I have an unbalanced panel data set on six World Bank governance indicators. Their theoretical range is from 0 to 100.
I would like to create a "Governance quality" index from these six indicators for visualization purposes using factor analysis. The pairwise correlations between the indicators range from 0.66 to 0.94.
This is the code I use to perform the analysis and to create the index.
And here is what I would like some advice on. Based on my understanding of the process and results, creating one index from the six indicators, rather than two or more indices, makes sense in this case. Only one factor has an eigenvalue above 1, it explains 98.7 percent of the variance, and according to the factor loadings it is highly correlated with each of the six indicators. Is this logic reasonable? Are there any other checks I need to perform to be reasonably confident that one factor is sufficient with these data?
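For reference, the 98.7 percent figure is the Proportion that Stata reports for Factor1 in the factor output further down. As far as I can tell, with the principal-factors method this is the first eigenvalue divided by the sum of all six reported eigenvalues (some of which are negative), so it reads as a share of the common variance. The second line is my own back-of-the-envelope comparison against the total variance of six standardized indicators, which is an assumption about the right denominator rather than something Stata prints:

```latex
\[
\frac{\lambda_1}{\sum_{j=1}^{6}\lambda_j}
  = \frac{4.95288}{4.95288 + 0.16927 + 0.02333 - 0.03072 - 0.03549 - 0.05902}
  = \frac{4.95288}{5.02025} \approx 0.9866
\]
\[
\frac{\lambda_1}{6} = \frac{4.95288}{6} \approx 0.8255
\]
```

If that reading is right, the single factor accounts for roughly 83 percent of the total variance of the six indicators, which still seems high to me.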
I am using Stata/SE 13.1 on Windows 10 x64.
Thank you!
Code:
. su
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
     country |         0
        year |      4066    2007.684    5.992421       1996       2017
       voice |      3936    50.01535    29.02587          0        100
   stability |      3901    50.03778    29.05658          0        100
      effect |      3889    50.02948    29.02575          0        100
-------------+--------------------------------------------------------
        regq |      3889      50.025    29.03567          0        100
         law |      3961    50.03459    29.04241          0        100
  corruption |      3903    50.04943    29.06459          0        100
Code:
. corr voice stability effect regq law corruption
(obs=3828)
             |    voice stabil~y   effect     regq      law corrup~n
-------------+------------------------------------------------------
       voice |   1.0000
   stability |   0.7117   1.0000
      effect |   0.7639   0.7078   1.0000
        regq |   0.7724   0.6572   0.9307   1.0000
         law |   0.8344   0.7971   0.9249   0.8952   1.0000
  corruption |   0.7934   0.7776   0.9105   0.8563   0.9379   1.0000
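Because the panel is unbalanced, corr drops every observation with a missing value on any of the six indicators (3,828 observations here, against 3,889 to 3,961 per indicator). A minimal sketch of how I could compare this with pairwise correlations and see how many rows are incomplete is below. The variable name nmiss is just a placeholder I made up, and I have not pasted output for these commands:

```stata
* Pairwise correlations, reporting the number of observations behind each pair
pwcorr voice stability effect regq law corruption, obs

* Count missing indicators per observation (nmiss is a hypothetical name)
egen nmiss = rowmiss(voice stability effect regq law corruption)
tabulate nmiss
```

If the pairwise and listwise correlations are close, the dropped observations presumably do not change the picture much.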
Code:
. factor voice stability effect regq law corruption, factors(1)
(obs=3828)
Factor analysis/correlation                      Number of obs    =     3828
Method: principal factors                        Retained factors =        1
Rotation: (unrotated)                            Number of params =        6

--------------------------------------------------------------------------
      Factor |   Eigenvalue   Difference        Proportion   Cumulative
-------------+------------------------------------------------------------
     Factor1 |      4.95288      4.78361            0.9866       0.9866
     Factor2 |      0.16927      0.14594            0.0337       1.0203
     Factor3 |      0.02333      0.05405            0.0046       1.0249
     Factor4 |     -0.03072      0.00477           -0.0061       1.0188
     Factor5 |     -0.03549      0.02353           -0.0071       1.0118
     Factor6 |     -0.05902            .           -0.0118       1.0000
--------------------------------------------------------------------------
LR test: independent vs. saturated: chi2(15) = 3.3e+04 Prob>chi2 = 0.0000

Factor loadings (pattern matrix) and unique variances
---------------------------------------
    Variable |  Factor1 |   Uniqueness
-------------+----------+--------------
       voice |   0.8428 |      0.2896
   stability |   0.7933 |      0.3706
      effect |   0.9506 |      0.0964
        regq |   0.9208 |      0.1521
         law |   0.9782 |      0.0432
  corruption |   0.9512 |      0.0952
---------------------------------------

. rotate
Factor analysis/correlation                      Number of obs    =     3828
Method: principal factors                        Retained factors =        1
Rotation: orthogonal varimax (Kaiser off)        Number of params =        6

--------------------------------------------------------------------------
      Factor |     Variance   Difference        Proportion   Cumulative
-------------+------------------------------------------------------------
     Factor1 |      4.95288            .            0.9866       0.9866
--------------------------------------------------------------------------
LR test: independent vs. saturated: chi2(15) = 3.3e+04 Prob>chi2 = 0.0000

Rotated factor loadings (pattern matrix) and unique variances
---------------------------------------
    Variable |  Factor1 |   Uniqueness
-------------+----------+--------------
       voice |   0.8428 |      0.2896
   stability |   0.7933 |      0.3706
      effect |   0.9506 |      0.0964
        regq |   0.9208 |      0.1521
         law |   0.9782 |      0.0432
  corruption |   0.9512 |      0.0952
---------------------------------------

Factor rotation matrix
-----------------------
             | Factor1
-------------+---------
     Factor1 |  1.0000
-----------------------

. predict index
(regression scoring assumed)
Scoring coefficients (method = regression; based on varimax rotated factors)
------------------------
    Variable |  Factor1
-------------+----------
       voice |  0.06952
   stability |  0.04774
      effect |  0.19615
        regq |  0.11562
         law |  0.43183
  corruption |  0.17631
------------------------

. su index
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       index |      3828    1.44e-10    .9896982  -1.763607   1.790659
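The additional checks I have in mind could be sketched as follows. This is only a sketch using the same six variables. I have not run these commands yet, so no output is shown, and I am not certain these are the right or sufficient diagnostics for deciding on a single factor:

```stata
* Refit without forcing the number of factors, then look at the scree plot
factor voice stability effect regq law corruption, pf
screeplot, yline(1)

* Kaiser-Meyer-Olkin measure of sampling adequacy
estat kmo

* Squared multiple correlation of each indicator with the other five
estat smc

* As I understand it, the eigenvalue-above-1 rule is usually stated for
* principal components, so the PCA eigenvalues may be worth comparing
pca voice stability effect regq law corruption
screeplot, yline(1)

* Cronbach's alpha on standardized items, with item-level statistics
alpha voice stability effect regq law corruption, std item
```

My expectation is that a sharp elbow after the first component and a high alpha would support a single index, but I would appreciate confirmation.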
