Hello everyone. Thanks in advance to the forum users, whose posts have often been a great help. I am a beginner in using Stata.
I am writing because I am having some problems constructing an index of the quality of the rule of law by region.
Basically I have 5 variables and I want to summarize them in a single variable, which would be the index. To do this aggregation I first need a weight for each variable. From the literature, a good method seems to be to use principal components for the weights. So basically: estimate the first principal component of the five variables, multiply each variable by its weight (its loading on that component) for each observation, and finally aggregate (sum) the results.
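To make the idea concrete, here is a minimal sketch of the index for complete cases only, with no imputation, using the five variables from my code further down. The name rol_index is just a placeholder I made up, and using the default correlation-based PCA here is my own assumption, not something the literature prescribes.

Code:
* First principal component of the five indicators (correlation matrix is the default)
pca reportedcaught endofthesentence clearance_rate_civils collection_capacity tax_gap, components(1)
* Score on component 1: the standardized variables, each weighted by its loading, then summed
* (rol_index is a hypothetical variable name)
predict rol_index, score

An observation with a missing value in any of the five variables gets a missing score.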
The first problem: I have missing values. To deal with them I am using multiple imputation and then running the principal component analysis on the imputed data. I'm trying to fix the "invalid file" error in the screenshot below (invalid '"C:\Users\........\index_results') but can't figure out whether it's a syntax problem or something else. Could you kindly help me? More generally, could you give me your opinion on the methodology I am using, and any suggestions?
This is the code I am currently using:
Code:
clear
capture log close
import excel "C:\Users\.......\PANEL.xlsx", sheet("panel") firstrow
drop if year==.
mi set mlong
mi xtset, clear
mi register imputed reportedcaught endofthesentence clearance_rate_civils collection_capacity tax_gap
mi impute mvn reportedcaught endofthesentence clearance_rate_civils collection_capacity tax_gap, add(20) replace
log using "C:\Users\......\log.log", replace
mi estimate, cmdok saving ("C:\Users\.......\index_results", replace): pca reportedcaught endofthesentence clearance_rate_civils collection_capacity tax_gap, components(1) covariance vce(normal)
mi predict using ("C:\Users\........\index_results")
save "C:\Users\........\dataset_imputed", replace
log close