Hello. I have a dataset with information on workers and industries. The dataset's key is worker_id-year.
What I want to understand is what percentage of the impact on change_to_eship (my dependent variable) is due to worker-level variables, and what percentage is due to industry-level variables. change_to_eship is a binary variable that equals 1 if a worker transitions to entrepreneurship and 0 otherwise.
At the worker-level I have the following independent variables: tenure, gender, rganho, schooling_1d, nemp, vn.
The industry-level independent variables are: tenure_median, vn_per_employee_median, secondary_education, higher_education, rganho_median, nemp_median, num_firms, and gender_industry.
Year is common for both industry and worker level.
I want to understand what percentage of the effect in change_to_eship is due to: tenure gender rganho schooling_1d nemp vn job_level_1d, and what percentage of the effect is due to tenure_median vn_per_employee_median secondary_education higher_education rganho_median nemp_median num_firms gender_industry.
To achieve this, I estimated three multilevel logistic regression models. The first model includes only worker-level variables, the second includes only industry-level variables, and the third includes both worker-level and industry-level variables.
Code:
*Model 1 - only Worker-level Variables
melogit change_to_eship tenure gender rganho i.schooling_1d nemp vn i.year ///
    || industry_id:, vce(cluster industry_id)
estat icc

*Model 2 - only Industry-level Variables
melogit change_to_eship tenure_median vn_per_employee_median secondary_education higher_education ///
    rganho_median nemp_median num_firms gender_industry i.year ///
    || industry_id:, vce(cluster industry_id)
estat icc

*Model 3 - both worker and industry-level variables
melogit change_to_eship tenure gender rganho i.schooling_1d nemp vn job_level_1d tenure_median ///
    vn_per_employee_median secondary_education higher_education rganho_median nemp_median num_firms ///
    gender_industry i.year || industry_id:, vce(cluster industry_id)
estat icc
The ICC of Model 1 was 0.013664, the ICC of Model 2 was 0.1589, and the ICC of Model 3 was 0.0098919.
From what I understand, the ICC can be interpreted as the proportion of the total variance in the probability of becoming an entrepreneur that is attributable to industry-level factors. How can it then be that the ICC of Model 1, which includes no industry-level variables, is larger than that of Model 3, which includes both worker- and industry-level variables? Also, since 15.89% of the variance is due to differences between industries when only industry-level variables are included, is it OK to state that industry-level variables alone explain a significant portion of the variance in the likelihood of transitioning to entrepreneurship?
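For context on these numbers: as far as I can tell, estat icc after melogit reports a residual (conditional) ICC on the latent logistic scale, with the level-1 variance fixed at π²/3 rather than estimated. Assuming that is correct, and writing σ²_u for the estimated random-intercept variance of industry_id (my own notation, not a name from the output), the reported quantity and its inverse would be:

```latex
\[
\mathrm{ICC} \;=\; \frac{\sigma_u^2}{\sigma_u^2 + \pi^2/3},
\qquad
\sigma_u^2 \;=\; \frac{\mathrm{ICC}}{1-\mathrm{ICC}}\cdot\frac{\pi^2}{3},
\qquad
\frac{\pi^2}{3}\approx 3.29 .
\]
```

If so, each ICC would reflect only the between-industry variance left over after the covariates in that particular model, which may be relevant to how the three values compare.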
Thank you very much for any help!