Hello everyone,
I have a problem with a regression analysis. This is my first time working with Stata, so this may be an easy question.
My research question is which variables affect students' preference for small or large companies after graduation (survey question: "Which size of company would you prefer after graduation?"). I have a panel dataset; the survey was conducted five times.
I identified several characteristics that describe small (UG_W = 0) and large (UG_W = 1) companies. These attributes are career possibilities, social relationships, and compensation. The students had to rate how important each attribute is to them in a work-related context.
For my work I identified the single items that best reflect these attributes: opportunity for advancement, social relationships at the workplace, and high wage.
I also pooled some of the single items to cover more relevant aspects:
1) career possibilities: career_W = (opportunity for career advancement + managerial responsibility + opportunity for professional training)/3
2) social relationships: social_W = (social relationships at the workplace + teamwork + work-life balance)/3
3) compensation: payment_W = (high wage + high social benefits + additional benefits)/3
The reliability coefficient for each of the pooled job attributes is around 0.7, which is generally considered acceptable.
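For reference, this is roughly how the pooled variables and the reliability check could be written in Stata. It is only a sketch: I am assuming that the reliability coefficient is Cronbach's alpha and that career_pos_W, social_rel_W and high_wage_W are the first item in each scale. All other item names (mgr_resp_W, prof_train_W, teamwork_W, wlb_W, soc_benefit_W, add_benefit_W) are placeholders, not the real variable names.

```stata
* Pooled variable = mean of three item ratings.
* With generate, the result is missing if any of the three items is missing.
generate career_W  = (career_pos_W + mgr_resp_W + prof_train_W) / 3
generate social_W  = (social_rel_W + teamwork_W + wlb_W) / 3
generate payment_W = (high_wage_W + soc_benefit_W + add_benefit_W) / 3

* Cronbach's alpha for each set of items (item names are placeholders)
alpha career_pos_W mgr_resp_W prof_train_W
alpha social_rel_W teamwork_W wlb_W
alpha high_wage_W soc_benefit_W add_benefit_W
```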
If I run a regression first with the single items, not every coefficient is significant.
Single Items (SI): xtlogit UG_W career_pos_W social_rel_W high_wage_W, re
If I run a regression with the pooled variables, not every coefficient is significant.
Multi Items (MI): xtlogit UG_W career_W social_W payment_W, re
But if I mix single-item and pooled variables, I get a good fit.
Mixed: xtlogit UG_W career_pos_W social_rel_W payment_W, re
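To make the comparison concrete, the three models could be run and compared side by side as sketched below. The panel identifiers id and wave are placeholder names for the student ID and the survey round, and using AIC/BIC is just one possible way to judge fit. The information criteria are only comparable if all three models are estimated on the same observations.

```stata
* Declare the panel structure (id and wave are placeholder names)
xtset id wave

* Single items
xtlogit UG_W career_pos_W social_rel_W high_wage_W, re
estimates store SI

* Pooled (multi-item) variables
xtlogit UG_W career_W social_W payment_W, re
estimates store MI

* Mixed: two single items plus one pooled variable
xtlogit UG_W career_pos_W social_rel_W payment_W, re
estimates store Mixed

* Log likelihood, AIC and BIC for the three stored models
estimates stats SI MI Mixed
```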
My question is whether there is any reason not to mix pooled and single-item variables in the same regression.
Thank you very much!
Best,
Katharina