Hi everyone,
I have a dataset of individuals in different universities, distributed across five degrees: Biology, Economics, Physics, Psychology and Sociology. Each individual can participate in roughly 30 university-level activities, ranging from sports to entrepreneurship, volunteering and the arts. The main idea is that, within the same university, an individual who studies Biology can be on the same basketball team, for example, as an individual who studies Sociology. I would like to identify the indirect connections of each individual outside his bachelor's degree, that is, the mates of my degree mates who are not my own mates. To achieve this, I need to find the activities in which I do not participate but my degree mates do, and then extract the information about the participants in those activities who study other bachelor's degrees. I provide an example to illustrate this case:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long id int university_code str10 degree str16 activity double activity_code float female
 1 9 "biology"    "volleyball"       26 1
 1 9 "biology"    "baseball"         17 1
 1 9 "biology"    "basketball"       18 1
 2 9 "biology"    "chorus"           15 0
 3 9 "biology"    "volleyball"       26 1
 3 9 "biology"    "baseball"         17 1
 4 9 "biology"    "soccer"           22 0
 4 9 "biology"    "football"         20 0
 4 9 "biology"    "basketball"       18 0
 4 9 "biology"    "track"            25 0
 5 9 "biology"    "math"             11 1
 5 9 "biology"    "baseball"         17 1
 6 9 "economics"  "baseball"         17 1
 6 9 "economics"  "basketball"       18 1
 6 9 "economics"  "track"            25 1
 6 9 "economics"  "volleyball"       26 1
 7 9 "economics"  "football"         20 0
 7 9 "economics"  "basketball"       18 0
 7 9 "economics"  "soccer"           22 0
 8 9 "economics"  "drama"             8 1
 8 9 "economics"  "entrepreneurship" 14 1
 8 9 "economics"  "basketball"       18 1
 8 9 "economics"  "track"            25 1
 9 9 "economics"  "volleyball"       26 1
10 9 "economics"  "math"             11 1
10 9 "economics"  "band"             13 1
10 9 "economics"  "volunteering"     30 1
11 9 "physics"    "baseball"         17 0
11 9 "physics"    "basketball"       18 0
12 9 "physics"    "baseball"         17 0
13 9 "physics"    "basketball"       18 1
13 9 "physics"    "volleyball"       26 1
13 9 "physics"    "baseball"         17 1
14 9 "physics"    "volleyball"       26 1
14 9 "physics"    "spanish"           4 1
14 9 "physics"    "chess"            31 1
15 9 "physics"    "computer"          6 1
15 9 "physics"    "volleyball"       26 1
15 9 "physics"    "basketball"       18 1
15 9 "physics"    "soccer"           22 1
15 9 "physics"    "baseball"         17 1
16 9 "physics"    "entrepreneurship" 14 1
17 9 "physics"    "chorus"           15 1
17 9 "physics"    "chess"            31 1
17 9 "physics"    "french"            1 1
18 9 "physics"    "baseball"         17 0
18 9 "physics"    "football"         20 0
18 9 "physics"    "chess"            31 0
18 9 "physics"    "basketball"       18 0
19 9 "physics"    "volleyball"       26 1
19 9 "physics"    "basketball"       18 1
19 9 "physics"    "digitallearning"  29 1
19 9 "physics"    "chess"            31 1
19 9 "physics"    "baseball"         17 1
20 9 "physics"    "baseball"         17 0
21 9 "psychology" "chess"            31 1
21 9 "psychology" "entrepreneurship" 14 1
22 9 "psychology" "swimming"         23 1
22 9 "psychology" "volleyball"       26 1
22 9 "psychology" "entrepreneurship" 14 1
22 9 "psychology" "baseball"         17 1
23 9 "psychology" "football"         20 0
23 9 "psychology" "wrestling"        27 0
23 9 "psychology" "baseball"         17 0
24 9 "psychology" "baseball"         17 1
24 9 "psychology" "drama"             8 1
24 9 "psychology" "band"             13 1
24 9 "psychology" "volleyball"       26 1
25 9 "psychology" "track"            25 0
25 9 "psychology" "football"         20 0
25 9 "psychology" "wrestling"        27 0
25 9 "psychology" "basketball"       18 0
26 9 "sociology"  "volleyball"       26 1
26 9 "sociology"  "entrepreneurship" 14 1
27 9 "sociology"  "track"            25 1
27 9 "sociology"  "volleyball"       26 1
27 9 "sociology"  "math"             11 1
28 9 "sociology"  "baseball"         17 1
28 9 "sociology"  "volleyball"       26 1
29 9 "sociology"  "math"             11 1
30 9 "sociology"  "football"         20 0
30 9 "sociology"  "baseball"         17 0
30 9 "sociology"  "basketball"       18 0
30 9 "sociology"  "soccer"           22 0
31 9 "sociology"  "entrepreneurship" 14 1
31 9 "sociology"  "baseball"         17 1
31 9 "sociology"  "volleyball"       26 1
32 9 "sociology"  "track"            25 0
32 9 "sociology"  "volunteering"     30 0
32 9 "sociology"  "baseball"         17 0
32 9 "sociology"  "fieldhockey"      19 0
32 9 "sociology"  "football"         20 0
32 9 "sociology"  "newspaper"        28 0
32 9 "sociology"  "basketball"       18 0
32 9 "sociology"  "tennis"           24 0
32 9 "sociology"  "swimming"         23 0
32 9 "sociology"  "entrepreneurship" 14 0
32 9 "sociology"  "debate"            7 0
32 9 "sociology"  "french"            1 0
32 9 "sociology"  "icehockey"        21 0
end
I have individuals' ids, the university code (9 in this case), a string variable with the bachelor's degree name, a string variable with the name of each activity the individual takes part in, the code assigned to each activity, and a final variable with gender information, equal to 1 for female and 0 otherwise.
What I would like to know, but am still unable to figure out how to compute, is the average gender (the share of females) of individuals in the same university, outside my bachelor's degree, with whom others in my degree have direct contact through participation in activities. For example, individual 1, who studies Biology in university 9, participates in the Volleyball, Baseball and Basketball teams. The question I am trying to answer is: what is the average gender of those individuals who belong to other bachelor's degrees and are in contact with at least one of his mates, but not with him? So first, inside Biology, I look at the activities in which the mates of individual 1 are involved but he is not. They are Chorus, Soccer, Football, Track and Math. Outside Biology, the average gender in Chorus, Soccer, Football, Track and Math is, respectively: 1, 0.33, 0, 0.6, and 1. I would like to obtain this information for each individual.
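The steps just described might be sketched roughly as follows. This is only one possible approach, not a tested solution: it assumes the example data are in memory, that each individual appears at most once per activity, and that activity names map one-to-one to activity codes. The variable names tot_f, tot_n, deg_f, deg_n and out_mean and the tempfile names own and degacts are made up for illustration. For individual 1 it should list Chorus, Soccer, Football, Track and Math with the values quoted above.
Code:
* Sketch only, under the assumptions stated above.
* Step 1: mean of female among participants outside each degree,
* per university and activity (totals minus the own-degree part).
bysort university_code activity_code: egen double tot_f = total(female)
by university_code activity_code: gen long tot_n = _N
bysort university_code degree activity_code: egen double deg_f = total(female)
by university_code degree activity_code: gen long deg_n = _N
* Missing when nobody outside the degree takes part in the activity.
gen double out_mean = (tot_f - deg_f) / (tot_n - deg_n)

* Step 2: save each individual's own activities, and the list of
* activities present in each degree together with out_mean.
tempfile own degacts
preserve
keep university_code id activity_code
save `own'
restore, preserve
keep university_code degree activity activity_code out_mean
duplicates drop
save `degacts'
restore

* Step 3: pair every individual with all activities found in his or her
* degree, then drop the activities the individual takes part in.
keep id university_code degree
duplicates drop
joinby university_code degree using `degacts'
merge 1:1 university_code id activity_code using `own', keep(master) nogenerate
sort university_code id activity_code
list id degree activity out_mean if id == 1, noobs

The result has one row per individual and per activity of his or her degree mates. Collapsing it to a single number per individual would need a further decision about how to weight activities of different sizes.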
Any feedback and/or suggestions will be highly appreciated. Thanks a lot in advance for your time.
Best regards,
Daniel
