Hello,
I am working with monthly data where I compute the change in total industry employment between various industries from one month to the next.
Two rows of my spreadsheet are shown below. L_industry is the industry in the previous month and industry is the one in the current month. For example, 1681 individuals moved from industry 170 to industry 0 between two months and 1140 moved in the other direction, as indicated by tot_trans and tot_trans_1; the (absolute) difference, d_tot_trans, is then 541.
For each combination of industry and L_industry, there is an additional row in which transitions go into the other direction, in the first case from 0 to 170.
I am interested in the difference between industry transitions in the two directions and have managed to prepare the data so that total transitions in both directions appear in one row. However, I want to drop all observations where the transition goes in the other direction. That is, I want to drop the cases where transitions go from 0 to 170, from 170 to 6170, etc., as they contain the same information.
I thought about dropping duplicates of d_tot_trans, but there are numerous observations with different industries where d_tot_trans is the same.
My current approach is to create two groups: group_1 is the industry group and group_2 is the L_industry group. Group_1 equals 1 in the case of transitions from 170 to 0, and group_2 equals 1 in the case of transitions from 0 to 170.
I thus want to drop observations where group_2 takes on a value that already exists in group_1, e.g., 1 in the example above. However, I am not sure how to write the code for this and would appreciate any help with that, or an alternative approach to achieve my goal. Thank you!
industry | L_industry | tot_trans | tot_trans_1 | d_tot_trans | group_1 | group_2 |
0 | 170 | 1681 | 1140 | 541 | 1 | 257 |
170 | 6170 | 16 | 14 | 2 | 363 | 14898 |
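
One possible alternative to the group_1/group_2 comparison, sketched in Python purely for illustration: since every pair of industries appears once in each direction, it may be enough to keep only the row in which industry is not larger than L_industry. In the two rows above, that rule keeps exactly the rows shown and drops their mirrored counterparts. The mirrored rows in the toy data below are not from the real spreadsheet; they are assumed to carry the same numbers with tot_trans and tot_trans_1 swapped, and the function name drop_mirrored is hypothetical.

```python
# Toy rows using the column names from the table above. The two mirrored
# rows (170/0 and 6170/170) are invented for illustration only.
rows = [
    {"industry": 0, "L_industry": 170, "tot_trans": 1681, "tot_trans_1": 1140, "d_tot_trans": 541},
    {"industry": 170, "L_industry": 0, "tot_trans": 1140, "tot_trans_1": 1681, "d_tot_trans": 541},
    {"industry": 170, "L_industry": 6170, "tot_trans": 16, "tot_trans_1": 14, "d_tot_trans": 2},
    {"industry": 6170, "L_industry": 170, "tot_trans": 14, "tot_trans_1": 16, "d_tot_trans": 2},
]


def drop_mirrored(rows):
    """Keep one row per unordered industry pair.

    A row is kept when industry <= L_industry, so its mirror image
    (where industry > L_industry) is dropped. Rows with
    industry == L_industry have no mirror and are kept.
    """
    return [r for r in rows if r["industry"] <= r["L_industry"]]


kept = drop_mirrored(rows)
for r in kept:
    print(r["industry"], r["L_industry"], r["d_tot_trans"])
```

This relies only on the numeric ordering of the two industry codes, so it does not depend on d_tot_trans being unique. If the data are in Stata (an assumption, as the software is not stated), the same rule would presumably be a single line such as `drop if industry > L_industry`.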