Hi everyone,
so first here is an example of my data:
/
/
/
/
/
These are cities (with areas inside them) and small areas (villages) from israel, with some demografic information and number of voters (and potential voters) in the elections. Every city has its own "locality code", and a few statistical areas if it's a big city or just one statistical area if it's a small village (so every observation is a statistical area).
/
What I'm trying to do is to join the population (pop) in some small areas with that of different areas, and then to get rid of the observations of small areas. I want to use the variables: "new locality code" and "new statistical area", so that if they are identical to "locality code" and "statistical area" of a different observation respectively, I would like to add the values of "pop" and "pop_18" from the first observation to those of the second observation, and then loose the first observation.
For instance, the village "EN KAREM-B.S.HAQLA'I", should be deleted after its pop and pop_18 was joined to sttistical area 1352 in the city of JERUSALEM (of which locality code is 3000). And the same goes for the Village EN HAZEVA, that should be joined to HAZEVA (sorry for the confusing names).
The last two observations of HAIFA are to demonstrate that there are cities to which I won't join any small village. Actually, only about 25 small villages should be joined and deleted, while the total observations (statistical areas) in my data is about 2800.
Any help with recommended commands?
Thanks ahead,
Ben Ronen.
so first here is an example of my data:
/
/
locality code | city name | statistical area | pop | pop_18 | potential | voters | _merge | new locality code | new statistical area | new city name |
1053 | EN HAZEVA | 1 | 43.08881 | 22.98231 | 0 | 0 | 1 | 13 | 1 | HAZEVA |
13 | HAZEVA | 1 | 561.4722 | 388.3561 | 474 | 362 | 3 | |||
1056 | EN KAREM-B.S.HAQLA'I | 1 | 161.2622 | 37.2866 | 0 | 0 | 1 | 3000 | 1352 | JERUSALEM |
3000 | JERUSALEM | 1351 | 3565.347 | 2688.487 | 2781 | 2129 | 3 | |||
3000 | JERUSALEM | 1352 | 2727.852 | 2036.572 | 2742 | 1564 | 3 | |||
3000 | JERUSALEM | 1353 | 2438.363 | 1814.717 | 1804 | 1122 | 3 | |||
4000 | HAIFA | 111 | 2125.539 | 1416.01 | 1768 | 1185 | 3 | |||
4000 | HAIFA | 112 | 5445.942 | 3247.853 | 3250 | 2640 | 3 |
/
/
These are cities (with areas inside them) and small areas (villages) from israel, with some demografic information and number of voters (and potential voters) in the elections. Every city has its own "locality code", and a few statistical areas if it's a big city or just one statistical area if it's a small village (so every observation is a statistical area).
/
What I'm trying to do is to join the population (pop) in some small areas with that of different areas, and then to get rid of the observations of small areas. I want to use the variables: "new locality code" and "new statistical area", so that if they are identical to "locality code" and "statistical area" of a different observation respectively, I would like to add the values of "pop" and "pop_18" from the first observation to those of the second observation, and then loose the first observation.
For instance, the village "EN KAREM-B.S.HAQLA'I", should be deleted after its pop and pop_18 was joined to sttistical area 1352 in the city of JERUSALEM (of which locality code is 3000). And the same goes for the Village EN HAZEVA, that should be joined to HAZEVA (sorry for the confusing names).
The last two observations of HAIFA are to demonstrate that there are cities to which I won't join any small village. Actually, only about 25 small villages should be joined and deleted, while the total observations (statistical areas) in my data is about 2800.
Any help with recommended commands?
Thanks ahead,
Ben Ronen.
Comment