Hello,
I am using Stata 14.2 on Windows. This is my first post so I hope I am doing this correctly.
The dataset I am using contains around 100,000 observations with information about buildings.
Each building has an ID number like 344100000000006, followed by an address, some more variables that are not important for the question, and the function (values 1 - 12, with value labels).
One building can contain multiple living units, a store on the ground floor, etc. These units are all separate observations with the same building ID, so they have the same address and differ, if at all, only in function. One building ID can therefore occur, for example, 16 times.
I want to know which buildings have more than one function, like building with ID 344100000000042, which is used for both function 3 and 12.
I am not interested in buildings with only one function so I want to drop them from the data set.
I believe I need to combine different observations with the same ID into one. I found that many forum users struggle with this issue, but I am not experienced enough with Stata to apply the suggestions made for other problems to my own case. I therefore sincerely hope someone is willing to help me.
Edit: I also created a variable 'multiplebuilding' indicating how many times a building ID occurs. I don't know whether it is helpful.
The data looks like this: (I excluded other variables that are not important to the question)
[CODE]
* Example generated by -dataex-. To install: ssc install dataex
clear
input double gebwbagidgetal long gebruiksdoel_n float multiplebuilding
344100000000006 12 2
344100000000006 12 2
344100000000008 12 2
344100000000008 12 2
344100000000011 12 3
344100000000011 12 3
344100000000011 12 3
344100000000014 12 16
344100000000014 12 16
344100000000014 12 16
344100000000014 12 16
344100000000014 12 16
344100000000014 12 16
344100000000014 12 16
344100000000014 12 16
344100000000014 12 16
344100000000014 12 16
344100000000014 12 16
344100000000014 12 16
344100000000014 12 16
344100000000014 12 16
344100000000014 12 16
344100000000014 12 16
344100000000016 12 2
344100000000016 12 2
344100000000029 12 5
344100000000029 12 5
344100000000029 12 5
344100000000029 12 5
344100000000029 12 5
344100000000039 12 5
344100000000039 12 5
344100000000039 12 5
344100000000039 12 5
344100000000039 12 5
344100000000041 12 2
344100000000041 12 2
344100000000042 3 2
344100000000042 12 2
344100000000053 12 2
344100000000053 12 2
344100000000061 3 6
344100000000061 12 6
344100000000061 12 6
344100000000061 12 6
344100000000061 12 6
344100000000061 12 6
344100000000064 12 2
344100000000064 12 2
344100000000074 12 9
344100000000074 12 9
344100000000074 3 9
344100000000074 12 9
344100000000074 12 9
344100000000074 12 9
344100000000074 12 9
344100000000074 12 9
344100000000074 12 9
344100000000079 12 5
344100000000079 12 5
344100000000079 12 5
344100000000079 12 5
344100000000079 12 5
344100000000082 12 2
344100000000082 3 2
344100000000084 12 3
344100000000084 3 3
344100000000084 12 3
344100000000089 12 7
344100000000089 12 7
344100000000089 12 7
344100000000089 12 7
344100000000089 12 7
344100000000089 12 7
344100000000089 12 7
344100000000090 12 3
344100000000090 12 3
344100000000090 12 3
344100000000091 3 2
344100000000091 12 2
344100000000098 3 2
344100000000098 12 2
344100000000102 3 2
344100000000102 12 2
344100000000106 12 2
344100000000106 12 2
344100000000109 3 2
344100000000109 12 2
344100000000114 3 2
344100000000114 3 2
344100000000116 12 48
344100000000116 12 48
344100000000116 12 48
344100000000116 12 48
344100000000116 12 48
344100000000116 12 48
344100000000116 12 48
344100000000116 12 48
344100000000116 12 48
344100000000116 12 48
end
label values gebruiksdoel_n gebruiksdoel_n
label def gebruiksdoel_n 3 "gemengd", modify
label def gebruiksdoel_n 12 "woonfunctie", modify
[/CODE]
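One possible way to approach this, as a minimal sketch only: combining observations may not be necessary, because it could be enough to flag the building IDs whose function is not constant. The variable name multifunction is hypothetical and not part of the data above, and the sketch assumes gebruiksdoel_n has no missing values (a missing value sorts last and would count as a different function).

[CODE]
* Sort the units within each building by function. If the first and
* last function differ, the building has more than one function.
bysort gebwbagidgetal (gebruiksdoel_n): gen byte multifunction = gebruiksdoel_n[1] != gebruiksdoel_n[_N]

* Keep only the buildings with more than one function.
keep if multifunction

* Inspect the result, one block of rows per building.
list gebwbagidgetal gebruiksdoel_n, sepby(gebwbagidgetal)
[/CODE]

With the example data, building 344100000000042 (functions 3 and 12) would be kept, while building 344100000000006 (function 12 only) and building 344100000000114 (function 3 only) would be dropped.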