Hello,
I have a dataset that is currently in wide format in the following form:
ID   | name   | dob  | ID1_1 | name1_1 | dob1_1 | ID2_1 | name2_1 | dob2_1 | ID3_1 | name3_1 | dob3_1 |
i123 | Hubert | 1940 |       |         |        |       |         |        |       |         |        |
i234 | Jenny  | 1970 | i345  | Sam     | 1963   |       |         |        |       |         |        |
i235 | Paul   | 1962 | i768  | Katy    | 1950   | i768  | Katy    | 1950   |       |         |        |
i435 | Angela | 1980 | i980  | Megan   | 1942   | i980  | Megan   | 1942   |       |         |        |
As you can see, some associated records are duplicated within the row for a given ID (the same person appears under more than one set of var names). There are also gaps between the records, i.e. the data is stretched further across the columns than it needs to be. The dataset is unique at the level of ID, and I'd like to keep the associated relationships in wide format. What I'm trying to do is combine the duplicate records that occur across the columns (and the different var names), so that I end up with a dataset that looks like this:
ID   | name   | dob  | ID1_1 | name1_1 | dob1_1 | ID2_1 | name2_1 | dob2_1 | ID3_1 | name3_1 | dob3_1 |
i123 | Hubert | 1940 |       |         |        |       |         |        |       |         |        |
i234 | Jenny  | 1970 | i345  | Sam     | 1963   |       |         |        |       |         |        |
i235 | Paul   | 1962 | i768  | Katy    | 1950   |       |         |        |       |         |        |
i435 | Angela | 1980 | i980  | Megan   | 1942   |       |         |        |       |         |        |
Essentially I just want to move everything "up" towards the identifying ID (i.e. into the lowest-numbered var names) for ease of inspection / viewing and, most importantly, to remove the duplicates that occur across the different var names (which are associated by the var`i'_`num' naming pattern).
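To make the rule concrete, here is a minimal sketch of the per-row logic, written in plain Python rather than Stata purely as an illustration. The helper name `compact_row` is hypothetical, and it assumes each row is a dict of strings in which a missing value is the empty string and the slots are numbered 1 to 3 as in the tables above.

```python
def compact_row(row, slots=(1, 2, 3), fields=("ID", "name", "dob"), suffix="_1"):
    """Return a copy of `row` with the associated (ID, name, dob) groups
    de-duplicated and shifted into the lowest-numbered slots."""
    kept = []
    for i in slots:
        group = tuple(row.get(f"{f}{i}{suffix}", "") for f in fields)
        # Skip completely empty slots (gaps) and groups already seen.
        if any(group) and group not in kept:
            kept.append(group)

    out = dict(row)
    for pos, i in enumerate(slots):
        # Fill slots in order; anything past the kept groups becomes empty.
        group = kept[pos] if pos < len(kept) else ("",) * len(fields)
        for f, value in zip(fields, group):
            out[f"{f}{i}{suffix}"] = value
    return out


# The "Paul" row from the first table: Katy appears in slots 1 and 2.
paul = {
    "ID": "i235", "name": "Paul", "dob": "1962",
    "ID1_1": "i768", "name1_1": "Katy", "dob1_1": "1950",
    "ID2_1": "i768", "name2_1": "Katy", "dob2_1": "1950",
    "ID3_1": "", "name3_1": "", "dob3_1": "",
}
compacted = compact_row(paul)
print(compacted["ID1_1"], repr(compacted["ID2_1"]))
```

Two groups count as duplicates here only if ID, name and dob all match exactly; that is my assumption about what "duplicate" should mean.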
Any and all help is much appreciated!