I have a data file where there are several rows per person (ID) and both string and numeric variables with only one value per person. I would like to fill the missing rows by person ID.
There are several similar topics on this forum which address this situation however, as far as I am aware, none address the case when there is no entry for one ID.
A simplified version of the data are as follows:
As you can see, the single value for each ID is not always in the first row, ID 3 has no varstr value and ID 2 has no varnum value. When filling, these should be kept as missing.
I have tried
however this only works when all IDs have exactly one non-missing value.
Advice much appreciated.
There are several similar topics on this forum which address this situation however, as far as I am aware, none address the case when there is no entry for one ID.
A simplified version of the data are as follows:
Code:
clear input ID str8 varstr varnum 1 "A" . 1 "" 4.2 1 "" . 1 "" . 1 "" . 2 "" . 2 "G" . 2 "" . 2 "" . 2 "" . 3 "" 7.6 3 "" . 3 "" . 3 "" . 3 "" . end list, sepby(ID)
Code:
+----------------------+ | ID varstr varnum | |----------------------| 1. | 1 A . | 2. | 1 4.2 | 3. | 1 . | 4. | 1 . | 5. | 1 . | |----------------------| 6. | 2 . | 7. | 2 G . | 8. | 2 . | 9. | 2 . | 10. | 2 . | |----------------------| 11. | 3 7.6 | 12. | 3 . | 13. | 3 . | 14. | 3 . | 15. | 3 . | +----------------------+
I have tried
Code:
bysort ID (varstr) : replace varstr= varstr[_N]
Advice much appreciated.
Comment