I have a data file where there are several rows per person (ID) and both string and numeric variables with only one value per person. I would like to fill the missing rows by person ID.
There are several similar topics on this forum which address this situation however, as far as I am aware, none address the case when there is no entry for one ID.
A simplified version of the data are as follows:
As you can see, the single value for each ID is not always in the first row, ID 3 has no varstr value and ID 2 has no varnum value. When filling, these should be kept as missing.
I have tried
however this only works when all IDs have exactly one non-missing value.
Advice much appreciated.
There are several similar topics on this forum which address this situation however, as far as I am aware, none address the case when there is no entry for one ID.
A simplified version of the data are as follows:
Code:
clear input ID str8 varstr varnum 1 "A" . 1 "" 4.2 1 "" . 1 "" . 1 "" . 2 "" . 2 "G" . 2 "" . 2 "" . 2 "" . 3 "" 7.6 3 "" . 3 "" . 3 "" . 3 "" . end list, sepby(ID)
Code:
+----------------------+
| ID varstr varnum |
|----------------------|
1. | 1 A . |
2. | 1 4.2 |
3. | 1 . |
4. | 1 . |
5. | 1 . |
|----------------------|
6. | 2 . |
7. | 2 G . |
8. | 2 . |
9. | 2 . |
10. | 2 . |
|----------------------|
11. | 3 7.6 |
12. | 3 . |
13. | 3 . |
14. | 3 . |
15. | 3 . |
+----------------------+
I have tried
Code:
bysort ID (varstr) : replace varstr= varstr[_N]
Advice much appreciated.

Comment