I have a data set that I would like your help with.
The data are for a general population who have been followed up in a hospital (100,680 obs).
Step 1: Some individuals came only once, and I would like to drop them (I think this is the code):
bysort ID: drop if _N==1
Step 2: For each individual, I want to keep the first observation and the first observation at which the disease (DM) occurred.
For example, the data look like this:
ID | exam day | DM |
1 | 20200601 | 0 |
1 | 20201201 | 0 |
1 | 20210201 | 1 |
2 | 20190101 | 0 |
2 | 20190202 | 0 |
2 | 20190601 | 1 |
2 | 20191001 | 0 |
2 | 20200202 | 1 |
3 | 20180606 | 1 |
and I would like to end up with this:
ID | exam day | DM |
1 | 20200601 | 0 |
1 | 20210201 | 1 |
2 | 20190101 | 0 |
2 | 20190601 | 1 |
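
Here is a rough sketch of how I imagine both steps might be written. It assumes the variables are named ID, examday and DM, where examday is just a placeholder name for the "exam day" column (stored as YYYYMMDD so that sorting on it gives chronological order) and DM is coded 0/1. The helper variables first_visit, dm_count and first_dm are names I made up for illustration. I am not sure this is the best way to do it.

```stata
* Assumption: variables are ID, examday (YYYYMMDD) and DM (0/1).
* examday, first_visit, dm_count and first_dm are illustrative names.

* Step 1: drop individuals who have only one visit
bysort ID: drop if _N == 1

* Step 2: sort visits chronologically within each individual
* and flag the first visit
bysort ID (examday): gen byte first_visit = (_n == 1)

* Running count of visits with DM == 1 within each individual
by ID: gen long dm_count = sum(DM == 1)

* Flag the first visit at which DM == 1
by ID: gen byte first_dm = (DM == 1 & dm_count == 1)

* Keep the first visit and the first DM visit, then clean up
keep if first_visit | first_dm
drop first_visit dm_count first_dm
```

With the example above, this should leave the two rows shown for ID 1 and ID 2, and ID 3 is removed in Step 1 because it has a single visit. An individual who never has DM == 1 would keep only the first visit.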