Hello everyone!
I'm learning to use Stata with the help of this forum.
I am writing this post because I need your help!
I'd like to know whether there is a command that could help me identify duplicate observations within each 30-day interval in a database of isolated bacteria.
In other words, I want to identify all bacteria isolated from the same patient (each isolate is an observation, or row, in our database) that have the same values in the other variables and were isolated within a period of 30 days of each other. I want to drop these isolates because they are considered the same sample for our analysis.
For example, in the image, we have two patients with different isolates of bacteria. In the second patient, Enterococcus faecium was isolated five times on different days, but all of those dates fall within a single 30-day period. This means the isolate is repeated, and the repeats aren't useful for the analysis. How can I identify and drop these duplicates?
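To make the rule concrete, here is a rough sketch of the logic I have in mind, not a tested solution. The variable names patient_id, organism, and isolate_date are hypothetical (my real variables are named differently). I'm also assuming that isolate_date is a numeric Stata daily date, and that a new 30-day window starts at the first isolate that falls more than 30 days after the start of the previous window.

```stata
* Hypothetical variable names: patient_id, organism, isolate_date.
* Assumes isolate_date is a numeric Stata daily date (e.g. format %td).

* The first isolate in each patient-organism group opens the first window.
bysort patient_id organism (isolate_date): ///
    gen long window_start = isolate_date if _n == 1

* replace runs through observations in order, so window_start[_n-1]
* already holds the updated value for the previous isolate.
* An isolate more than 30 days after the window start opens a new window;
* otherwise it inherits the current window.
by patient_id organism: replace window_start = ///
    cond(isolate_date - window_start[_n-1] > 30, ///
         isolate_date, window_start[_n-1]) if _n > 1

* Any isolate that did not open a new window is a repeat.
by patient_id organism: gen byte dup30 = ///
    _n > 1 & window_start == window_start[_n-1]

* Inspect the flagged rows before dropping anything.
list patient_id organism isolate_date dup30, sepby(patient_id organism)
drop if dup30
```

If the other variables that define "the same characteristics" also matter, I suppose they would be added to the by list alongside patient_id and organism. I'm not sure whether this is the right approach or whether a built-in command already does it.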
Thanks a lot!
Edith.
