  • How to sum multiple rows of data into one row per level of a categorical day variable

    Hello all,

    I hope you can help.

    I have a dataset with three columns: the date/day a mosquito was collected, the number of mosquitoes observed (always 1), and a binary indicator of whether that mosquito laid an egg after observation (0 = did not lay, 1 = did lay).

    In the current format there is a separate row for each individual mosquito, but I would like to reshape the data so that there is one row per day, giving the total number of mosquitoes observed and the total number that laid eggs on that day. The first attached image below shows a simplified version of the data in its current format, and the second shows how I would like to reshape it.

    Hope this makes sense.

    Thank you,

    Tom

    [Image: Screenshot 2024-04-15 114528.png]

    [Image: Screenshot 2024-04-15 114534.png]

  • #2
    I found a solution of sorts for this: use the egen command to generate a new variable holding the total number laid on each day, then use duplicates drop to remove the now-redundant rows so that one row per day remains:

    egen total_laid = total(laid), by(day)

    duplicates drop day total_laid, force
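
    An alternative that may be worth considering is collapse, which can sum several variables by group in a single step, so the count of mosquitoes observed is totalled along with the number that laid. The following is only a minimal sketch on made-up data: n_observed is a hypothetical name for the count column (the real name is not given in the post), and day is assumed to be numeric here.

```stata
* Toy data: one row per mosquito (values are invented for illustration)
clear
input day n_observed laid
1 1 0
1 1 1
1 1 1
2 1 0
2 1 1
end

* Reduce to one row per day, summing both the count and the 0/1 laid indicator
collapse (sum) n_observed laid, by(day)

* With the toy data above:
*   day 1: n_observed = 3, laid = 2
*   day 2: n_observed = 2, laid = 1
list, clean noobs
```

    Note that collapse replaces the dataset in memory, so it is probably safest to save the data or use preserve beforehand if the individual-level rows are still needed.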
