Hi everyone,
I have a dataset that looks like this:
I'm getting the total amount for each ID and group:
And then keeping only one row:
(I know I can do the same with collapse, but there are other variables I'd like to keep)
As you can see, B does not have any amount in group 2, so the result will be
However, I need missing groups to show up as "0" in the dataset:
What is the most elegant way to do this?
My current idea is to loop through all IDs and groups adding rows with amount = 0 before the egen command. This seems ugly and probably slower than it needs to be. I could also check whether a row exists before adding the new one, but I presume adding rows is not that expensive of an operation, and it would only make the code uglier. Is there a better way?
Thanks in advance!
I have a dataset that looks like this:
ID | group | amount |
A | 1 | 2 |
A | 2 | 3 |
A | 1 | 4 |
A | 2 | 1 |
B | 1 | 2 |
B | 1 | 1 |
I'm getting the total amount for each ID and group:
Code:
egen group_amount = total(amount), by(ID group)
Code:
bysort ID group: keep if _n == 1
As you can see, B does not have any amount in group 2, so the result will be
ID | group | group_amount |
A | 1 | 6 |
A | 2 | 4 |
B | 1 | 3 |
However, I need missing groups to show up as "0" in the dataset:
ID | group | group_amount |
A | 1 | 6 |
A | 2 | 4 |
B | 1 | 3 |
B | 2 | 0 |
What is the most elegant way to do this?
My current idea is to loop through all IDs and groups adding rows with amount = 0 before the egen command. This seems ugly and probably slower than it needs to be. I could also check whether a row exists before adding the new one, but I presume adding rows is not that expensive of an operation, and it would only make the code uglier. Is there a better way?
Thanks in advance!
Comment