Hello everyone,
I have a data set of unique customers (rows) and up to 10 of their suppliers (columns). My data set is in wide format because I am taking averages across years (columns). I am trying to calculate the average size of the suppliers in my data set per year, but this is complicated by the fact that two customers can have the same supplier. When taking an average, I therefore do not want to double-count suppliers.
Additionally, a customer does not necessarily have 10 suppliers; it can have fewer (see C4 below, for which Supplier 2 does not exist).
An example of my data set is as follows.
Customer | Assets (2009) | Assets (2008) | Supplier 1 | Assets (2009) | Assets (2008) | Supplier 2 | Assets (2009) | Assets (2008) | ... | Supplier 10
C1 |  |  | AAA |  |  | BBB |  |  |
C2 |  |  | CCC |  |  | DDD |  |  |
C3 |  |  | AAA |  |  | EEE |  |  |
C4 |  |  | FFF |  |  | . |  |  |
C5 |  |  | GGG |  |  | AAA |  |  |
So for example, how can I get the average assets of the suppliers in 2009? Is there a way to remove duplicates across multiple variables?
Thank you!
Panos
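For illustration only, here is a minimal Python sketch of the calculation the question describes: collect every supplier slot across customers, keep each supplier once, and average over the distinct suppliers. The `rows` layout, the function name `average_unique_supplier_assets`, and all asset figures are hypothetical (the post does not show any asset values), and it assumes a given supplier has the same 2009 assets wherever it appears.

```python
from statistics import mean

# Hypothetical stand-in for the wide data: one entry per customer, holding
# (supplier, assets_2009) pairs for its supplier slots. None marks an empty
# slot (like Supplier 2 for C4). All asset values are made up.
rows = {
    "C1": [("AAA", 100.0), ("BBB", 200.0)],
    "C2": [("CCC", 300.0), ("DDD", 400.0)],
    "C3": [("AAA", 100.0), ("EEE", 500.0)],
    "C4": [("FFF", 600.0), None],
    "C5": [("GGG", 700.0), ("AAA", 100.0)],
}


def average_unique_supplier_assets(rows):
    """Average assets over distinct suppliers, counting each supplier once."""
    assets_by_supplier = {}
    for slots in rows.values():
        for slot in slots:
            if slot is None:
                continue  # customer has fewer suppliers than slots
            supplier, assets = slot
            if assets is None:
                continue  # supplier present but assets missing
            # Keep the first value seen; later repeats of the same supplier
            # are ignored, which is what prevents double-counting.
            assets_by_supplier.setdefault(supplier, assets)
    return mean(list(assets_by_supplier.values()))


avg_2009 = average_unique_supplier_assets(rows)
print(avg_2009)  # 400.0 (7 distinct suppliers; AAA is counted once, not three times)
```

The same idea carries over to any tool: stack the supplier columns into one long list of (supplier, assets) pairs, drop repeated suppliers, then take the mean.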