I have a dataset (example attached below) of orders a shopper has delivered to a customer after shopping from a certain store. I have the order_id (which is unique), a courier_id (which is the id of the shopper, that repeats itself since the same shopper can deliver multiple orders), a date_delivered (which is the date on which the delivery was made), and a store_id (which is the id of the store, which repeats itself as different shoppers might have shopped from the same store). I want to create a new variable "familiarity" that indicates how familiar a shopper is to a particular store, which indicates how many previous orders has a particular shopper shopped from the same store. I believe I need to use the "count" function that counts the number of times a shopper shopped from a particular store till that delivery date. I would appreciate any help/assistance with the code regarding this. Thanks!
Example of the data:
order_id store_id courier_id date_delivered
1001 11 101 10may2021
1002 10 102 11may2021
1003 12 103 12may2021
1004 12 104 12may2021
1005 12 103 13may2021
1006 13 102 13may2021
The result I want is as follows:
order_id store_id courier_id date_delivered familiarity
1001 11 101 10may2021 1
1002 10 102 11may2021 1
1003 12 103 12may2021 1
1004 12 104 12may2021 1
1005 12 103 13may2021 2
1006 13 102 13may2021 1
So, the courier id 103 has shopped in store id 12 for 1 time till delivery date of 12may, but 2 times till the delivery date of 13may. How should I create this? Also, if sorting is required, should the orders be sorted by delivery_date?
Thanks again for any help regarding this!
Example of the data:
order_id store_id courier_id date_delivered
1001 11 101 10may2021
1002 10 102 11may2021
1003 12 103 12may2021
1004 12 104 12may2021
1005 12 103 13may2021
1006 13 102 13may2021
The result I want is as follows:
order_id store_id courier_id date_delivered familiarity
1001 11 101 10may2021 1
1002 10 102 11may2021 1
1003 12 103 12may2021 1
1004 12 104 12may2021 1
1005 12 103 13may2021 2
1006 13 102 13may2021 1
So, the courier id 103 has shopped in store id 12 for 1 time till delivery date of 12may, but 2 times till the delivery date of 13may. How should I create this? Also, if sorting is required, should the orders be sorted by delivery_date?
Thanks again for any help regarding this!
Comment