Good afternoon,
I have run into a problem while trying to merge two datasets.
The first dataset contains two variables. CN8, which is a variable that contains unique values of eight-digit codes. Like 2012090. The other variable "sanctioned" equals 1 for each of the 2837 observations. So the dataset is a list of products that are covered by the sanction I study.
The second dataset also contains two variables. The first one is "product_id", which is ranging from 1 to 3124 for all observations. The second is CN8. The way this dataset differs from the first dataset is that the creators here aggregated to some extent on CN8-product codes. So where the first dataset contained only one eight-digit code, the second one contains a group of CN8-product codes for each observation. To illustrate, one observation has a value that equals 27 for product_id and 61061000 61062000 61069010 61069030 61069050 61069090 for CN8. The number of CN8 codes differs across observations. Some product_id's only contain one CN8 code, most are a list of multiple codes and some even contain 50 codes or more.
I want to merge on CN8. What I want to achieve is that for each product_id, the variable "sanctioned" equals one if at least one of the list of CN8 codes it is comprised of (e.g. 61061000 61062000 61069010 61069030 61069050 61069090) is part of the CN8 codes of the first dataset, and equals zero otherwise. As for as I know, there is no command to directly merge when a unique value of CN8-product codes is part of a list of codes in the variable that I am merging on. Is there anyone who has a suggestion on how I can achieve this.
Thanks in advance
Kind Regards,
Nels Westerhuis
I have run into a problem while trying to merge two datasets.
The first dataset contains two variables. CN8, which is a variable that contains unique values of eight-digit codes. Like 2012090. The other variable "sanctioned" equals 1 for each of the 2837 observations. So the dataset is a list of products that are covered by the sanction I study.
The second dataset also contains two variables. The first one is "product_id", which is ranging from 1 to 3124 for all observations. The second is CN8. The way this dataset differs from the first dataset is that the creators here aggregated to some extent on CN8-product codes. So where the first dataset contained only one eight-digit code, the second one contains a group of CN8-product codes for each observation. To illustrate, one observation has a value that equals 27 for product_id and 61061000 61062000 61069010 61069030 61069050 61069090 for CN8. The number of CN8 codes differs across observations. Some product_id's only contain one CN8 code, most are a list of multiple codes and some even contain 50 codes or more.
I want to merge on CN8. What I want to achieve is that for each product_id, the variable "sanctioned" equals one if at least one of the list of CN8 codes it is comprised of (e.g. 61061000 61062000 61069010 61069030 61069050 61069090) is part of the CN8 codes of the first dataset, and equals zero otherwise. As for as I know, there is no command to directly merge when a unique value of CN8-product codes is part of a list of codes in the variable that I am merging on. Is there anyone who has a suggestion on how I can achieve this.
Thanks in advance
Kind Regards,
Nels Westerhuis
Comment