Hello everyone,
I am having trouble working out how to use the fillin command (and whether it is the best command to use) to achieve the following:
I have a dataset with four main dimensions: Year, Firm ID, Exported Product, and Destination Country. Each firm i exports a certain value of product p to destination j in year t. For each Year-Firm-Product combination, I want to generate observations for all destination countries that ever appeared for product p in the dataset. My Year variable takes values from 2005 to 2016, and I want this repeated for all years (i.e. for each firm-product, I want all destinations to which this product was exported).
This is an example of what my dataset looks like, assuming 2 firms, 1 product and 3 destinations:
Year | Firm ID | Product | Destination | Exported Value |
2005 | 1 | 1 | FRA | 100 |
2005 | 2 | 1 | FRA | 90 |
2005 | 2 | 1 | EGY | 70 |
2005 | 2 | 1 | TUN | 60 |
Product 1 is exported to 3 destinations overall (FRA, EGY, and TUN). I therefore want all of these destinations to appear for Firm 1 as well, with zero as the exported value, as below:
Year | Firm ID | Product | Destination | Exported Value |
2005 | 1 | 1 | FRA | 100 |
2005 | 1 | 1 | EGY | 0 |
2005 | 1 | 1 | TUN | 0 |
2005 | 2 | 1 | FRA | 90 |
2005 | 2 | 1 | EGY | 70 |
2005 | 2 | 1 | TUN | 60 |
I would appreciate any recommendation on how to achieve this. I have around 6 million observations in my dataset, so running "fillin Year Firm Product Destination" (and then dropping what is not needed) is not feasible on my computer: it expands the data to more than 200 million observations and exceeds my memory limits.
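For reference, one route that might avoid the full four-way fillin is to pair each observed Year-Firm-Product combination only with the destinations observed for that product, using joinby. This is only a sketch, not a tested solution: the variable names Year, Firm, Product, Destination and Value are assumed, and it assumes each Year-Firm-Product-Destination combination appears at most once in the data. It has to be run in one go from a do-file, because it relies on local macros for the temporary files.

```stata
* Sketch only: variable names are assumptions, adjust them to the real dataset.

* 1. Distinct Product-Destination pairs: every destination ever observed for each product
preserve
keep Product Destination
duplicates drop
tempfile dests
save `dests'
restore

* 2. Distinct Year-Firm-Product combinations, each paired with all destinations of its product
preserve
keep Year Firm Product
duplicates drop
joinby Product using `dests'
tempfile grid
save `grid'
restore

* 3. Add the missing rows to the original data and set their value to zero
*    (merge 1:1 fails if Year-Firm-Product-Destination is not unique in the data)
merge 1:1 Year Firm Product Destination using `grid', nogenerate
replace Value = 0 if missing(Value)
sort Year Firm Product Destination
```

The resulting number of rows is the number of Year-Firm-Product combinations times the number of destinations per product, which should be smaller than what the full fillin produces, though whether it fits in memory for the real data would still need checking.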
Many thanks in advance.
Nada