Hi Everyone,
I am working with Stata 17 on Mac.
I have project-level data on energy production across provinces in Canada. My goal is to do some analysis at the firm-month level across Canada by combining these datasets.
My problem is that each province publishes its own data, so the identifiers are not consistent across datasets. Moreover, so datasets have changed their coding over time. For example, in one dataset, Shell is coded as "SHELL", while in the other it is coded as "Shell Canada Energy"
Data from B.C.:
Data from Alberta:
The best solution I can think of is to manually recode all identifiers, but there are hundreds of firms across hundreds of months. So, I'm hoping there is another method that doesn't involve me manually scanning through all of the firm identifiers
Thank you for all of your suggestions/solutions
Jerome
I am working with Stata 17 on Mac.
I have project-level data on energy production across provinces in Canada. My goal is to do some analysis at the firm-month level across Canada by combining these datasets.
My problem is that each province publishes its own data, so the identifiers are not consistent across datasets. Moreover, so datasets have changed their coding over time. For example, in one dataset, Shell is coded as "SHELL", while in the other it is coded as "Shell Canada Energy"
Data from B.C.:
Code:
* Example generated by -dataex-. For more info, type help dataex clear input str16(OPERATOR well_id) "SHELL" "100162907819W600" "SHELL" "100090708018W600" end
Data from Alberta:
Code:
* Example generated by -dataex-. For more info, type help dataex clear input str43 OPERATOR str11 BATTERY "Shell Canada Energy" "ABBT0147964" "Shell Canada Energy" "ABBT0143302" end
The best solution I can think of is to manually recode all identifiers, but there are hundreds of firms across hundreds of months. So, I'm hoping there is another method that doesn't involve me manually scanning through all of the firm identifiers
Thank you for all of your suggestions/solutions
Jerome

Comment