I think it may not be clear what I mean by a running append file.
I have yearly files for a dataset: 2002Data, 2001Data, etc., up to 2022Data. I also have a single file called Mapping. I want to merge each yearly file m:1 on key1 with the Mapping file, keep only the observations with _merge==3, and then append them all together.
This is a very straightforward loop: merge each yearly file, keep _merge==3, and save the result. Then, at the end of the loop, append all the (merged+cleaned) files. The only problem is that each yearly file is huge, about 6 GB. I don't know how big each yearly merged+cleaned file will be. I don't have enough space on my PC to save each cleaned file and then append them all into yet another file that is as large as all of them combined. I want a running append that appends right after merging+cleaning, so there is only one big file in the end.
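To make the idea concrete, here is a minimal sketch of what such a running append might look like in Stata. It is only an illustration under assumptions: the years are taken to run from 2001 to 2022, all files are assumed to be .dta files in the working directory, and AllYears is a hypothetical name for the single output file.

```stata
* Sketch only. Assumptions: years run 2001-2022, all files are .dta files in
* the working directory, and "AllYears" is a hypothetical output name.
clear
save AllYears, emptyok replace        // start with an empty running file

forvalues y = 2001/2022 {
    use "`y'Data", clear
    * keep(match) keeps only _merge==3; nogenerate drops the _merge variable
    merge m:1 key1 using Mapping, keep(match) nogenerate
    append using AllYears             // add everything accumulated so far
    save AllYears, replace            // overwrite the single running file
}
```

With this approach no per-year cleaned file is written to disk, but the accumulated data has to fit in memory at each step, which may or may not be feasible with files of this size.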
Any help will be highly appreciated.