Dear Statalist members,
Assume the following problem. A folder contains many files that have names in the following format:
A_C.dta, A_D.dta, A_E.dta, B_A.dta, B_C.dta, B_D.dta, B_F.dta, E_H.dta, E_G.dta, E_K.dta.
The names could also consist of numbers in the same form: 2568975_112565.dta, 2568975_130520.dta, 2568975_999980.dta. The key point is that the part before the underscore (_) identifies a "family", a common component.
The files are in no particular order, and there can be hundreds of them. The exact set of files is not known beforehand.
What I would like to do is append all datasets that share the same component before the underscore (_), that is, all datasets in the same family. For example, append A_C.dta, A_D.dta, A_E.dta, or append 2568975_112565.dta, 2568975_130520.dta, 2568975_999980.dta.
After the append is done, I'd like the combined file to be saved under a name of the form firm_number.dta, for example firm_1.dta, firm_2.dta, firm_3.dta, etc. Each number identifies a specific family. For example, number 1 might indicate the "A_" files, while 2 might indicate the "B_" files. That is, the numbering of the file names should follow the order of the families.
Also, the files that have been used for the append must be deleted, so that only the appended files remain at the end: firm_1.dta, firm_2.dta, firm_3.dta, etc.
Is there a way to deal with filenames in the way described above?
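To make the intended grouping concrete, here is a minimal sketch of just the filename logic, written in Python rather than Stata. It does not append or delete anything; it only shows which files would end up in which firm_number.dta. The function name plan_appends is hypothetical, and the sketch assumes that families are numbered in plain alphabetical (string) order of the part before the first underscore, which is only one possible reading of "order".

```python
from collections import defaultdict


def plan_appends(filenames):
    """Map each output name (firm_1.dta, firm_2.dta, ...) to its input files.

    Assumption: the family is everything before the FIRST underscore, and
    families are numbered in plain string sort order.
    """
    families = defaultdict(list)
    for name in filenames:
        # Ignore anything that is not a .dta file with an underscore in its name.
        if not name.endswith(".dta") or "_" not in name:
            continue
        family = name.split("_", 1)[0]
        families[family].append(name)

    plan = {}
    for number, family in enumerate(sorted(families), start=1):
        plan[f"firm_{number}.dta"] = sorted(families[family])
    return plan


files = ["B_C.dta", "A_D.dta", "E_H.dta", "A_C.dta", "B_A.dta", "A_E.dta"]
for output, members in plan_appends(files).items():
    print(output, "<-", ", ".join(members))
```

With the example list above, the A_ files would go into firm_1.dta, the B_ files into firm_2.dta, and the E_ file into firm_3.dta. Note that with string ordering a numeric family such as 2568975 sorts before any letter family, and that an existing firm_1.dta would itself look like a member of a family called "firm" if the procedure were run twice on the same folder.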
