Import multiple .txt files and create a .dta file with all of them

Rodrigo Saurin

Join Date: Aug 2016

Posts: 22
#1


29 Aug 2016, 15:00

Dear all

I am working with a dataset that is spread across several .txt files. The files are named like AC2002ID.txt, BA2002ID.txt, etc., where the first two letters identify the state and the next four digits give the year (years run from 2002 to 2014). How can I load all of these .txt files, for every state and year, and end up with a single .dta file that combines everything?

Thank you
Sergiy Radyakin

Join Date: Apr 2014

Posts: 1867
#2

29 Aug 2016, 15:16

Perhaps something along the lines of:

http://dataservices.gmu.edu/files/St...pend_files.pdf

Best, Sergiy Radyakin
Clyde Schechter

Join Date: Apr 2014

Posts: 30061
#3

29 Aug 2016, 16:09

I'll just add this advice to Sergiy's. Of the two approaches outlined in the PDF he links to, I strongly recommend the "Step-by-Step" process. The reason is that in most real-world compendia of data, you will find inconsistencies among the files you are sent. The same variable can have different names in different files. Worse, different variables can have the same name in different files. Coding of variables can differ from one file to the next. And particularly with .csv or Excel files, what is a string variable in one file can be numeric in another. The step-by-step approach enables you to stop after each file has been separately brought into Stata, explore each file, and run data-cleaning scripts to make all of the files conform to identical variable naming, coding, data storage types, and formatting. Then you can run the loop to put them all together.
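For readers who cannot open the linked PDF, here is a rough sketch of what a step-by-step workflow might look like in Stata. It is not taken from the PDF. It assumes the .txt files are delimited text with variable names in the first row, that they sit in the current working directory, and that none of them already contains variables called file_state or file_year (both names are made up for this illustration, as is the output name all_states_years.dta). If the files are fixed-width instead, the import command would need to be replaced with something like infix or infile plus a dictionary.

```stata
* Step 1: import each raw .txt file and save it as its own .dta file.
* Assumption: delimited text, header row, names like AC2002ID.txt.
local files : dir "." files "*ID.txt", respectcase
foreach f of local files {
    import delimited using "`f'", clear varnames(1)

    * Recover state and year from the file name (hypothetical variable names).
    generate str2 file_state = upper(substr("`f'", 1, 2))
    generate int  file_year  = real(substr("`f'", 3, 4))

    * AC2002ID.txt -> AC2002ID.dta
    local stub = subinstr("`f'", ".txt", "", .)
    save "`stub'.dta", replace
}

* Step 2 (interactive, one file at a time): inspect and clean.
* For example:
*   use "AC2002ID.dta", clear
*   describe
*   codebook, compact
* Fix names, types, and coding here, then re-save the file.

* Step 3: once every file conforms, append them all.
clear
local dtas : dir "." files "*ID.dta", respectcase
foreach d of local dtas {
    append using "`d'"
}
save "all_states_years.dta", replace
```

The point of splitting the work this way is that Step 2 happens while each file is still separate, so a fix that applies only to one state or year can be applied to that file alone.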

If you use the all-at-once approach and there are problems of the type I've mentioned, cleaning up the combined file can be a nightmare, because the fixes needed for data from different files are different and often contradictory.

The only circumstance under which I would use the all-at-once approach is if I knew for certain that the data in the files are entirely consistent with each other in all of these respects. I would only feel comfortable assuming that if the source of the data were a very high quality data curator whose data sets I had previously worked with and found to meet these standards. It's pretty uncommon in practice, at least in my field.
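For comparison, an all-at-once version might look roughly like the sketch below. It makes the same assumptions as any such loop (delimited text with a header row, files in the current working directory), and the names source_file and all_states_years.dta are invented for illustration. Because it uses a temporary file, it should be run in one go from a do-file.

```stata
* All-at-once sketch: import and append in a single pass.
* Only appropriate if the files are known to be consistent.
clear
tempfile combined
save `combined', emptyok

local files : dir "." files "*ID.txt", respectcase
foreach f of local files {
    import delimited using "`f'", clear varnames(1)

    * Keep track of where each observation came from (hypothetical name).
    generate source_file = "`f'"

    append using `combined'
    save `combined', replace
}

save "all_states_years.dta", replace
```

By default, append stops with an error when a variable is string in one dataset and numeric in the other, so that kind of type conflict will at least surface. Renamed variables and inconsistent coding will not trigger any error, which is why this version is only safe when the files are already known to agree.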