Hello,
I am trying to use "import delimited" to import data from multiple .csv files. Each .csv contains 180 columns (variables), and I am interested in only a few of them. The variable names are in the first row of each .csv. Is there a way to tell Stata to import only those columns whose header (variable name) is X, Y, or Z?
I am already aware that:
- I could import the whole .csv and then "keep" the variables I need. However, this is extremely inefficient (each .csv is 1 GB and I have 120 of them), so it is not a viable option. I want to import only the variables I need to speed up the process.
- "import delimited" has the option "colrange", but there are two problems with it. First, colrange asks for the number of the column I want to import, and I know only the variable name, not its column number. Second, my variables of interest are not next to one another, so they do not form a contiguous block of columns, whereas colrange appears to require a contiguous range of column numbers.
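To make the second point concrete, what I am after is selection by header name rather than by column position. Outside Stata, this could be sketched as a streaming pre-filter, for example in Python. This is only a rough illustration under assumptions: `filter_columns` is a hypothetical helper of my own (not part of Stata or any library), the files are assumed to be comma-separated, UTF-8 encoded, and rectangular, and the column names are placeholders. It reads a .csv once, row by row, and writes only the named columns to a smaller file:

```python
import csv


def filter_columns(src_path, dst_path, wanted):
    """Copy only the columns named in `wanted` from src_path to dst_path.

    Hypothetical helper: assumes a comma-separated, UTF-8 file whose
    first row holds the variable names and whose rows all have the
    same number of fields.
    """
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)

        header = next(reader)  # first row: variable names

        # Fail early if a requested name is not in the header
        missing = [name for name in wanted if name not in header]
        if missing:
            raise KeyError(f"columns not found in {src_path}: {missing}")

        # Positions of the wanted columns; they need not be contiguous
        positions = [header.index(name) for name in wanted]

        writer.writerow(wanted)
        for row in reader:  # one row in memory at a time
            writer.writerow([row[i] for i in positions])


# Example call (placeholder file and column names):
# filter_columns("big.csv", "small.csv", ["X", "Y", "Z"])
```

The smaller file could then be read in full with "import delimited", but this means an extra pass over each of the 120 files outside Stata, which is why I am asking whether Stata can do the selection directly.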
Any help or guidance would be greatly appreciated.
Best,
Edoardo
PS: Stata version: 16.0 - OS: both Windows and macOS.