I have a problem where I want to import several sheets of an Excel file into Stata.
My code looks like this.
First, I just describe the excel file to store the number of worksheets, as in r(N_worksheet). Then, I open the first relevant sheet (named Sheet2), and save it as a .dta-file, in order to then loop through sheets 3 to N, appending them to the .dta file one after another, saving, and continuing the loop. I append by closing the "main file" that I got from the first sheet (Sheet2), namely d_datastream2.dta, then clearing and importing the respective next sheet from the excel file (3-N), saving it as d_datastream`i'.dta, closing it, opening the main file and appending using the just saved d_datastream`i'.dta file. In the end, I delete the additional .dta-files that were created in the process, and rename d_datastream2.dta to d_datastream.dta for further use.
So before the loop, I have the first sheet, and in each iteration of the loop, I add data from an additional sheet.
The Excelfile is named d_returnretrieve1.xlsm, the Sheets are just named Sheet2, ..., Sheet9
The problem is the following:
It gives me "invalid syntax" when running this snip. If I change the line
to
then it works flawlessly.
So I note that for some reason, the scalar variable N is not saved throughout the loop, or not accessible. However, the alternative,
i.e., trying to re-define N within the loop, gives the same error.
However, the following code, where I replace N with 9 in the loop specs:
correctly displays the N in every iteration of the loop.
What could be the problem?
My code looks like this.
First, I just describe the excel file to store the number of worksheets, as in r(N_worksheet). Then, I open the first relevant sheet (named Sheet2), and save it as a .dta-file, in order to then loop through sheets 3 to N, appending them to the .dta file one after another, saving, and continuing the loop. I append by closing the "main file" that I got from the first sheet (Sheet2), namely d_datastream2.dta, then clearing and importing the respective next sheet from the excel file (3-N), saving it as d_datastream`i'.dta, closing it, opening the main file and appending using the just saved d_datastream`i'.dta file. In the end, I delete the additional .dta-files that were created in the process, and rename d_datastream2.dta to d_datastream.dta for further use.
So before the loop, I have the first sheet, and in each iteration of the loop, I add data from an additional sheet.
The Excelfile is named d_returnretrieve1.xlsm, the Sheets are just named Sheet2, ..., Sheet9
Code:
clear *start with importing and saving the first sheet, and save number of worksheets: import excel using d_returnretrieve1.xlsm, sheet("Sheet2") firstrow scalar N = r(N_worksheet) save d_datastream2.dta, replace display N *import sheets 3 to end (number of sheets): forval i=3/N { *import sheet i and save as dummyfile import excel using d_returnretrieve1.xlsm, sheet("Sheet`i'") firstrow clear save d_datastream`i'.dta, replace *close sheet i, open sheet 2 (main file) use d_datastream2.dta, clear *append sheet i from dummyfile behind sheet 2 append using d_datastream`i'.dta *save new sheet 2 save d_datastream2.dta, replace } save d_datastream.dta, replace *delete the above dummyfiles forval i = 2/9 { erase d_datastream`i'.dta }
It gives me "invalid syntax" when running this snip. If I change the line
Code:
forval i=3/N {
Code:
forval i=3/9 {
So I note that for some reason, the scalar variable N is not saved throughout the loop, or not accessible. However, the alternative,
Code:
*import sheets 3 to end (number of sheets): forval i=3/N { *import sheet i and save as dummyfile import excel using d_returnretrieve1.xlsm, sheet("Sheet`i'") firstrow clear save d_datastream`i'.dta, replace *close sheet i, open sheet 2 use d_datastream2.dta, clear *append sheet i from dummyfile behind sheet 2 append using d_datastream`i'.dta *save new main file save d_datastream2.dta, replace *save number of worksheets again clear import excel using d_returnretrieve1.xlsm, describe scalar N = r(N_worksheet) }
However, the following code, where I replace N with 9 in the loop specs:
Code:
forval i=3/9 { *import sheet i and save as dummyfile import excel using d_returnretrieve1.xlsm, sheet("Sheet`i'") firstrow clear save d_datastream`i'.dta, replace *close sheet i, open sheet 2 use d_datastream2.dta, clear *append sheet i from dummyfile behind sheet 2 append using d_datastream`i'.dta *save new sheet 2 save d_datastream2.dta, replace *check number of worksheets again display N }
What could be the problem?
Comment