Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Importing and merging multiple Excel files from a folder, but skipping when the relevant sheet is not present

    Hello there,
    I am trying to merge appr. 400 excel files, ID1, ID2 and so on, into a dataset. Each file contains multiple sheets, day1, day2 etc. up to day 9, but not all sheets are present in each file.
    How do I loop over the files, skipping a file when it does not contain the relevant sheet?

    Up till now my code is:
    cd "C:\Users\Tjur\Dropbox\Studie\ny"

    local allfiles : dir . files "*.xls"
    display `allfiles'


    foreach file in `allfiles' {
    import excel using `file', sheet("day1") firstrow clear

    save "`noextension'"_day1, replace
    }

    And this code is only to import the "day1" sheet. My plan is then to merge the sheets by ID and append all the merged files to one dataset. However I can't find the solution for skipping when a sheet is not present. I have tryed using the command capture, but can't quite get it right. Also, it is likely, that there is a more elegant way about this??

    Any suggestions?
    Kind regards/
    Marianne

  • #2
    You can trap the error that occurs when Stata attempts to read a non-existent sheet. This entails using the -capture- command, and examining the return code (_rc) that every Stata command leaves behind. Then, you can contingently adjust what gets executed depending on the value of _rc. It sounds like you want to skip the entire file if any of the sheets day1-day9 are missing, which is the purpose of the continue, break in the loop. I have not tested this, but it should be pretty close.
    Code:
    foreach file in `allfiles' {
       forval i = 1/9 {
          capture import excel using `file', sheet("day`i'") firstrow clear
          if (_rc != 0) {
             display "Problem with day`i' in file = `file'"
             continue, break   // break out of forvalues, proceed to next file
          }
          else {   // no problem
             save "SomeFileName_day`i', replace
          }
       }   
    }

    Comment


    • #3
      It sounds like you need to know the number of sheets present in a given Excel file. Maybe you can get that using the describe option to import excel? You might then want to run through all files and in each file import all sheets using information from the describe option and the return results (return list)?

      Comment


      • #4
        Also see xls2dta (SSC).

        Code:
        cd "C:\Users\Tjur\Dropbox\Studie\ny"
        xls2dta , sheets(day?) : import excel *.xls , firstrow
        Followed perhaps by

        Code:
        xls2dta , clear : merge ...
        Best
        Daniel

        Comment

        Working...
        X