Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Problem combining imported excel sheets in one Stata file

    Dear Statalisters,
    I am trying to import several Excel sheets within one workbook and save them individually as Stata files, then loop over the saved datasets and append them in one Stata file. I was inspired by previous posts to write the code below, but I keep receiving this error message "no; data in memory would be lost". Only the first sheet is saved. I am attaching the file for easy reference and verification. Any help is highly appreciated.

    Kind regards,

    Imed.

    code:
    import excel using gdpg.xlsx, describe
    forvalues i=1/`=r(N_worksheet)' {
    local sheetname=r(worksheet_`i')
    import excel using gdpg.xlsx, sheet("`sheetname'") firstrow
    tempfile file_`sheetname'
    save "`file_`sheetname''" , replace
    }

    * loop over the saved datasets and append them
    clear
    use "`file_sheet1'"
    forvalues i=2/`=r(N_worksheet)' {
    append using "`file_sheet`i''"
    }

    Attached Files

  • #2
    Please show exactly what you typed into Stata and what you got back, enclosing all of this within CODE delimiters, as explained in the Forum FAQ. (If you don't know what I mean, please re-read the FAQ.) You've provided a commentary on your problem, whereas exact details are likely to be essential to know about and, in addition, simple copy/paste is much less legible than material in CODE. (Also, be aware that many readers are unlikely to download Excel spreadsheeets.)

    Comment


    • #3
      While I would second Stephen's advice, I believe you receive the error message since your "import excel" statement within the loop does not contain the option "clear", thus, when Stata comes back to the import excel command for the second time, it finds data already loaded into memory, and you do not tell Stata to "clear" the memory first.

      Comment


      • #4
        You save the files as `file_`sheetname'' in the first block of code, then try to append `file_sheet`i'' in the second block. This will not work.

        The problem you report is likely produced by the line

        Code:
        import excel using gdpg.xlsx, sheet("`sheetname'") firstrow
        in the first part of your code, because there is no clear option. First time thru the code Stata saves the imported file. The second time thru the loop Stata will not import the next Excel sheet, because the one imported before is still in memory.

        Again, some advertisement for xls2dta (SSC) which does essentially what the first part of Imed's code does, by typing

        Code:
        xls2dta using gdpg.xlsx ,allsheets
        I am currently about to finish an update that will facilitate appending the files on the fly, so that the underlying problem can be solved by one line of code.

        Best
        Daniel
        Last edited by daniel klein; 22 May 2015, 02:50.

        Comment


        • #5
          Daniel's comment is more complete, I admit I haven't looked past the import excel thing, so Imed, please use Daniel's suggestions.
          Also, @ Daniel: thanks for pointing to the xls2dta package (and for the time and effort you probably spent writing it), if it works as advertised, that should be pretty helpful, especially on-the-fly appending.

          Comment


          • #6
            Thank you all for the feedback and suggestions.
            Daniel xls2dta is a great shortcut. However, I do not understand, however, why the individual files have the variable names line as data. 'firstrow' does not seem to be an option. Is there a way to get around the problem? I hope the updated version includes the option of appending the saved file into one file. Automating this procedure would be a big plus. Thanks again.

            Comment


            • #7
              Daniel xls2dta is a great shortcut. However, I do not understand, however, why the individual files have the variable names line as data. 'firstrow' does not seem to be an option. Is there a way to get around the problem?
              xls2dta is, as stated, really just a fancy wrapper for import excel. Thus, any options allowed with the latter may be specified. Coding

              Code:
              xls2dta using gdpg.dta ,allsheets firstrow
              should work just fine.

              Best
              Daniel
              Last edited by daniel klein; 22 May 2015, 04:05.

              Comment


              • #8
                For whatever it's worth here is the revised code. It is working fine now. I used merge instead of append, but this was not the issue.

                Kind regards,

                Imed.

                Code:
                import excel using gdpg.xlsx, describe
                clear
                forvalues i=1/`=r(N_worksheet)' {
                  local sheetname=r(worksheet_`i')
                  clear
                  import excel using gdpg.xlsx, sheet("`sheetname'") firstrow
                  tempfile file_`sheetname'
                  save "`file_`sheetname''" ,
                }
                
                * loop over the saved datasets and merge them
                clear
                use "`file_sheet1'"
                forvalues i=2/`=r(N_worksheet)' {
                  merge using "`file_sheet`i''"
                }

                Comment

                Working...
                X