Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Appending sheets from odbc connection (xlsx. and xlsm)

    Hi Stata,

    I have multiple files in xlsx and xlsm, and I wanted to append them using a loop in Stata 14. I have used the command below, which works perfectly; however, I have hundreds to append.

    clear
    set more off
    odbc load, dsn(workplan) table(Aliho$) lowercase clear
    ren workplan_period period
    ren thematic__area thematic
    save "C:\Users\Workplan stata\aliho.dta", replace

    clear
    set more off
    odbc load, dsn(workplan) table(Comm$) lowercase clear
    ren workplan_period period
    ren var15 dateend
    save "C:\Users\Workplan stata\comm.dta", replace

    clear
    set more off
    odbc load, dsn(workplan) table(Deng$) lowercase clear
    ren var16 duration
    save "C:\Users\Workplan stata\deng.dta", replace

    * append
    clear
    set more off
    cd "C:\Users\Workplan stata"
    use "aliho.dta", clear
    append using "comm.dta"
    append using "deng.dta"

    * renvar
    ren workplan_period period
    ren thematic__area thematic
    ren activity__reference__number arn
    ren component_ component
    ren title_of__activity title
    ren expected__output output
    ren var9 type
    ren var11 personsinvolved
    ren var12 loccitymuni
    ren location_barangay locbrgy
    ren var14 datestart
    ren var15 dateend
    ren var16 duration
    ren var17 pax
    ren var18 linkthematic
    ren var19 category
    ren var22 cohort
    ren var23 status
    ren var25 sharepoint

    * save
    save append.dta, replace


    Thanks,
    Bernie



    CAPI Specialist
    Data Manager

  • #2
    If your path only contains dta files for appending, this should work:
    Code:
    clear all
    local filelist: dir "." files "*.dta"
    foreach file of local filelist {
        append using `file'
    }
    Hopefully all the common variables are stored as same data types.

    Comment


    • #3
      Thanks Daniel,

      Is there a way to loop these tables (below)? I have plenty of these, and I'm looking for the simplest way to do it.
      odbc load, dsn(workplan) table(Aliho$) lowercase clear
      odbc load, dsn(workplan) table(Comm$) lowercase clear
      odbc load, dsn(workplan) table(Deng$) lowercase clear
      odbc load, dsn(workplan) table(Ting$) lowercase clear

      Append....
      or
      Merge...

      Save...
      CAPI Specialist
      Data Manager

      Comment


      • #4
        There's a user-written command -xls2dta- (installable from SSC) that occasionally comes up on the List in the context of tasks similar to what you seem to need to do. I've not used it before, but I understand that it can be used to append a bunch of Excel worksheets into a Stata file.

        If you're committed to using ODBC and you have a list of the names of the worksheets that you want to append, then you can loop through it to construct a UNION statement in SQL and execute that. I show an illustration of that below as well as of an alternative approach that uses a -tempfile- for successive -append- via a loop over the worksheet names. Begin at the "Begin here" comment; the top part just creates an Excel workbook for use in illustration.

        .ÿ
        .ÿversionÿ17.0

        .ÿ
        .ÿclearÿ*

        .ÿ
        .ÿ//ÿseedem
        .ÿsetÿseedÿ1094971057

        .ÿ
        .ÿprogramÿdefineÿcreatEm
        ÿÿ1.ÿÿÿÿÿversionÿ17.0
        ÿÿ2.ÿÿÿÿÿsyntaxÿ,ÿworkbook(string)ÿworksheet(string)
        ÿÿ3.ÿ
        .ÿÿÿÿÿdropÿ_all
        ÿÿ4.ÿÿÿÿÿquietlyÿsetÿobsÿ3
        ÿÿ5.ÿ
        .ÿÿÿÿÿgenerateÿdoubleÿnumvarÿ=ÿruniform()
        ÿÿ6.ÿÿÿÿÿgenerateÿstrÿstrvarÿ=ÿstrofreal(numvar,ÿ"%21x")
        ÿÿ7.ÿ
        .ÿÿÿÿÿquietlyÿexportÿexcelÿusingÿ"`c(pwd)'/`workbook'.xlsx",ÿ///
        >ÿÿÿÿÿÿÿÿÿsheet("`worksheet'")ÿfirstrow(variables)
        ÿÿ8.ÿend

        .ÿ
        .ÿforeachÿworksheetÿinÿAlihoÿCommÿDengÿTingÿ{
        ÿÿ2.ÿÿÿÿÿcreatEmÿ,ÿworkbook(MyWorkbook)ÿworksheet(`worksheet')
        ÿÿ3.ÿ}

        .ÿ
        .ÿ*
        .ÿ*ÿBeginÿhere
        .ÿ*
        .ÿlocalÿworksheet_listÿAlihoÿCommÿDengÿTing

        .ÿ
        .ÿlocalÿworksheet_tallyÿ:ÿwordÿcountÿ`worksheet_list'

        .ÿforvaluesÿiÿ=ÿ1/`=`worksheet_tally'ÿ-ÿ1'ÿ{
        ÿÿ2.ÿÿÿÿÿlocalÿworksheetÿ:ÿwordÿ`i'ÿofÿ`worksheet_list'
        ÿÿ3.ÿÿÿÿÿlocalÿsql_statementÿ`sql_statement'ÿ///
        >ÿÿÿÿÿÿÿÿÿSELECTÿ*ÿFROMÿ[`worksheet'$]ÿUNIONÿALL
        ÿÿ4.ÿ}

        .ÿlocalÿworksheetÿ:ÿwordÿ`worksheet_tally'ÿofÿ`worksheet_list'

        .ÿlocalÿsql_statementÿ`sql_statement'ÿSELECTÿ*ÿFROMÿ[`worksheet'$];

        .ÿ
        .ÿlocalÿdsnÿExcelÿFiles;DBQ=MyWorkbook.xlsx;DefaultDir=`c(pwd)'/;

        .ÿ
        .ÿodbcÿload,ÿexec("`sql_statement'")ÿdsn("`dsn'")ÿlowercaseÿclear

        .ÿlist,ÿnoobsÿseparator(0)

        ÿÿ+-----------------------------------+
        ÿÿ|ÿÿÿÿnumvarÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿstrvarÿ|
        ÿÿ|-----------------------------------|
        ÿÿ|ÿÿ.7110077ÿÿÿ+1.6c093379fc443X-001ÿ|
        ÿÿ|ÿ.90618659ÿÿÿ+1.cff7b025d8361X-001ÿ|
        ÿÿ|ÿ.00689478ÿÿÿ+1.c3db37fc2f780X-008ÿ|
        ÿÿ|ÿ.47972728ÿÿÿ+1.eb3da100d09caX-002ÿ|
        ÿÿ|ÿ.80179354ÿÿÿ+1.9a84aea7dbd2fX-001ÿ|
        ÿÿ|ÿÿ.4324123ÿÿÿ+1.baca4a75d64d6X-002ÿ|
        ÿÿ|ÿ.03380472ÿÿÿ+1.14eda302d2410X-005ÿ|
        ÿÿ|ÿ.59611193ÿÿÿ+1.3135953b54ac3X-001ÿ|
        ÿÿ|ÿ.13480088ÿÿÿ+1.14127b3226ebcX-003ÿ|
        ÿÿ|ÿ.93968116ÿÿÿ+1.e11de384c4f53X-001ÿ|
        ÿÿ|ÿ.46822838ÿÿÿ+1.df7742d0d20eaX-002ÿ|
        ÿÿ|ÿ.46707704ÿÿÿ+1.de49715e1d2faX-002ÿ|
        ÿÿ+-----------------------------------+

        .ÿ
        .ÿ//ÿAlternatively
        .ÿtempfileÿcache

        .ÿquietlyÿimportÿexcelÿusingÿMyWorkbook.xlsx,ÿdescribe

        .ÿ
        .ÿlocalÿworksheet_list

        .ÿforvaluesÿiÿ=ÿ`r(N_worksheet)'(-1)1ÿ{
        ÿÿ2.ÿÿÿÿÿlocalÿworksheet_listÿ`worksheet_list'ÿ`r(worksheet_`i')'
        ÿÿ3.ÿ}

        .ÿ
        .ÿlocalÿfirstÿ1

        .ÿforeachÿworksheetÿofÿlocalÿworksheet_listÿ{
        ÿÿ2.ÿÿÿÿÿquietlyÿimportÿexcelÿusingÿMyWorkbook.xlsx,ÿsheet("`worksheet'")ÿ///
        >ÿÿÿÿÿÿÿÿÿfirstrowÿcase(lower)ÿclear
        ÿÿ3.ÿ
        .ÿÿÿÿÿifÿ`first'ÿ{
        ÿÿ4.ÿÿÿÿÿÿÿÿÿquietlyÿsaveÿ`cache'
        ÿÿ5.ÿÿÿÿÿÿÿÿÿlocalÿfirstÿ0
        ÿÿ6.ÿÿÿÿÿ}
        ÿÿ7.ÿÿÿÿÿelseÿ{
        ÿÿ8.ÿÿÿÿÿÿÿÿÿappendÿusingÿ`cache'
        ÿÿ9.ÿÿÿÿÿÿÿÿÿquietlyÿsaveÿ`cache',ÿreplace
        ÿ10.ÿÿÿÿÿ}
        ÿ11.ÿ}

        .ÿlist,ÿnoobsÿseparator(0)

        ÿÿ+-----------------------------------+
        ÿÿ|ÿÿÿÿnumvarÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿstrvarÿ|
        ÿÿ|-----------------------------------|
        ÿÿ|ÿÿ.7110077ÿÿÿ+1.6c093379fc443X-001ÿ|
        ÿÿ|ÿ.90618659ÿÿÿ+1.cff7b025d8361X-001ÿ|
        ÿÿ|ÿ.00689478ÿÿÿ+1.c3db37fc2f780X-008ÿ|
        ÿÿ|ÿ.47972728ÿÿÿ+1.eb3da100d09caX-002ÿ|
        ÿÿ|ÿ.80179354ÿÿÿ+1.9a84aea7dbd2fX-001ÿ|
        ÿÿ|ÿÿ.4324123ÿÿÿ+1.baca4a75d64d6X-002ÿ|
        ÿÿ|ÿ.03380472ÿÿÿ+1.14eda302d2410X-005ÿ|
        ÿÿ|ÿ.59611193ÿÿÿ+1.3135953b54ac3X-001ÿ|
        ÿÿ|ÿ.13480088ÿÿÿ+1.14127b3226ebcX-003ÿ|
        ÿÿ|ÿ.93968116ÿÿÿ+1.e11de384c4f53X-001ÿ|
        ÿÿ|ÿ.46822838ÿÿÿ+1.df7742d0d20eaX-002ÿ|
        ÿÿ|ÿ.46707704ÿÿÿ+1.de49715e1d2faX-002ÿ|
        ÿÿ+-----------------------------------+

        .ÿ
        .ÿeraseÿ`c(pwd)'/MyWorkbook.xlsx

        .ÿ
        .ÿexit

        endÿofÿdo-file


        .


        I don't know how much of -import excel- was available in Stata Release 14.

        Comment


        • #5
          Do you need to have the xls* files as ODBC? If not, importing directly from the xls* files as Joseph suggested would be a better approach (I believe import excel is available in Stata 14). Here's some code you can try:
          Code:
          clear all
          tempfile tempdata
          foreach table in Aliho Comm Deng Ting {
          odbc load, dsn(workplan) table(`table'$) lowercase clear
          your code for cleanup
          save `tempdata', replace
          clear
          capture use outfile, clear
          append using `tempdata'
          save outfile, replace
          }

          Comment


          • #6
            Thanks, Daniel and Joseph, for sharing your expertise; this is what I need.
            Just to share with you why I used an ODBC connection on my Excel dataset: this is because sometimes Stata will not load data from Excel. "Element not found."

            Thanks again,

            Bernie



            CAPI Specialist
            Data Manager

            Comment


            • #7
              That makes me think there's something funny with how the sheets are named in the excel file, like spaces or special characters. Check the elements that import excel can't find. You may need to wrap the names in double quotes or use some macro escapes.

              Comment


              • #8
                Yes, you are correct. i attached notification from failed stata import. I made many tries and failed.
                Attached Files
                CAPI Specialist
                Data Manager

                Comment


                • #9
                  This error message has been sporadically reported on the List for years, for example, here, here and here.

                  It appears to result from some occasional peculiar corruption of the Excel workbook when it is created in some environments. You might want to take a look at some the workarounds that are suggested in one or another of threads linked to above.

                  First, though, I recommend that you make sure that your Stata Release 14 installation has been updated as recommended by the StataCorp staff member here.

                  Comment


                  • #10
                    Perhaps open the original XLSX file and resave as XLS, then try to import. I looked through Joseph's links and for someone this resolved the issue. If could also post a portion of offending file sans sensitive data, we could take a stab at it.

                    Comment


                    • #11
                      Hi Daniel,

                      I'm trying to do MERGE, failed again. here's my code.

                      clear
                      tempfile awp
                      local files axejF3uqKGPbhNtuFtcDws pax
                      local vars index
                      save `awp', emptyok
                      foreach sheetx in axejF3uqKGPbhNtuFtcDws pax {
                      import excel using "https://qwerty.org/api/v1/data/734768.xlsx", sheet(`sheetx') firstrow case(lower)
                      *data clean here
                      gen source = `sheetx'
                      merge m:1 index using `index', keepusing(`v')
                      save awp.dta, replace
                      }


                      CAPI Specialist
                      Data Manager

                      Comment


                      • #12
                        There are some issues with your code, or you're using some pseudo code.
                        1. Before you were appending (i.e. adding observations) the sheets together. Now it seems like you're making your data wide. What is your intention? Do the sheets contain different data for same individuals/groups, or are the data distinct? This will change how you combine the data.
                        2. Your local macros are not being used at all, and they don't seem necessary.

                        Comment


                        • #13
                          Apologies for not explaining well.
                          This is totally a different dataset, and I'm trying to do a merge or wide.
                          This link, https://qwerty.org/api/v1/data/734768.xlsx, contains two sheets. Sheet 1 (parent name) and sheet 2 (child name).

                          My sample data.
                          ParentName Index ChildName ParentIndex
                          John Doe 1 Precy 1
                          Tistan 1
                          Bonna 2 Trish 2
                          Luna 2

                          My goal should be like the sample below. Merge 1:m
                          ParentName Index ChildName ParentIndex
                          John Doe 1 Precy 1
                          John Doe 1 Tistan 1
                          Bonna 2 Trish 2
                          Bonna 2 Luna 2

                          I hope I was able to explain it well to you.

                          CAPI Specialist
                          Data Manager

                          Comment


                          • #14
                            See next post.
                            Last edited by Daniel Shin; 20 Feb 2023, 15:58. Reason: Apologies, error in posting.

                            Comment


                            • #15
                              I couldn't access the linked file (domain did not resolve) to check, but this only works if you have only two sheets and the first sheet contains index with no duplicates.

                              Code:
                              clear
                              tempfile awp
                              save `awp', emptyok
                              foreach sheetx in axejF3uqKGPbhNtuFtcDws pax {
                              import excel using "https://qwerty.org/api/v1/data/734768.xlsx", sheet(`sheetx') firstrow case(lower)
                              *data clean here
                              gen source = `sheetx'
                              capture merge m:1 index using `awp', keepusing(varlist) // varlist from the original sheet
                              save `awp', replace
                              }
                              Consider this as an exercise in using macros and loops; the code is too complex in my opinion.

                              Comment

                              Working...
                              X