Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Appending multiple excel files from several folders and capture specific cell as new var using XLS2DTA

    Hi Statalist,

    I wanted to append several Excel files from different folders into one dataset using XLS2DTA.
    All my variables are located in row 6, and I wanted to add a new variable (schoolid) from cell C3.

    This is my data source from the Excel forms.
    A B C D E F G H
    1 School Form 1
    2 Form 1, Master List
    3 School ID 133068 REGION 1 Division
    4 School Name ABDC ELEM SCHOOL School Year 2023
    5
    6 LRN NAME SEX BIRTHDATE AGE MOTHER TONGUE Ethnic Group
    7
    8 1330 AAAAAAAAAAAAA M 10-05-2015 7 Yakan Yakan
    9 6822 BBBBBBBBBBBBBB M 01-10-2010 13 Yakan Yakan
    10 1820 CCCCCCCCCCCCCC M 03-10-2016 7 Yakan Yakan
    Here's my code:

    clear
    set more off
    cd "C:\Users\School\SY2023-2024"
    xls2dta, gen (source schoolid) save("$temp") recursive replace : import excel "C:\Users\School\SY2023-2024", cellrange(A6) firstrow case(lower)

    local school_id = F[3] //not working/error
    gen schoolid = `school_id' // not working/error

    xls2dta, clear : append

    Thank you in advance.
    CAPI Specialist
    Data Manager

  • #2
    xls2dta (probably from SSC) is not helpful in this scenario. Here is a code snippet that might work for you (not tested though):

    Code:
    cd "C:\Users\School\SY2023-2024"
    
    clear
    
    tempfile build
    save "`build'" , emptyok
    
    filelist , pattern("*.xlsx")
    local N_files = c(N)
    
    forvalues i = 1/`N_files' {
        
        local filename = dirname[`i'] + c(dirsep) + filename[`i']
        
        clear
        
        import excel school_id = F using "`filename'" , cellrange(F3:F3)
        local school_id = school_id
        
        clear
        
        import excel using "`filename'" , cellrange(A6) firstrow case(lower)
        generate school_id = `school_id'
        
        append using "`build'"
        save "`build'" , replace
        
    }

    Comment


    • #3
      Hi Daniel Klein,

      The code you shared has no problem except for the error "dirname not found". Also, I noticed that the appended file has only one dta?
      BTW, I just copied and pasted your code.

      Thank you for your help.
      CAPI Specialist
      Data Manager

      Comment


      • #4
        Originally posted by Bernie Seville View Post
        The code you shared has no problem except for the error "dirname not found".
        [...]
        Also, I noticed that the appended file has only one dta?

        I was playing around a bit with frames but thought it would make the code more complicated. Then, I forgot to put the preserve and restore statements back into the final code. This should work:

        Code:
        cd "C:\Users\School\SY2023-2024"
        
        clear
        
        tempfile build
        save "`build'" , emptyok
        
        filelist , pattern("*.xlsx")
        local N_files = c(N)
        
        forvalues i = 1/`N_files' {
            
            local filename = dirname[`i'] + c(dirsep) + filename[`i']
            
            preserve
            
            clear
            
            import excel school_id = F using "`filename'" , cellrange(F3:F3)
            local school_id = school_id
            
            clear
            
            import excel using "`filename'" , cellrange(A6) firstrow case(lower)
            generate school_id = `school_id'
            
            append using "`build'"
            save "`build'" , replace
            
            restore
            
        }
        
        clear
        
        use "`build'"
        Last edited by daniel klein; 25 Apr 2024, 01:58.

        Comment


        • #5
          Thank you, Daniel. It works perfectly!
          CAPI Specialist
          Data Manager

          Comment

          Working...
          X