Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • loop to import many Excel files

    Hi,

    I've created a file using the filelist command

    filelist , dir("C:/Users/mikel/OneDrive/Desktop/results") pattern(recon2022_*.xlsx) save("files.dta")

    The resulting file files.dta contains the usual three variables: dirname filename fsize. I've added to it another variable for simpler naming of stata files that will result via

    gen fips=substr(filename,11,5)

    I'm having trouble importing any of the with a loop, however. I'm sure it's because I don't entirely understand the syntax conventions of loops in Stata. I've tried both forvalues and foreach. For example,

    local N=_N
    forvalues i=1/'N' {
    use "files.dta" in 'i',clear
    local f = dirname + "/" + filename
    import excel "'f'", sheet("vectors") firstrow clear
    local source = fips
    .
    .
    .
    gen source="'source'"
    save "C:/Users/mikel/OneDrive/Desktop/results/vectors_stata/'source'.dta", replace
    }


    I've also tried

    foreach filename {
    clear
    import excel "C:/Users/mikel/OneDrive/Desktop/results/'filename'", sheet("vectors") firstrow clear
    local source = fips
    .
    .
    .
    gen source="'source'"
    save "C:/Users/mikel/OneDrive/Desktop/results/vectors_stata/'source'.dta", replace
    }


    I can import the files one at a time as long as I specify the full directory and filename. There are only 52 of them (one for each of the 50 states, DC, and the US). But I know I must be close and would hate to cave into such rudimentary approach. What am I doing wrong?

  • #2
    So there are several problems with the code you show. First, -local N = _N- will not work properly because at the time you execute it, there is no file in memory. Your local macro syntax is wrong. Local macros are not referenced by enclosure between two single quotes (' '). They are referenced by enclosure between a backtick (`) [immediately to the left of the 1! key on US keyboards] and a single quote ('). Next, your -import excel- command wipes out the files data in memory, so when you get to the second iteration of your loop, the key data is absent. So to make this approach work, you would need to re-load files.dta on each iteration of the loop. (That said, I think it is better to use an approach that does not wipe out files.dta in the first place and avoid thrashing the disk on every iteration.) The syntax -use "files.dta" in `i'- is incorrect. Assuming your intent is to read in the i'th observation in files.dta, the correct syntax would be -use in `i' using files.dta-.

    So:
    Code:
    clear*
    use files
    
    frame create excel
    
    forvalues i = 1/`=_N' {
        local path = dirname[`i'] + "/" + filename[`i']
        local source = fips[`i']
        frame excel {
            import excel `"`path'"', sheet("vectors") firstrow clear
            gen source = `"`source'"'
            // PERHAPS OTHER DATA MANAGEMENT
            save "C:/Users/mikel/OneDrive/Desktop/results/vectors_stata/`source'.dta", replace
        }
    }

    Comment


    • #3
      Ah, the backtick was certainly an issue. I obviously didn't gather that clearly from other posts on the matter. The rest, I would have probably gotten eventually by either dropping and substituting. Thanks so much, Clyde!! It's working. Next, appending after adding a label variable in each file with fips and merging with a master file to do a little fracreg and betareg.

      Comment

      Working...
      X