  • extracting filenames in a directory

    Hello: I am trying to extract all filenames in a selected directory and list them in a variable for processing and matching.

    This topic is related to this old post:
    https://www.stata.com/statalist/arch.../msg01030.html

    However, the directories I want to work with contain many files, 100,000 and counting.

    Using the command:
    Code:
    local list : dir . files "*"
    does not work: it fails with a "too many filenames" error, reporting that a str variable with more than 65,536 unique values cannot be encoded.

    There is another package available called -filelist- that also has a limit of 10,000 files.

    Are there any solutions for handling such long file lists in a directory?

    Thanks!

  • #2
    I can't tell from what you wrote if you are trying to extract the files in a single directory or in multiple directories. Either way, try breaking the task into pieces. If you have a single directory, the pieces might correspond to all files whose names begin with a, those beginning with b, etc. If it's multiple directories, each directory might be a chunk. Then use -filelist- in a loop over those chunks, and append the results as you go. So, for example, if it's a single directory with 100,000 files you could do this:

    Code:
    clear
    tempfile building
    save `building', emptyok
    
    foreach x in `c(alpha)' {
        filelist, pattern(`x'*) replace
        append using `building'
        save `"`building'"', replace
    }
    At the end, both tempfile `"`building'"' and active memory will contain all the files (whose names begin with a lower case letter). Modify this to come up with a scheme that will cover all the files you need.

    Note: code not tested--consider only as a prototype.


    • #3
      Another possibility would be to forgo a purely Stata-ish solution and use the facilities of your operating system to write the output of its directory command to a text file, then import that file into Stata. For example, under Windows, you could do:
      Code:
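      cd WhateverDirectory
      * write bare file names, one per line, to a text file
      !dir /b * > SomeFile.txt
      * no header row; tab as delimiter so names containing commas are not split
      import delimited using SomeFile.txt, clear varnames(nonames) delimiters("\t")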
