  • New version of -filelist- on SSC

    Thanks to Kit Baum, a new version of filelist is now available on SSC.

    filelist searches a directory for files that match a specific pattern and continues searching for files recursively in all its subdirectories. Search results either replace the data in memory or are saved as a Stata dataset on disk.
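
    As a minimal sketch of the second mode, the following saves the search results to disk and then inspects them. The directory path and the dta_files.dta file name are placeholders. The option names directory(), save() and replace and the variable names dirname and filename are assumptions here, so check help filelist before relying on them.

    ```stata
    * Sketch only: the path and the output file name are placeholders.
    * Search the directory and its subdirectories for Stata datasets and
    * save the results to disk instead of replacing the data in memory.
    filelist, directory("/Users/robert/Documents") pattern("*.dta") save("dta_files.dta") replace

    * Load the saved results and show up to ten of the largest files.
    use "dta_files.dta", clear
    gsort -fsize
    list dirname filename fsize if _n <= 10
    ```

    Saving to disk leaves the data currently in memory untouched until the results are loaded explicitly with use.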

    This new version is written in Mata to overcome a problem with directories that contain a very large number of files (whose names cannot all fit into a single macro).

    I've also found a way to discover the file size with virtually no overhead, so the search results include the file size in bytes as well.

    Here's an example where I search my whole Documents directory for all Stata datasets:

    Code:
    . cd "/Users/robert/Documents"
    /Users/robert/Documents
    
    . filelist, pattern(*.dta)
    Number of files found = 7675
    
    . sum fsize
    
        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
           fsize |      7672     5222725    4.10e+07        278   1.52e+09
    
    . dis %20.0gc r(sum)
          40,068,745,379

    To update to the new version, type in Stata's command window:

    Code:
    adoupdate filelist, update

    To install filelist for the first time, type:

    Code:
    ssc install filelist

  • #2
    Robert, thank you for the program.

    The help file shows nor as the minimal abbreviation of the norecursive option, but the option is only accepted when at least norec is spelled out:

    Code:
    . filelist, nor
    option nor not allowed
    r(198);
    
    . filelist, norec
    Number of files found = 2


    • #3
      You are correct, and thanks for pointing out the discrepancy. I've updated the program to accept the minimal nor option. I'll hold on for a few days before sending the updated version to Kit, just in case something else comes up.


      • #4
        Thanks to Kit Baum, a new version of filelist is now available on SSC.

        The new version incorporates the following changes:
        • The minimum syntax of the norecursive option is now nor (thanks to Friedrich for pointing this out in #2).
        • The Stata timers are no longer cleared; thanks again to Friedrich for pointing out the issue in this post.
        • Zero-length files now show a file size of 0 (previously, the fsize variable contained a missing value).

        With the most recent Stata 13.1 update (revision 03 Jun 2015), filelist can now report the size of files larger than 2GB when used with a 64-bit version of Stata.

        To update to the new version, type in Stata's command window:

        Code:
        adoupdate filelist, update

        To install filelist for the first time, type:

        Code:
        ssc install filelist


        • #5
          Robert Picard, I was the original poster who asked about recursively obtaining file lists. I've been semi-converted to R, but I'm back with another extension question! Is it possible to extend this package to include information on when each file was last modified? I thought I had solved it in R, but it didn't work, so I'm back to my old favourite, Stata, to help me!
