
  • Question about scanning through directory and compress all stata files

    Hello, I wanted to ask whether I might run into any issues running the code below. We have several terabytes of datasets that the people I work for never compress, and we are now running out of disk space. I ran the code below on a subfolder, and it finished without any immediate problems.

    I wanted to ask some more experienced fellas whether they can see any issues with this code, since losing any of the datasets would be disastrous!

    What I am doing is creating a list of all .dta files in that directory, compressing each one, and saving it if, and only if, compress actually made it smaller.

    Code:
    clear all
    cd "path"
    
    * Get list of all datasets in directory
    local files_to_compress : dir . files "*.dta"
    
    * Loop over .dta files in directory
    foreach file of local files_to_compress {
        * Display current file and load it
        di "`file'"
        use "`file'", clear
    
        * Check the size of the file before compression and start compression
        local width = c(width)
        compress
    
        * If file was compressed, save it, else load next dataset
        if (c(width))<`width' save, replace
    }

  • #2
    Potential problems:
    1. local macro files_to_compress might hit the limits if the list of files is long
    2. e(sample), if originally contained in the files, might be lost; see option all in save
    3. value labels that are not attached to variables might be lost; see option orphans in save
    4. if the original file was saved in an older version of Stata, older versions of Stata can no longer open the replaced file; see saveold
    There might be more issues.

    Best
    Daniel
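
    A minimal sketch of how points 2 and 3 might be folded into the original loop. It assumes the working directory has already been set with cd, as in the first post, and it has not been tested on the poster's data, so it should be tried on copies first. Points 1 and 4 are not handled here: a very long file list would need a different way of collecting file names, and files that must stay readable by older Stata releases would need saveold instead of save.

    ```stata
    * Sketch only: run on copies of the data before using it on originals
    local files_to_compress : dir . files "*.dta"

    foreach file of local files_to_compress {
        display as text "`file'"

        * Quote the name so that file names containing spaces still load
        use "`file'", clear

        * Record the observation width before and compare after compress
        local width = c(width)
        compress

        if (c(width) < `width') {
            * orphans keeps value labels not attached to any variable
            * all keeps e(sample), if the dataset contains it
            save "`file'", replace orphans all
        }
    }
    ```

    The comparison on c(width) is kept from the original post, so a file is rewritten only when compress reduced the width of an observation.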
