
  • Save only if dataset was compressed

    Hi

    I often work with very large datasets and save a lot of space by compressing them. The same do-files sometimes have to handle a new version of a dataset (because I updated it or added some new columns), which then needs to be compressed again. I then -save, replace- the dataset so that the compressed version is what is stored on disk.

    I usually compress and save manually when I have a new dataset, but I would like to automate it. If I understand correctly, the compress check is often quite quick, but the save command can take a lot of time. Is there a way to have Stata save the dataset only if compress actually changed it? Something like:

    Code:
    compress
    if compress==True:
       save, replace
    Last edited by Karl Tjensvoll; 17 May 2019, 05:10. Reason: clarified

  • #2
    Perhaps

    Code:
    program mycompress 
         compress 
         if c(changed) save, replace 
    end
    This saves even if compress didn't change the dataset but something earlier did. Naturally, you can check for changes made only by compress:

    Code:
    program mycompress 
         local changed = c(changed) 
         compress 
         if `changed' == 0 & c(changed) save, replace 
    end



    • #3
      We have had discussions about what is flagged by c(changed) and what is not. It turns out that compress never changes c(changed). Therefore, neither version of mycompress will work as intended. I suggest relying on c(width) instead.

      Code:
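      program savecompressed
          version 11.2
          // optional -replace- allows overwriting when there are unsaved changes
          syntax [ , REPLACE ]
          // stop with error 4 if the data changed since last saved and -replace- was not given
          if (c(changed) & ("`replace'"!="replace")) error 4
          // remember the width of the dataset (in bytes) before compressing
          local width = c(width)
          compress
          // save only if compress reduced the width
          if (c(width)<`width') save , replace
      end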
      Best
      Daniel
      Last edited by daniel klein; 18 May 2019, 00:01.



      • #4
        daniel klein Good catch.



        • #5
          Originally posted by daniel klein
          We have had discussions about what is flagged by c(changed) and what is not. It turns out that compress never changes c(changed). Therefore, neither version of mycompress will work as intended. I suggest relying on c(width) instead.

          Code:
          program savecompressed
          version 11.2
          syntax [ , REPLACE ]
          if (c(changed) & ("`replace'"!="replace")) error 4
          local width = c(width)
          compress
          if (c(width)<`width') save , replace
          end
          Best
          Daniel
          Excellent, I was not fully aware of these -creturn- values, and this looks like exactly what I need. I am not familiar with the start of your program (the syntax line and the first if statement), but I should be safe with this shorter version:

          Code:
          program define savecompressed
              local width = c(width)
              compress
              if (c(width)<`width') save, replace
          end
          Last edited by Karl Tjensvoll; 18 May 2019, 06:14.
