  • Error message when using "collapse": I/O error writing .dta file

    Dear all,

    I am using Stata/MP 13.1. I want to collapse a dataset using the collapse command.

    ----------------------------------------------------------------------------------------------------------------

    collapse (sum) sum_amount_of_loan=amount_of_loan, by(year type_of_action county_ID)

    ----------------------------------------------------------------------------------------------------------------

    However, I receive the following error message:

    ----------------------------------------------------------------------------------------------------------------

    I/O error writing .dta file
    Usually such I/O errors are caused by the disk or file system being full.
    r(693);

    ----------------------------------------------------------------------------------------------------------------

    The dataset is quite big with 16,751,979 observations. If I only use a small subsample of the data, the command works, so it seems to be a size issue. However, with previous similar datasets (of even bigger size), I did not have this problem.

    Using the search function, I only found this topic: http://www.statalist.org/forums/foru...torage-problem where, however, the problem was not solved.

    Any help on how to deal with this issue is much appreciated.


    Best regards

    Carlo

  • #2
    As the other thread you linked to suggests, the problem is probably created by a -preserve- command in the code underlying -collapse-. -preserve- creates a temporary copy of your data set on disk, which is what is likely leading to the I/O error. Try using the fast option with -collapse- so that it doesn't preserve (but first I would make sure your data set is backed up somewhere). If that doesn't work, try running -compress- before your collapse command to reduce the space taken up by your data set by optimizing variable types.
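
    The two suggestions above could be combined roughly as follows. This is a minimal sketch, not a tested fix: the backup file name is hypothetical, and the collapse line is the one from the original post with only the fast option added.

    ```stata
    * Back up first: -collapse, fast- skips -preserve-, so the original data
    * cannot be restored if the command is interrupted.
    * "backup_before_collapse.dta" is a hypothetical file name.
    save "backup_before_collapse.dta", replace

    * Store each variable in the smallest type that holds its values.
    compress

    * Same collapse as in the original post, with the fast option added.
    collapse (sum) sum_amount_of_loan=amount_of_loan, by(year type_of_action county_ID) fast
    ```

    If the -save- step itself stops with r(693), that would point to free disk space rather than to -collapse-.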


    • #3
      Hi Sean, thank you for your reply.

      In the meantime, the same error message has shown up in another do-file of mine when I was trying to merge two datasets using the -merge- command. So it seems to be a more fundamental issue, not directly related to the -collapse- command. I will write here once I have solved the issue.

      Best regards

      Carlo


      • #4
        If Carlo is running Stata on a UNIX system, it is possible that the filesystem in which Stata writes temporary files is where the problem lies. Following the advice in the Stata temporary files FAQ that Google led me to, I ran the following test.

        Code:
        . tempfile gnxl
        
        . display "`gnxl'"
        /var/folders/xr/lm5ccr996k7dspxs35yqzyt80000gp/T//S_01933.000001
        Now this was run on Stata/SE 13.1 for Mac (64-bit Intel) Revision 19 Dec 2014, but OS X is really UNIX under the hood, so I'd expect something similar on UNIX systems. I've used too many UNIX systems where filesystems like /var and /tmp were not large enough to hold files of the size that Stata may find it necessary to create, as well as systems where old temporary files (like those described in the FAQ) were not cleaned up automatically as part of periodic housekeeping. In both cases system administrators need to be involved in resolving the problems that result.
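
        A related check, offered as a sketch rather than as part of the FAQ's advice: the c(tmpdir) system value reports the directory Stata uses for temporary files in the current session, which shows which drive or filesystem needs the free space. The macro name check is arbitrary.

        ```stata
        * Directory Stata uses for temporary files in this session.
        display "`c(tmpdir)'"

        * A tempfile name created now should sit under that directory.
        tempfile check
        display "`check'"
        ```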

        I'm looking forward to hearing what Carlo discovers.


        • #5
          If the use of preserve within collapse was the problem (since preserve creates a temporary copy of your data set, and your data set is large), the same problem could be occurring in merge.

          If you search the ado file for merge, you will see that it also uses preserve (snippet of the source code below):
          Code:
                  
               if (!r(sorted)) {
                      preserve
                      qui use "`using'", clear
                      sort `varlist'
                      tempfile using
                      qui save "`using'", replace
                      restore
               }
          Since this is the only time preserve appears in merge.ado, if the underlying problem is in fact the use of preserve, you can avoid it by sorting your data set first using sort <variables you are merging on>, so that r(sorted) will equal 1 and the block above will be skipped.
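
          A minimal sketch of that workaround, under stated assumptions: the file names and the merge keys (county_ID year) are hypothetical, and the 1:1 merge type is only an example. Since the snippet above loads and sorts the using data set, the sketch sorts and saves that file as well as the master. Whether this actually skips the preserve step depends on the merge.ado in your installation.

          ```stata
          * Hypothetical file names and merge keys, for illustration only.
          * Sort the using data set on the merge keys and save the sorted copy.
          use "using_data.dta", clear
          sort county_ID year
          save "using_data_sorted.dta", replace

          * Sort the master data set on the same keys, then merge.
          use "master_data.dta", clear
          sort county_ID year
          merge 1:1 county_ID year using "using_data_sorted.dta"
          ```

          Note that saving the sorted copy also writes to disk, so it will fail in the same way if the drive is full.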
          Last edited by Sean Higgins; 23 Jan 2015, 16:52. Reason: Still figuring out formatting code on Statalist


          • #6
            Dear all,

            For the record, here is the solution to my problem, even though it is not really related to Stata:

            To make it short: the hint in the Stata error message, "Usually such I/O errors are caused by the disk or file system being full.", was exactly the problem in my case. My C: drive, where my Stata version was installed, was 99.9% full and, for reasons unrelated to Stata, this was not displayed properly. I have since repartitioned my hard drives, and all the commands now work.

            Thanks a lot though to Sean and William for their suggestions!

            Best regards

            Carlo


            • #7
              Hi all,

              I updated my profile.do with the line below, and this issue never bothered me again.

              Code:
              set checksum off
              best,
              Pablo
              Best,
              Pablo Bonilla
