  • #16
    OK, that means your data set is too large to be contained in memory along with the matrix that -runby- needs to create from it. This is one gigantic data set.

    Try adding the -useappend- option to your -runby- command and see if it will run that way. The -useappend- approach is not quite as speedy as the normal way -runby- works, but it uses less memory.

    If that doesn't work, you will need to break your data set into chunks, each chunk consisting of all the observations for some set of schools. Run the code on each chunk and save the results, and then at the end -append- all of those individual results together. The larger each chunk, the greater the efficiency gain, but each chunk has to be small enough that you don't run into memory problems with -runby-. It may take some trial and error to get that right.
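    A sketch of what that chunking might look like in code (the variable school, the file names mydata and results_*, and the chunk size of 500 are all hypothetical placeholders to replace with your own):

    Code:
    * Number the schools 1, 2, 3, ... so they can be split into ranges
    use mydata, clear                      // hypothetical file name
    egen school_id = group(school)         // hypothetical grouping variable
    save mydata_grouped, replace

    local chunk_size 500                   // tune by trial and error
    summarize school_id, meanonly
    local n_chunks = ceil(r(max)/`chunk_size')

    * Run the analysis on each chunk of schools and save its results
    forvalues i = 1/`n_chunks' {
        use mydata_grouped, clear
        keep if inrange(school_id, (`i'-1)*`chunk_size'+1, `i'*`chunk_size')
        * ... your -runby- command goes here ...
        save results_`i', replace
    }

    * Finally, -append- the per-chunk results into one data set
    use results_1, clear
    forvalues i = 2/`n_chunks' {
        append using results_`i'
    }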

    By the way, before doing any of that, you should test the code on a smallish subset of your data and check that the results look right. You don't want to invest a lot of time fine-tuning the approach only to get incorrect results in the end.



    • #17
      I haven't looked at the -runby- source code, but
      Code:
      <tmp>[132180040,1]
      sure looks like an attempt to allocate a vector with 132 million floats, which if repeated several times might take more space than you have in memory. The r(3900) return code is used for "out of memory" by at least one other command. Your file looks like it might compress to half its size, but that probably wouldn't affect the allocation. I have no good ideas if that is indeed the problem.
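      For scale: Mata stores real matrices in double precision (8 bytes per element), so a single allocation of that size would need roughly 1 GB, and a few of them would exhaust memory quickly. You can check the arithmetic with:

      Code:
      display 132180040 * 8 / 1024^3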
