
How do I avoid mapFailed() error when writing to large file on system with limited memory

I have just encountered an error in my open-source library code, which allocates a large buffer in order to modify a large FLAC file. The error only occurs on an old PC with 3 GB of memory running 32-bit Java 1.8.0_74 (25.74-b02).

Originally I used to just allocate a buffer

ByteBuffer audioData = ByteBuffer.allocateDirect((int)(fc.size() - fc.position()));

But for some time now I have used a mapped buffer instead:

MappedByteBuffer mappedFile = fc.map(MapMode.READ_WRITE, 0, totalTargetSize);

My (mis)understanding was that a mapped buffer uses less memory than a direct buffer, because the whole mapped buffer doesn't have to be in memory at the same time, only the part being used. But this answer says that using mapped byte buffers is a bad idea, so I'm not quite clear how it works:

Java Large File Upload throws java.io.IOException: Map failed

The full code can be seen here.

Paul Taylor asked Nov 11 '16 10:11


2 Answers

Although a mapped buffer may use less physical memory at any one point in time, it still requires available (logical) address space equal to the total (logical) size of the buffer. To make things worse, it probably requires that address space to be contiguous. For whatever reason, that old computer appears unable to provide sufficient additional logical address space. Two likely explanations are (1) a limited logical address space combined with hefty buffer memory requirements, and (2) some internal limit that the OS imposes on the amount of memory that can be mapped to a file for I/O.

Regarding the first possibility, consider the fact that in a virtual memory system every process executes in its own logical address space (and so has access to the full 2^32 bytes worth of addressing). So if--at the point in time in which you try to instantiate the MappedByteBuffer--the current size of the JVM process plus the total (logical) size of the MappedByteBuffer is greater than 2^32 bytes (~ 4 gigabytes), then you would run into an OutOfMemoryError (or whatever error/exception that class chooses to throw in its stead, e.g. IOException: Map failed).

Regarding the second possibility, probably the easiest way to evaluate this is to profile your program / the JVM as you attempt to instantiate the MappedByteBuffer. If the JVM process' allocated memory + the required totalTargetSize are well below the 2^32 byte ceiling, but you still get a "map failed" error, then it is likely that some internal OS limit on the size of memory-mapped files is the root cause.
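As a rough first look before reaching for a real profiler, something like the following could be logged just before the map() call. This is only a sketch: the class and method names are made up for illustration, "sun.arch.data.model" is a HotSpot-specific property that may be absent on other JVMs, and the heap figures cover only the Java heap, not the whole process (native memory, thread stacks, loaded libraries and existing mappings also consume address space).

```java
import java.util.Locale;

final class MapDiagnostics {
    /** Returns a one-line summary of JVM architecture hints, heap usage, and the requested mapping size. */
    static String describe(long requestedMapBytes) {
        Runtime rt = Runtime.getRuntime();
        long usedHeap = rt.totalMemory() - rt.freeMemory();
        // HotSpot-specific property; falls back to "unknown" on JVMs that do not define it.
        String dataModel = System.getProperty("sun.arch.data.model", "unknown");
        return String.format(Locale.ROOT,
                "os.arch=%s dataModel=%s heapUsed=%dMB heapMax=%dMB requestedMap=%dMB",
                System.getProperty("os.arch"), dataModel,
                usedHeap >> 20, rt.maxMemory() >> 20, requestedMapBytes >> 20);
    }
}
```

Calling MapDiagnostics.describe(totalTargetSize) right before fc.map(...) would at least show whether the JVM is 32-bit and how the requested size compares with the configured heap.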

So what does this mean as far as possible solutions go?

  1. Just don't use that old PC. (preferable, but probably not feasible)
  2. Make sure everything else in your JVM has as low a memory footprint as possible for the lifespan of the MappedByteBuffer. (plausible, but maybe irrelevant and definitely impractical)
  3. Break that file up into smaller chunks, then operate on only one chunk at a time. (might depend on the nature of the file)
  4. Use a different / smaller buffer, and just put up with the decreased performance. (this is the most realistic solution, even if it's the most frustrating)
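To illustrate options 3 and 4 together, here is a minimal sketch of editing a file region in place with one small, reusable heap buffer instead of one huge buffer. The class name, the ChunkEdit callback and the chunk size are all hypothetical; it assumes the modification can be applied to each chunk independently, which may not hold for every kind of edit.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

final class ChunkedRewrite {
    /** Hypothetical callback: edits the first 'length' bytes of 'data' in place. */
    interface ChunkEdit {
        void apply(byte[] data, int length);
    }

    /** Applies 'edit' to the file region [start, end), one chunk at a time, writing each chunk back in place. */
    static void rewriteInPlace(FileChannel fc, long start, long end, int chunkSize, ChunkEdit edit)
            throws IOException {
        if (chunkSize <= 0) throw new IllegalArgumentException("chunkSize must be positive");
        byte[] data = new byte[chunkSize];
        long pos = start;
        while (pos < end) {
            int length = (int) Math.min(chunkSize, end - pos);
            ByteBuffer buf = ByteBuffer.wrap(data, 0, length);
            // A positional read may return fewer bytes than requested, so loop until the chunk is full.
            while (buf.hasRemaining()) {
                if (fc.read(buf, pos + buf.position()) < 0) {
                    throw new EOFException("file is shorter than the requested region");
                }
            }
            edit.apply(data, length);
            buf.flip();
            // Write the edited chunk back to the same file offset.
            while (buf.hasRemaining()) {
                fc.write(buf, pos + buf.position());
            }
            pos += length;
        }
    }
}
```

Only chunkSize bytes of heap are needed regardless of the file size, so the 32-bit address space is no longer the limiting factor; the price is extra read/write system calls.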

Also, what exactly is the totalTargetSize for your problem case?


EDIT:

After doing some digging, it seems clear that the IOException is due to running out of address space in a 32-bit environment. This can happen even when the file itself is under 2^32 bytes, either due to the lack of sufficient contiguous address space, or due to other sufficiently large address space requirements in the JVM at the same time as the large MappedByteBuffer request (see comments). To be clear, an IOException can still be thrown rather than an OutOfMemoryError even if the original cause is ENOMEM. Moreover, there appear to be issues with older [insert Microsoft OS here] 32-bit environments in particular (example, example).

So it looks like you have three main choices.

  1. Use "the 64-bit JRE or...another operating system" altogether.
  2. Use a smaller buffer of a different type and operate on the file in chunks. (and take the performance hit due to not using a mapped buffer)
  3. Continue to use the MappedByteBuffer for performance reasons, but also operate on the file in smaller chunks in order to work around the address space limitations.

The reason I put using MappedByteBuffer in smaller chunks third is the well-established and unresolved problem of unmapping a MappedByteBuffer (example), which is something you would necessarily have to do between processing each chunk in order to avoid hitting the 32-bit ceiling through the combined address space footprint of accumulated mappings. (NOTE: this only applies if the 32-bit address space ceiling is the problem, and not some internal OS restriction. If it is the latter, ignore this paragraph.) You could attempt this strategy (delete all references, then run the GC), but you would essentially be at the mercy of how the GC and your underlying OS interact regarding memory-mapped files. Other potential workarounds that attempt to manipulate the underlying memory-mapped file more or less directly (example) are exceedingly dangerous and specifically condemned by Oracle (see last paragraph). Finally, considering that GC behavior is unreliable anyway, and moreover that the official documentation explicitly states that "many of the details of memory-mapped files [are] unspecified", I would not recommend using MappedByteBuffer like this regardless of any workaround you may read about.
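For reference, mapping the file one window at a time might look roughly like the sketch below. The class name, the WindowEdit callback and the window size are invented for illustration. Note that it does not solve the unmapping problem described above: Java 8 offers no supported way to release a mapping, so each window's address space is only reclaimed once its buffer has been garbage collected, and on a 32-bit JVM the mappings could still pile up.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;

final class WindowedMapping {
    /** Hypothetical callback: edits one mapped window; 'fileOffset' is where the window starts in the file. */
    interface WindowEdit {
        void apply(MappedByteBuffer window, long fileOffset);
    }

    /** Maps the file region [start, end) one window at a time and hands each window to 'edit'. */
    static void forEachWindow(FileChannel fc, long start, long end, int windowSize, WindowEdit edit)
            throws IOException {
        if (windowSize <= 0) throw new IllegalArgumentException("windowSize must be positive");
        for (long pos = start; pos < end; pos += windowSize) {
            long length = Math.min(windowSize, end - pos);
            // The channel must be open for both reading and writing for READ_WRITE mappings.
            MappedByteBuffer window = fc.map(MapMode.READ_WRITE, pos, length);
            edit.apply(window, pos);
            // Ask the OS to flush this window's changes to the file.
            window.force();
            // 'window' becomes unreachable here, but its mapping lives on until the GC collects it.
        }
    }
}
```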

So unless you're willing to take the risk, I'd suggest either following Oracle's explicit advice (point 1), or processing the file as a sequence of smaller chunks using a different buffer type (point 2).
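One case where chunking needs a little care is when the modification has to move data later in the file, for example to make room for a larger block near the start. I am only guessing that this is what the FLAC edit involves, and the names below are hypothetical. The sketch copies from the end of the file backwards with a small heap buffer, so no byte is overwritten before it has been read.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

final class TailShifter {
    /** Moves the bytes in [from, fileSize) forward by 'shift' bytes, growing the file by 'shift'. */
    static void shiftRight(FileChannel fc, long from, long shift, int chunkSize) throws IOException {
        if (shift < 0 || chunkSize <= 0) throw new IllegalArgumentException("shift >= 0 and chunkSize > 0 required");
        ByteBuffer buf = ByteBuffer.allocate(chunkSize);
        long end = fc.size(); // captured once; the file grows as soon as the first chunk is written
        while (end > from) {
            long chunkStart = Math.max(from, end - chunkSize);
            buf.clear();
            buf.limit((int) (end - chunkStart));
            // Read the last not-yet-moved chunk [chunkStart, end).
            while (buf.hasRemaining()) {
                if (fc.read(buf, chunkStart + buf.position()) < 0) {
                    throw new EOFException("file shrank while shifting");
                }
            }
            buf.flip();
            // Write it 'shift' bytes later; this only touches bytes already read or already moved.
            while (buf.hasRemaining()) {
                fc.write(buf, chunkStart + shift + buf.position());
            }
            end = chunkStart;
        }
    }
}
```

After the call, the region [from, from + shift) still holds its old contents and is free to be overwritten with the new data.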

Travis answered Sep 28 '22 05:09


When you allocate a buffer, you basically get a chunk of virtual memory from your operating system. This virtual memory is finite: the theoretical upper limit is your RAM plus whatever swap is configured, minus whatever was grabbed first by other programs and the OS.

A memory map just adds the space occupied by your on-disk file to your virtual memory (OK, there is some overhead, but not that much), so you can get more of it.

Neither of those has to be present in RAM constantly; parts of either could be swapped out to disk at any given time.

Konstantin Pribluda answered Sep 28 '22 07:09