Memory-mapping huge files in Java

Is it possible to memory-map huge files (multiple GBs) in Java?

This method of FileChannel looks promising:

MappedByteBuffer map(FileChannel.MapMode mode, long position, long size)

Both position and size allow for 64-bit values -- so far, so good.

MappedByteBuffer, however, only provides methods for 32-bit positions (get(int index), position(int newPosition), etc.), which seems to imply that I cannot map files larger than 2 GB.

How can I get around this limitation?

Tony the Pony asked Mar 22 '19 13:03



2 Answers

Take a look at the code in "Using a memory mapped file for a huge matrix", which shows how to create a list of MappedByteBuffers, each smaller than 2 GB, that together map the entire file:

private static final int MAPPING_SIZE = 1 << 30; // 1 GiB per mapping, below the 2 GB limit
...
long size = 8L * width * height; // 8 bytes per matrix element
for (long offset = 0; offset < size; offset += MAPPING_SIZE) {
    // The last mapping may be shorter than MAPPING_SIZE.
    long size2 = Math.min(size - offset, MAPPING_SIZE);
    mappings.add(raf.getChannel().map(FileChannel.MapMode.READ_WRITE, offset, size2));
}
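
To read or write through such a list with a 64-bit index, you divide the index by the chunk size to pick the buffer and take the remainder as the position inside it. Here is a minimal sketch of that idea; the class name ChunkedMappedFile and its methods are made up for illustration and are not part of the linked code or the JDK:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: exposes several mappings as one long-indexed byte region. */
final class ChunkedMappedFile implements AutoCloseable {
    private final FileChannel channel;
    private final List<MappedByteBuffer> chunks = new ArrayList<>();
    private final long chunkSize;
    private final long size;

    /** Maps the first {@code size} bytes of the file in pieces of at most {@code chunkSize} bytes. */
    ChunkedMappedFile(Path path, long size, int chunkSize) throws IOException {
        this.channel = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
        this.chunkSize = chunkSize;
        this.size = size;
        for (long offset = 0; offset < size; offset += chunkSize) {
            // The last chunk may be shorter than chunkSize.
            long length = Math.min(size - offset, chunkSize);
            // A READ_WRITE mapping extends the file if it is shorter than offset + length.
            chunks.add(channel.map(FileChannel.MapMode.READ_WRITE, offset, length));
        }
    }

    /** Reads one byte at a 64-bit index: the quotient selects the chunk, the remainder the position. */
    byte get(long index) {
        return chunks.get((int) (index / chunkSize)).get((int) (index % chunkSize));
    }

    /** Writes one byte at a 64-bit index. */
    void put(long index, byte value) {
        chunks.get((int) (index / chunkSize)).put((int) (index % chunkSize), value);
    }

    long size() {
        return size;
    }

    /** Closes the channel; the buffers themselves stay mapped until they are garbage-collected. */
    @Override
    public void close() throws IOException {
        chunks.clear();
        channel.close();
    }
}
```

With a chunk size of 1 << 30, something like new ChunkedMappedFile(path, fileSize, 1 << 30).get(5_000_000_000L) would read from the fifth buffer. The sketch only handles single bytes; a multi-byte value that straddles a chunk boundary would need extra handling. The loop above does not run into that, because MAPPING_SIZE is a multiple of 8 and every element is 8 bytes wide.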

As per JDK-6347833, "(fs) Enhance MappedByteBuffer to support sizes >2GB on 64 bit platforms", the reason for the 2 GB limit is:

A MappedByteBuffer is a ByteBuffer with additional operations to support memory-mapped file regions. To support mapping a region larger than Integer.MAX_VALUE would require a parallel hierarchy of classes. For now the only solution is create multiple MappedByteBuffers where each corresponds to a region that is no larger than 2GB.
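
Note that map() itself enforces the limit, even though its size parameter is a long: the Javadoc requires size to be no greater than Integer.MAX_VALUE and specifies an IllegalArgumentException otherwise. A small sketch to see this for yourself (the class and method names are made up for illustration):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MapLimitDemo {
    /** Returns true if map() rejects a size one byte above Integer.MAX_VALUE. */
    static boolean rejectsOversizedMapping(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            long tooBig = Integer.MAX_VALUE + 1L; // 2 GiB, one byte over the allowed maximum
            channel.map(FileChannel.MapMode.READ_ONLY, 0, tooBig);
            return false; // not expected: the mapping was accepted
        } catch (IllegalArgumentException expected) {
            return true; // size argument rejected before any mapping takes place
        }
    }
}
```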

Karol Dowbecki answered Oct 16 '22 06:10


As mentioned, MappedByteBuffer has the 2 GB limitation because its index and position values are ints.

To get around that, you could use an alternative implementation such as larray.

Gonzalo Matheu answered Oct 16 '22 08:10