Reading a binary file in Java vs C++

I have a binary file (about 100 MB) that I need to read in quickly. In C++ I could just load the file into a char pointer and march through it by incrementing the pointer. This of course would be very fast.

Is there a comparably fast way to do this in Java?

poy asked Aug 07 '11 20:08



1 Answer

If you use a memory-mapped file or a regular buffer, you will be able to read the data as fast as your hardware allows.

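import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.LongBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class ReadBinaryFile {
    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("deleteme", "bin");
        tmp.deleteOnExit();
        int size = 1024 * 1024 * 1024; // 1 GB test file

        // Write the test file in 32 KB chunks from a direct buffer.
        long start0 = System.nanoTime();
        FileChannel fc0 = new FileOutputStream(tmp).getChannel();
        ByteBuffer bb = ByteBuffer.allocateDirect(32 * 1024).order(ByteOrder.nativeOrder());

        for (int i = 0; i < size; i += bb.capacity()) {
            fc0.write(bb);
            bb.clear();
        }
        fc0.close();
        long time0 = System.nanoTime() - start0;
        System.out.printf("Took %.3f ms to write %,d MB using ByteBuffer%n", time0 / 1e6, size / 1024 / 1024);

        // Read it back through a memory-mapped buffer, one long (8 bytes) at a time.
        long start = System.nanoTime();
        FileChannel fc = new FileInputStream(tmp).getChannel();
        MappedByteBuffer buffer = fc.map(FileChannel.MapMode.READ_ONLY, 0, size);
        LongBuffer longBuffer = buffer.order(ByteOrder.nativeOrder()).asLongBuffer();
        long total = 0; // used to prevent a micro-optimisation.
        while (longBuffer.remaining() > 0)
            total += longBuffer.get();
        fc.close();
        long time = System.nanoTime() - start;
        System.out.printf("Took %.3f ms to read %,d MB MemoryMappedFile%n", time / 1e6, size / 1024 / 1024);

        // Read it back with plain channel reads into the 32 KB buffer, one byte at a time.
        long start2 = System.nanoTime();
        FileChannel fc2 = new FileInputStream(tmp).getChannel();
        bb.clear();
        while (fc2.read(bb) > 0) {
            // flip() switches the buffer from filling to draining; without it
            // remaining() is 0 after a full read and no bytes are consumed.
            bb.flip();
            while (bb.remaining() > 0)
                total += bb.get();
            bb.clear();
        }
        fc2.close();
        long time2 = System.nanoTime() - start2;
        System.out.printf("Took %.3f ms to read %,d MB File via NIO%n", time2 / 1e6, size / 1024 / 1024);
    }
}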

On my machine this prints (timings will vary with hardware and caching):

Took 305.243 ms to write 1,024 MB using ByteBuffer
Took 286.404 ms to read 1,024 MB MemoryMappedFile
Took 155.598 ms to read 1,024 MB File via NIO

This is for a file 10x larger than the one you want to read. It's this fast because the data is being cached in memory (and I have an SSD). If you have fast hardware, the data can be read pretty fast.
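
For a file of about 100 MB, the closest analogue to loading everything behind a char pointer is probably to read the whole file into a byte[] and step through it with a wrapped ByteBuffer. The following is only a minimal sketch under assumptions: the class name WholeFileRead, the method sumLongs, and the idea that the file holds native-order longs are all made up for illustration, and it only works while the file fits in a single array (roughly 2 GB) and in the heap.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Path;

public class WholeFileRead {
    /** Hypothetical helper: loads the whole file and sums it as native-order longs. */
    static long sumLongs(Path path) throws IOException {
        byte[] data = Files.readAllBytes(path); // entire file in one array
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.nativeOrder());
        long total = 0;
        // "March through" the data; any trailing bytes short of a full long are ignored.
        while (buf.remaining() >= Long.BYTES) {
            total += buf.getLong();
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny sample file containing the longs 1, 2 and 3.
        Path tmp = Files.createTempFile("sample", ".bin");
        try {
            ByteBuffer out = ByteBuffer.allocate(3 * Long.BYTES).order(ByteOrder.nativeOrder());
            out.putLong(1).putLong(2).putLong(3);
            Files.write(tmp, out.array());
            System.out.println(sumLongs(tmp)); // prints 6
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The trade-off against the memory-mapped version is that the data is copied onto the Java heap up front, which keeps the code simple but costs as much heap as the file is large.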

Peter Lawrey answered Sep 27 '22 22:09