
What is the fastest way to load a big 2D int array from a file?

I'm loading a 2D array from a file; it holds 15,000,000 * 3 ints (it will be 40,000,000 * 3 eventually). Right now I read the ints sequentially with dataInputStream.readInt(), which takes ~15 seconds. Can I make it significantly (at least 3x) faster, or is this about as fast as it gets?

fhucho asked Dec 09 '22 14:12



1 Answer

Yes, you can. From a benchmark of 13 different ways of reading files:

If you have to pick the fastest approach, it would be one of these:

  • FileChannel with a MappedByteBuffer and array reads.
  • FileChannel with a direct ByteBuffer and array reads.
  • FileChannel with a wrapped array ByteBuffer and direct array access.
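As a rough illustration of the first option, here is a minimal sketch that maps the file and copies it into an int array with one bulk read. It assumes the file contains nothing but big-endian 4-byte ints (the layout DataOutputStream.writeInt() produces and DataInputStream.readInt() expects) and that it is smaller than 2 GB, the limit for a single mapping. The class name MappedIntLoader and the flat int[] layout are my own choices for the example, not something from the benchmark; element (row, col) of an N * 3 array sits at index row * 3 + col.

```java
import java.io.IOException;
import java.nio.ByteOrder;
import java.nio.IntBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedIntLoader {

    /** Loads every int in the file into one flat array (hypothetical helper). */
    static int[] load(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // Map the whole file; a single mapping cannot exceed Integer.MAX_VALUE bytes.
            IntBuffer ints = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size())
                               .order(ByteOrder.BIG_ENDIAN) // same byte order as DataInputStream
                               .asIntBuffer();
            int[] data = new int[ints.remaining()];
            ints.get(data); // one bulk copy instead of one call per int
            return data;
        }
    }
}
```

A flat array also avoids allocating millions of tiny 3-element row arrays, which may matter at 40,000,000 rows. Whether this reaches the 3x speedup asked for depends on the disk and JVM, so measure it on your own data.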

For the best Java read performance, there are 4 things to remember:

  • Minimize I/O operations by reading an array at a time, not a byte at a time. An 8 KB array is a good size (which is why it is the default buffer size of BufferedInputStream).
  • Minimize method calls by getting data an array at a time, not a byte at a time. Use array indexing to get at bytes in the array.
  • Minimize thread synchronization locks if you don't need thread safety. Either make fewer method calls to a thread-safe class, or read through a non-thread-safe class such as ByteBuffer or MappedByteBuffer (obtained via FileChannel).
  • Minimize data copying between the JVM/OS, internal buffers, and application arrays. Use FileChannel with memory mapping, or a direct or wrapped array ByteBuffer.
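The four points above can be combined in a chunked read through a direct ByteBuffer. The following is only a sketch under the same assumption as the question (a file of big-endian 4-byte ints); ChunkedIntLoader and the chunkBytes parameter are names invented for this example. Each pass does one channel read per chunk and one bulk get() per chunk, with no per-int method calls.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChunkedIntLoader {

    /** Reads the file chunk by chunk into one flat int array (hypothetical helper). */
    static int[] load(Path file, int chunkBytes) throws IOException {
        if (chunkBytes < Integer.BYTES) {
            throw new IllegalArgumentException("chunk must hold at least one int");
        }
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            int[] out = new int[(int) (ch.size() / Integer.BYTES)];
            // Direct buffers are big-endian by default, matching DataInputStream.
            ByteBuffer buf = ByteBuffer.allocateDirect(chunkBytes);
            int filled = 0;
            while (filled < out.length && ch.read(buf) != -1) {
                buf.flip();
                int n = Math.min(buf.remaining() / Integer.BYTES, out.length - filled);
                buf.asIntBuffer().get(out, filled, n);            // bulk copy of n ints
                buf.position(buf.position() + n * Integer.BYTES); // skip what was consumed
                filled += n;
                buf.compact(); // keep any partial int for the next read
            }
            return out;
        }
    }
}
```

Calling ChunkedIntLoader.load(path, 8 * 1024) follows the 8 KB suggestion; larger chunks (for example 1 MB) are worth trying too, since the best size depends on the machine.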
Adam Stelmaszczyk answered Dec 11 '22 03:12
