Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is it that FileInputStream read is slower with bigger array

If I read bytes from a file into a byte[] I see that FileInputStream performance worse when the array is around 1 MB compared to 128 KB. On the 2 workstations I have tested it is almost twice as fast with 128 KB. Why is that?

import java.io.*;

public class ReadFileInChuncks 
{
    public static void main(String[] args) throws IOException 
    {
        byte[] buffer1 = new byte[1024*128];
        byte[] buffer2 = new byte[1024*1024];

        String path = "some 1 gb big file";

        readFileInChuncks(path, buffer1, false);

        readFileInChuncks(path, buffer1, true);
        readFileInChuncks(path, buffer2, true);
        readFileInChuncks(path, buffer1, true);
        readFileInChuncks(path, buffer2, true);
    }

    public static void readFileInChuncks(String path, byte[] buffer, boolean report) throws IOException
    {
        long t = System.currentTimeMillis();

        InputStream is = new FileInputStream(path);
        while ((readToArray(is, buffer)) != 0) {}

        if (report)
            System.out.println((System.currentTimeMillis()-t) + " ms");
    }

    public static int readToArray(InputStream is, byte[] buffer) throws IOException
    {
        int index = 0;
        while (index != buffer.length)
        {
            int read = is.read(buffer, index, buffer.length - index);
            if (read == -1)
                break;
            index += read;
        }
        return index;
    }
}

outputs

422 ms 
717 ms 
422 ms 
718 ms

Notice this is a redefinition of an already posted question. The other was polluted with unrelated discussions. I will mark the other for deletion.

Edit: Duplicate, really? I sure could make some better code to proof my point, but this does not answer my question

Edit2: I ran the test with every buffer between 5 KB and 1000 KB on
Win7 / JRE 1.8.0_25 and the bad performance starts at precis 508 KB and all subsequent. Sorry for the bad diagram legions, x is buffer size, y is milliseconds

enter image description here

like image 551
Stig Avatar asked Feb 24 '15 19:02

Stig


People also ask

What does the read () method of FileInputStream return?

The read() method of a FileInputStream returns an int which contains the byte value of the byte read. If the read() method returns -1, there is no more data to read in the stream, and it can be closed. That is, -1 as int value, not -1 as byte value.

Why is BufferedInputStream fast?

With a BufferedInputStream , the method delegates to an overloaded read() method that reads 8192 amount of bytes and buffers them until they are needed. It still returns only the single byte (but keeps the others in reserve). This way the BufferedInputStream makes less native calls to the OS to read from the file.

What is the difference between FileReader and FileInputStream?

FileInputStream is Byte Based, it can be used to read bytes. FileReader is Character Based, it can be used to read characters. FileInputStream is used for reading binary files. FileReader is used for reading text files in platform default encoding.

What is the difference between FileInputStream and Fileoutpustream?

InputStream − This is used to read (sequential) data from a source. OutputStream − This is used to write data to a destination.


2 Answers

TL;DR The performance drop is caused by memory allocation, not by file reading issues.

A typical benchmarking problem: you benchmark one thing, but actually measure another.

First of all, when I rewrote the sample code using RandomAccessFile, FileChannel and ByteBuffer.allocateDirect, the threshold has disappeared. File reading performance became roughly the same for 128K and 1M buffer.

Unlike direct ByteBuffer I/O FileInputStream.read cannot load data directly into Java byte array. It needs to get data into some native buffer first, and then copy it to Java using JNI SetByteArrayRegion function.

So we have to look at the native implementation of FileInputStream.read. It comes down to the following piece of code in io_util.c:

    if (len == 0) {
        return 0;
    } else if (len > BUF_SIZE) {
        buf = malloc(len);
        if (buf == NULL) {
            JNU_ThrowOutOfMemoryError(env, NULL);
            return 0;
        }
    } else {
        buf = stackBuf;
    }

Here BUF_SIZE == 8192. If the buffer is larger than this reserved stack area, a temporary buffer is allocated by malloc. On Windows malloc is usually implemented via HeapAlloc WINAPI call.

Next, I measured the performance of HeapAlloc + HeapFree calls alone without file I/O. The results were interesting:

     128K:    5 μs
     256K:   10 μs
     384K:   15 μs
     512K:   20 μs
     640K:   25 μs
     768K:   29 μs
     896K:   33 μs
    1024K:  316 μs  <-- almost 10x leap
    1152K:  356 μs
    1280K:  399 μs
    1408K:  436 μs
    1536K:  474 μs
    1664K:  511 μs
    1792K:  553 μs
    1920K:  592 μs
    2048K:  628 μs

As you can see, the performance of OS memory allocation drastically changes at 1MB boundary. This can be explained by different allocation algorithms used for small chunks and for large chunks.

UPDATE

The documentation for HeapCreate confirms the idea about specific allocation strategy for blocks larger than 1MB (see dwMaximumSize description).

Also, the largest memory block that can be allocated from the heap is slightly less than 512 KB for a 32-bit process and slightly less than 1,024 KB for a 64-bit process.

...

Requests to allocate memory blocks larger than the limit for a fixed-size heap do not automatically fail; instead, the system calls the VirtualAlloc function to obtain the memory that is needed for large blocks.

like image 89
apangin Avatar answered Sep 29 '22 10:09

apangin


Optimal buffer size depands on file system block size, CPU cache size and cache latency. Most os'es use block size 4096 or 8192 so it is recommended to use buffer with this size or multiplicity of this value.

like image 35
Piotr Zych Avatar answered Sep 29 '22 10:09

Piotr Zych