
java.nio transferTo seems to be impossibly fast?

Tags: java, nio

Can someone please explain how the transferTo method can copy a file at seemingly 1000+ MB/sec? I ran some tests using a 372 MB binary file. The first copy is slow, but if I change the output name and run it again, an additional file appears in the output directory in as little as 180 ms, which works out to over 2000 MB/sec. What's going on here? I'm running Windows 7.

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.nio.channels.FileChannel;

private static void doCopyNIO(String inFile, String outFile) {
    final long chunk = 1024L * 1024 * 10;   // request at most 10 MB per transferTo call

    // try-with-resources closes the channels and streams in reverse order of declaration
    try (FileInputStream  fis = new FileInputStream(inFile);
         FileOutputStream fos = new FileOutputStream(outFile);
         FileChannel      cis = fis.getChannel();
         FileChannel      cos = fos.getChannel()) {
        long len = cis.size();
        long pos = 0;
        while (pos < len) {
            // transferTo may move fewer bytes than requested, so advance by its return value
            pos += cis.transferTo(pos, chunk, cos);
        }
        // no flush() call: FileOutputStream is unbuffered, so flush() does nothing
    } catch (Exception e) {
        e.printStackTrace();
    }
}
haventchecked asked Jan 12 '23 11:01


2 Answers

The key there is "first time". After that first copy, your OS has cached the entire file in RAM (372 MB isn't much these days), so the only overhead is the time required to flip the zero-copy buffers through the memory-mapped space. If you flush the cache (I don't know how to do that on Windows; if the file's on an external drive, you could unplug and replug it), you'll see the speed settle down to disk read rates, and if you force the OS to flush the writes, your program will block for 10 seconds or so if you have a hard disk.
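To see the effect of forcing the writes, here is a minimal sketch that times the same transferTo copy with and without a call to FileChannel.force(true), which asks the OS to push the written data to the storage device before returning. The class name ForcedCopyTimer, the method timedCopy, and the command-line arguments are made up for illustration; the timings will depend entirely on your machine and on whether the source file is already cached.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ForcedCopyTimer {

    /** Copies src to dst with transferTo and returns the elapsed time in milliseconds. */
    static long timedCopy(Path src, Path dst, boolean forceToDisk) throws IOException {
        long start = System.nanoTime();
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long pos = 0;
            while (pos < size) {
                long n = in.transferTo(pos, size - pos, out);
                if (n <= 0) {
                    break;  // nothing more could be transferred (e.g. source shrank)
                }
                pos += n;
            }
            if (forceToDisk) {
                // Blocks until the OS reports the data (and metadata) as written
                // to the device, so the write cache no longer hides the disk cost.
                out.force(true);
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.out.println("usage: ForcedCopyTimer <source> <cachedCopy> <forcedCopy>");
            return;
        }
        Path src = Paths.get(args[0]);
        System.out.println("without force: " + timedCopy(src, Paths.get(args[1]), false) + " ms");
        System.out.println("with force:    " + timedCopy(src, Paths.get(args[2]), true) + " ms");
    }
}
```

If the explanation above is right, the run without force should return almost immediately once the source is cached, while the forced run should take noticeably longer on a hard disk. Note that this only removes the write cache from the measurement; the read side is still served from RAM unless the cache has been flushed.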

chrylis -cautiouslyoptimistic- answered Jan 24 '23 05:01


I'm guessing that after the file has been read once, the OS caches it in order to speed up subsequent reads. In addition, an NTFS feature called Single Instance Storage might also play a part, as described in Wikipedia:

When there are several directories that have different but similar files, some of these files may have identical content. Single instance storage allows identical files to be merged to one file and create references to that merged file.

https://en.wikipedia.org/wiki/NTFS#Single_Instance_Storage_.28SIS.29

I'm not really sure if this is what you're seeing, but it's the only thing I can think of that makes sense.

bjelleklang answered Jan 24 '23 04:01