
Why is writing data to disk as fast as keeping it in-memory?

Tags: java, io

I have the following 10000000x2 matrix:

0        0
1        1
2        2
..       ..
9999999  9999999

Now I want to save this matrix to an int[][] array:

import com.google.common.base.Stopwatch;

static void memory(int size) throws Exception {
    System.out.println("Memory");

    Stopwatch s = Stopwatch.createStarted();

    int[][] l = new int[size][2];
    for (int i = 0; i < size; i++) {
        l[i][0] = i;
        l[i][1] = i;
    }

    System.out.println("Keeping " + size + " rows in-memory: " + s.stop());
}

public static void main(String[] args) throws Exception {
    int size = 10000000;
    memory(size);
    memory(size);
    memory(size);
    memory(size);
    memory(size);
}

The output:

Keeping 10000000 rows in-memory: 2,945 s
Keeping 10000000 rows in-memory: 408,1 ms
Keeping 10000000 rows in-memory: 761,5 ms
Keeping 10000000 rows in-memory: 543,7 ms
Keeping 10000000 rows in-memory: 408,2 ms

Now I want to save this matrix to disk:

import com.google.common.base.Stopwatch;
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;

static void file(int size, int fileIndex) throws Exception {
    Stopwatch s = Stopwatch.createStarted();

    FileOutputStream outputStream = new FileOutputStream("D:\\file" + fileIndex);
    BufferedOutputStream buf = new BufferedOutputStream(outputStream);
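    for (int i = 0; i < size; i++) {
        // bytes(int) is a helper not shown in the question; presumably it
        // converts the int to a byte[].
        buf.write(bytes(i));
        buf.write(bytes(i));
    }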

    buf.close();
    outputStream.close();

    System.out.println("Writing " + size + " rows: " + s.stop());
}

public static void main(String[] args) throws Exception {
    int size = 10000000;
    file(size, 1);
    file(size, 2);
    file(size, 3);
    file(size, 4);
    file(size, 5);
}

The output:

Writing 10000000 rows: 715,8 ms
Writing 10000000 rows: 636,6 ms
Writing 10000000 rows: 614,6 ms
Writing 10000000 rows: 598,0 ms
Writing 10000000 rows: 611,9 ms

Shouldn't saving to memory be much faster?

asked Jul 31 '14 06:07 by ZhekaKozlov


2 Answers

As said in the comments, you're not measuring anything useful. The JVM buffers the writes in its own memory (here, in the BufferedOutputStream) and then flushes them to the operating system, which caches them in its memory before finally writing them to disk at some point.
So you're only measuring the time it takes to hand the data to those in-memory buffers, which is all you can measure this way. The actual write to disk happens later.
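
To see the difference between handing data to the caches and actually getting it onto the device, one option is to ask the operating system to flush the file before stopping the clock. The following is a minimal sketch, not the asker's code: the class name `SyncedWriteTiming`, the helper names, and the temp-file location are made up for illustration, and it uses `FileChannel.force(true)` from `java.nio` instead of `BufferedOutputStream`.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SyncedWriteTiming {

    /**
     * Writes `rows` pairs of ints to `target` and returns two elapsed times in
     * nanoseconds: [0] after all data was handed to the OS, [1] after force().
     */
    static long[] timeWrite(Path target, int rows) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(8192); // user-space buffer
        long start = System.nanoTime();
        try (FileChannel channel = FileChannel.open(target,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            for (int i = 0; i < rows; i++) {
                if (buffer.remaining() < 8) {
                    drain(channel, buffer);
                }
                buffer.putInt(i).putInt(i); // one row: two 4-byte ints
            }
            drain(channel, buffer);
            long handedToOs = System.nanoTime() - start;

            // Ask the OS to push its cached data (and metadata) to the device.
            channel.force(true);
            long forced = System.nanoTime() - start;
            return new long[] {handedToOs, forced};
        }
    }

    /** Writes the buffer's current contents to the channel and resets it. */
    private static void drain(FileChannel channel, ByteBuffer buffer) throws IOException {
        buffer.flip();
        while (buffer.hasRemaining()) {
            channel.write(buffer);
        }
        buffer.clear();
    }

    public static void main(String[] args) throws IOException {
        int rows = args.length > 0 ? Integer.parseInt(args[0]) : 10_000_000;
        Path target = Files.createTempFile("rows", ".bin");
        try {
            long[] t = timeWrite(target, rows);
            System.out.printf("Handed to OS: %.1f ms%n", t[0] / 1e6);
            System.out.printf("After force(): %.1f ms%n", t[1] / 1e6);
        } finally {
            Files.deleteIfExists(target);
        }
    }
}
```

How large the gap between the two numbers is depends on the OS, the file system, and the drive. Even after `force(true)` the drive's own write cache may still be holding the data, so treat the second figure as a closer approximation rather than an exact disk time.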

Anyway, you shouldn't bother with such micro optimisations.

answered Sep 21 '22 19:09 by jwenting


Your hard drive and operating system employ write buffering so that your system can continue operating in the face of multiple concurrent tasks (for example, programs reading from and writing to the disk). This can (and sometimes does) lead to data loss in the event of a power failure on desktop-class machines. Servers and laptops can also experience the issue (but usually employ sophisticated technology called a battery to mitigate the chances). Anyway, on Linux you might have to run fsck, and on Windows chkdsk, when it happens.
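
If a particular file must not sit in the operating system's write buffer, a program can opt out of write-back caching for that file. Here is a small sketch of that, assuming a hypothetical `DurableAppend` class and file name. It opens the file with `StandardOpenOption.DSYNC`, which asks that every update to the file's content be written synchronously to the underlying storage device. The drive's own cache may still apply, so this narrows the data-loss window rather than guaranteeing anything.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DurableAppend {

    /** Appends one line to `log`, requesting synchronous content writes. */
    static void appendDurably(Path log, String line) throws IOException {
        try (OutputStream out = Files.newOutputStream(log,
                StandardOpenOption.CREATE,
                StandardOpenOption.APPEND,
                StandardOpenOption.DSYNC)) {
            out.write((line + System.lineSeparator()).getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("durable", ".log"); // hypothetical location
        try {
            appendDurably(log, "first record");
            appendDurably(log, "second record");
            System.out.println(Files.readAllLines(log, StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(log);
        }
    }
}
```

Synchronous writes are typically much slower than buffered ones, which is why buffering is the default.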

answered Sep 18 '22 19:09 by Elliott Frisch