
Why use Java's AsynchronousFileChannel?

Tags: java, nio

I can understand why network apps would use multiplexing (to not create too many threads), and why programs would use async calls for pipelining (more efficient). But I don't understand the efficiency purpose of AsynchronousFileChannel.

Any ideas?

Joey Bell asked May 04 '10 05:05


4 Answers

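It's a channel that you can use to read and write files asynchronously, i.e. the I/O operations are carried out in the background (on a separate thread, or by the operating system's own asynchronous I/O facility where one is available), so that the thread you're calling it from can do other things while the I/O operations are happening.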

For example: one of the read() overloads of the class returns a Future<Integer>. So, what you can do is call read() with a ByteBuffer and a file position, and it will return immediately with a Future object. In the background, the data is read from the file into the buffer. Your own thread can continue doing things, and when it needs the data, it calls get() on the Future object. get() returns the number of bytes read (if the background read hasn't completed yet, it makes your thread block until it has); after that, the data is available in the buffer. The advantage of this is that your thread doesn't have to wait for the whole length of the read operation; it can do other things until it really needs the data.
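Here is a minimal sketch of that Future-based pattern. The class name FutureReadSketch and the method readStart are made up for illustration; only AsynchronousFileChannel.open(), read(ByteBuffer, long) and Future.get() come from the JDK. It assumes a small text file, so a single read is treated as enough.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical helper class, not part of any library.
final class FutureReadSketch {

    // Reads up to maxBytes from the start of the file and decodes them as UTF-8.
    static String readStart(Path file, int maxBytes)
            throws IOException, InterruptedException, ExecutionException {
        try (AsynchronousFileChannel channel =
                 AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(maxBytes);

            // Returns immediately; the read proceeds in the background.
            Future<Integer> pending = channel.read(buffer, 0);

            // ... the calling thread is free to do unrelated work here ...

            // Blocks only if the read has not finished yet.
            int bytesRead = pending.get();   // -1 signals end of file
            if (bytesRead <= 0) {
                return "";
            }
            buffer.flip();                   // switch the buffer from filling to draining
            return StandardCharsets.UTF_8.decode(buffer).toString();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("async-read", ".txt");
        try {
            Files.write(file, "hello async".getBytes(StandardCharsets.UTF_8));
            System.out.println(readStart(file, 64));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

For larger files you would loop, advancing the position by the number of bytes each read reports.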

See the documentation.

Note that AsynchronousFileChannel was introduced in Java SE 7, which had not been released yet when this answer was written.

Jesper answered Nov 13 '22 11:11


I've just come across another, somewhat unexpected reason for using AsynchronousFileChannel. When performing random record-oriented writes across large files (exceeding physical memory, so caching can't cover everything) on NTFS, I find that AsynchronousFileChannel completes over twice as many operations, in single-threaded mode, as a normal FileChannel.

My best guess is that because the asynchronous io boils down to overlapped IO in Windows 7, the NTFS file system driver is able to update its own internal structures faster when it doesn't have to create a sync point after every call.

I also micro-benchmarked RandomAccessFile to see how it would perform: the results are very close to FileChannel, and still about half the performance of AsynchronousFileChannel.

Not sure what happens with multi-threaded writes. This is on Java 7, on an SSD (the SSD is an order of magnitude faster than a magnetic disk, and another order of magnitude faster on smaller files that fit in memory).
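For anyone who wants to try this access pattern, here is a rough sketch of random record-oriented writes issued from a single thread without waiting on each one individually. It is not the benchmark described above: the class name, the 128-byte record size, the fixed random seed and the "window" of in-flight writes are all assumptions made for illustration, and it does no timing.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical sketch class, not part of any library.
final class RandomRecordWriteSketch {
    static final int RECORD_SIZE = 128;   // assumed record size in bytes

    // Writes `count` records at random record-aligned offsets among `slots`
    // positions, keeping at most `window` writes in flight at a time.
    // Returns the total number of bytes the channel reported as written.
    static long writeRandomRecords(Path file, int slots, int count, int window)
            throws IOException, InterruptedException, ExecutionException {
        Random random = new Random(42);   // fixed seed so runs are repeatable
        long total = 0;
        try (AsynchronousFileChannel channel = AsynchronousFileChannel.open(
                file, StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
            List<Future<Integer>> inFlight = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                // A fresh buffer per write: a buffer must not be touched
                // while an asynchronous operation on it is still pending.
                ByteBuffer record = ByteBuffer.allocate(RECORD_SIZE);
                record.putInt(0, i);      // absolute put; position stays at 0
                long position = (long) random.nextInt(slots) * RECORD_SIZE;

                // Returns immediately; no sync point after each call.
                inFlight.add(channel.write(record, position));

                if (inFlight.size() == window) {
                    total += drain(inFlight);
                }
            }
            total += drain(inFlight);
        }
        return total;
    }

    // Waits for every outstanding write and sums the bytes written.
    private static long drain(List<Future<Integer>> inFlight)
            throws InterruptedException, ExecutionException {
        long bytes = 0;
        for (Future<Integer> write : inFlight) {
            bytes += write.get();
        }
        inFlight.clear();
        return bytes;
    }
}
```

Wrapping a call such as writeRandomRecords(path, 1_000_000, 100_000, 32) in System.nanoTime() measurements, next to an equivalent FileChannel loop, would be one way to compare the two on your own hardware.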

Will be interesting to see if the same ratios hold on Linux.

Ross Judson answered Nov 13 '22 12:11


The main reason I can think of to use asynchronous IO is to better utilize the processor. Imagine you have some application which does some sort of processing on a file. And also let's assume you can process the data contained in the file in chunks. If you don't make use of asynchronous IO then your application will probably behave something like this:

  1. Read a block of data. There is no processor utilization at this point, as you're blocked waiting for the data to be read.
  2. Process the data you just read. At this point your application consumes CPU cycles as it processes the data.
  3. If there is more data to read, go to step 1.

The processor utilization will go up, then drop to zero, then go up, then drop to zero, and so on. Ideally you never want to be idle if you want your application to be efficient and process the data as fast as possible. A better approach would be:

  1. Issue async read
  2. When read completes issue next async read and then process data

The first step is the bootstrapping. You have no data yet so you have to issue a read. From then on, when you get notified a read has completed you issue another async read and then process the data. The benefit here is that by the time you finish processing the chunk of data the next read has probably finished, so you always have data available to process and thus you're more efficiently using the processor. If your processing finishes before the read has finished you might need to issue multiple asynchronous reads so that you have more data to process.
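A minimal sketch of that loop, using two buffers so one can be filled while the other is processed. The class and method names are made up, and summing the bytes simply stands in for whatever "processing" means in your application. It uses the Future-returning read(); the CompletionHandler overload would be the notification-style alternative.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical sketch class, not part of any library.
final class PipelinedReadSketch {

    // Sums every byte of the file (as an unsigned value) while overlapping
    // the next read with the processing of the current chunk.
    static long checksum(Path file, int chunkSize)
            throws IOException, InterruptedException, ExecutionException {
        try (AsynchronousFileChannel channel =
                 AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer current = ByteBuffer.allocate(chunkSize);
            ByteBuffer next = ByteBuffer.allocate(chunkSize);
            long position = 0;
            long sum = 0;

            // Step 1 (bootstrap): no data yet, so issue the first read.
            Future<Integer> pending = channel.read(current, position);

            while (true) {
                int bytesRead = pending.get();   // wait for the outstanding read
                if (bytesRead < 0) {
                    break;                       // end of file
                }
                position += bytesRead;

                // Step 2a: issue the next read before touching the data.
                next.clear();
                pending = channel.read(next, position);

                // Step 2b: process the completed chunk while that read runs.
                current.flip();
                while (current.hasRemaining()) {
                    sum += current.get() & 0xFF;
                }

                // Swap roles: the buffer being filled becomes the current one.
                ByteBuffer swap = current;
                current = next;
                next = swap;
            }
            return sum;
        }
    }
}
```

If processing is consistently faster than the reads, the same idea extends to more than two buffers with several reads outstanding.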

Nick

nickdu answered Nov 13 '22 12:11


Here's something no one has mentioned:

A plain FileChannel implements InterruptibleChannel. As a result it, as well as anything that uses it such as the OutputStream returned by Files.newOutputStream(), has the unfortunate[1][2] behaviour that performing any blocking operation on it (e.g. read() or write()) in a thread whose interrupt status is set will close the channel itself and throw java.nio.channels.ClosedByInterruptException.

If this is a problem, using AsynchronousFileChannel instead is a possible alternative.
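A small sketch that shows the difference, as far as I understand the behaviour. The class and method names are made up. Both channels are opened first, then the calling thread's interrupt status is set before each channel is used. Note that Future.get() is itself interruptible, so the sketch clears the flag before waiting on the result.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical sketch class, not part of any library.
final class InterruptSketch {

    // Returns { fileChannelStillOpen, asyncChannelStillOpen } after each
    // channel has been read from a thread whose interrupt status is set.
    static boolean[] channelsOpenAfterInterrupt(Path file)
            throws IOException, InterruptedException, ExecutionException {
        try (FileChannel plain = FileChannel.open(file, StandardOpenOption.READ);
             AsynchronousFileChannel async =
                 AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {

            boolean plainOpen;
            Future<Integer> pending;

            Thread.currentThread().interrupt();   // simulate a pending interrupt
            try {
                try {
                    plain.read(ByteBuffer.allocate(16));
                } catch (ClosedByInterruptException e) {
                    // The channel was closed as a side effect of the interrupt.
                }
                plainOpen = plain.isOpen();

                // AsynchronousFileChannel is not an InterruptibleChannel, so
                // starting a read here is not expected to close it.
                pending = async.read(ByteBuffer.allocate(16), 0);
            } finally {
                Thread.interrupted();             // clear the flag again
            }

            pending.get();                        // safe now that the flag is cleared
            return new boolean[] { plainOpen, async.isOpen() };
        }
    }
}
```

In real code you would normally restore or act on the interrupt rather than just clearing it; it is cleared here only to keep the demonstration short.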

  • [1]http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6608965
  • [2]https://bugs.openjdk.java.net/browse/JDK-4469683

antak answered Nov 13 '22 11:11