NIO provides async I/O, meaning the calling thread is not blocked on I/O operations. However, I am still confused about how this works internally. According to this answer, there is just a thread pool to which synchronous I/O is submitted.
Does the JVM have a thread pool where the synchronous I/O is actually performed? There is native AIO support on Linux; does Java use it internally? How does AIO work at the OS level: does it have a thread pool of its own, or is there some magic that means threads are not needed at all?
In general, the question is: does async NIO give us the ability to get rid of threads, or is it just a wrapper around synchronous I/O that lets us use a fixed number of threads to perform I/O?
The kernel itself (be it Windows, Linux, or something more exotic) is responsible for doing non-blocking I/O, and the Java classes in the nio packages (such as Channel and Selector) are merely fairly low-level translations of that API.
The low-level stuff requires that you create threads in order to do it right. The basic NIO support in java.* lets you call a method that blocks until at least one thing you're interested in happens on any of a batch of non-blocking channels. You could, for example, have 1000 open channels representing network sockets, all registered as 'I am interested if network packets arrive on any of these 1000 open sockets', and then call a method that says: 'Please sleep until something interesting happens'. If you set up your app to call this method, handle all the interesting things, and go back to calling this method, you've written a rather inefficient application: CPUs tend to have far more than one core, and all but one are asleep doing absolutely nothing. The proper model is to have multiple threads (more or less one per core) all running the same 'wake me up with a list of interesting things' loop. You can't get rid of threads unless you intentionally write badly performing code.
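To make the 'sleep until something interesting happens' loop concrete, here is a minimal sketch using java.nio.channels.Selector. The names SelectorEchoLoop, open and pollOnce are made up for illustration, and echoing bytes back is just a stand-in for real request handling. This is the single-threaded variant; it assumes small messages and does not deal with partial writes.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class SelectorEchoLoop {
    // Hypothetical helper: opens a non-blocking server socket on the loopback
    // interface (port 0 = any free port) and registers interest in new connections.
    static ServerSocketChannel open(Selector selector, int port) throws IOException {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        return server;
    }

    // One pass of the loop: sleep until something is ready (a timeout of 0 means
    // wait indefinitely), then handle every ready channel. Returns the number of
    // events handled.
    static int pollOnce(Selector selector, long timeoutMillis) throws IOException {
        selector.select(timeoutMillis);
        int handled = 0;
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove(); // the selector never clears this set for us
            if (!key.isValid()) {
                continue;
            }
            if (key.isAcceptable()) {
                SocketChannel client = ((ServerSocketChannel) key.channel()).accept();
                if (client != null) {
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                }
            } else if (key.isReadable()) {
                SocketChannel client = (SocketChannel) key.channel();
                ByteBuffer buf = ByteBuffer.allocate(1024);
                if (client.read(buf) < 0) {
                    key.cancel(); // peer closed the connection
                    client.close();
                } else {
                    buf.flip();
                    client.write(buf); // simplification: ignores partial writes
                }
            }
            handled++;
        }
        return handled;
    }

    public static void main(String[] args) throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = open(selector, 0)) {
            System.out.println("Listening on " + server.getLocalAddress());
            while (true) {
                pollOnce(selector, 0);
            }
        }
    }
}
```

Note that nothing in pollOnce blocks except the select call itself, which is the whole point of the model.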
So, let's hypothetically say that you've set it up right: You have an 8-core CPU and you have 8 threads running the 'wait-for-interesting-stuff, handle-sockets-with-active-data' loop.
Imagine a part of your handle-sockets code blocks. That is, it does something which will cause the CPU to go check for other jobs to do, because it has to wait, say, for network, or disk, or some such. Let's say because you've put some database queries in there and you did not realize that DB queries use (possibly local, but still) networking and hit the disk. That'd be really bad: You have CPU resources aplenty to deal with those 1000 incoming requests, but your entire set of 8 threads are all waiting for the DB to do stuff, and whilst the CPU could be analysing packets and responses, it's got nothing left to do whatsoever and throttles down waiting for the ages it takes for the DB to fetch a record from disk.
Bad. So, do NOT call blocking code. Unfortunately, there are tons of methods in Java (in the core libraries as well as in third-party libraries) which block. They tend not to be documented as such. There's no real solution to this.
Some libraries do offer solutions, but if they do, it has to be in 'callback' form. Take the DB query example: you'd have to take that network socket and tell it that you're, at least for now, no longer interested in incoming data (you're already waiting for the DB to respond, so there's no point in trying to process more incoming data for this socket). Instead, you want to register the DB connection itself as 'I am interested if this DB query has a response ready' (the NIO API doesn't support this itself; you'd have to build some sort of framework). Java as a language does not lend itself to writing code this way: you end up in 'callback hell', which is how JavaScript works. There are solutions to callback hell, but it remains complicated, and Java basically does not support them (for example, 'yield' is a thing that can help, and Java does not support the yield concept).
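As a rough illustration of what 'callback form' looks like in Java, here is a sketch using CompletableFuture. Everything about the database is an assumption: findUserName and countOrders are hypothetical, and a small thread pool merely simulates a driver that reports back later. A genuinely non-blocking driver would be wired into the selector loop instead.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CallbackStyle {
    // Stand-in for an asynchronous DB driver (an assumption for this sketch).
    // Daemon threads so the pool never keeps the JVM alive.
    static final ExecutorService FAKE_DB = Executors.newFixedThreadPool(2, r -> {
        Thread t = new Thread(r, "fake-db");
        t.setDaemon(true);
        return t;
    });

    // Hypothetical query: resolves a user id to a name "later".
    static CompletableFuture<String> findUserName(int userId) {
        return CompletableFuture.supplyAsync(() -> "user-" + userId, FAKE_DB);
    }

    // Hypothetical second query that depends on the result of the first one.
    static CompletableFuture<Integer> countOrders(String userName) {
        return CompletableFuture.supplyAsync(() -> userName.length(), FAKE_DB);
    }

    // The handler never waits: it only describes what should happen once each
    // answer arrives. Every dependent step adds another level of nesting.
    static CompletableFuture<String> handleRequest(int userId) {
        return findUserName(userId).thenCompose(name ->
                countOrders(name).thenApply(count ->
                        name + " has " + count + " orders"));
    }

    public static void main(String[] args) {
        // join() blocks, which is fine in a demo main but not inside an event loop.
        System.out.println(handleRequest(42).join()); // user-42 has 7 orders
    }
}
```

Two dependent queries already need two nested lambdas; error handling, timeouts and branching make this grow quickly, which is the 'callback hell' referred to above.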
Finally, there is performance: WHY do you want to get rid of threads?
Threads incur 2 major penalties:
The context switch. When the CPU has to jump to another thread (because the thread it was on needs to wait for disk or network data and thus has nothing to do right now), it needs to jump to another code location and figure out which memory tables to load into cache to run it.
The stack. Like just about every programming model, there is a bit of memory called 'the stack' which contains local variables and the location of the method that called you (and the method that called it, all the way down to your main method / Thread run method). If you get a stacktrace, you're looking at the effect of it. In Java, every thread gets one stack, and by default all stacks are the same size. You can configure that size with the -Xss JVM argument; the default is typically 1MB on 64-bit JVMs. Meaning, if you want 4000 threads simultaneously, that's roughly 4GB set aside for stacks unless you lower -Xss (and then you need more memory for heap and such on top of this).
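The stack arithmetic can be checked with a few lines of Java. This is only a sketch: StackSizeDemo and its methods are made-up names, the 1MB figure is the one used above, and the stack size passed to the Thread constructor is a hint that the JVM is allowed to round or ignore.

```java
public class StackSizeDemo {
    // Total stack space for a given number of threads at a given stack size.
    static long stackBudgetBytes(int threads, long stackBytesPerThread) {
        return threads * stackBytesPerThread;
    }

    // Per-thread alternative to the global -Xss setting: the four-argument
    // Thread constructor takes a requested stack size in bytes.
    static Thread smallStackThread(Runnable task, long stackBytes) {
        return new Thread(null, task, "small-stack", stackBytes);
    }

    public static void main(String[] args) throws InterruptedException {
        long oneMiB = 1024L * 1024L;
        long total = stackBudgetBytes(4000, oneMiB);
        System.out.println(total / oneMiB + " MiB of stack for 4000 threads"); // 4000 MiB

        Thread t = smallStackThread(() -> System.out.println("running"), 256 * 1024);
        t.start();
        t.join();
    }
}
```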
But, non-blocking isn't that much of a solution to either of those problems:
When moving to another handler because you've run out of data to process, you... also context switch. It's not a thread switch, but you still need to hop to a completely different page of memory, and on modern architecture, accessing a part of memory that isn't in the caches takes a long time. You're just trading in 'thread context switch' for 'memory page cache context switch', and you've gained nothing.
Let's say you're some sort of chat app, and you've received from one of the connected clients a message to send. You now need to query the DB to see if this user has the rights to post this message to the chat channel it intends to send it to, and also to see if there are any other follow-mode devices you need to update. Because that is a blocking operation you want to hop to another job whilst you wait. But you need to remember this state someplace: The sending user, the message, the results of your DB queries. In the threaded model, this data is automatically and implicitly taken care of for you: It's in that stack space. If you go full NIO, you need to manage this yourself, for example with ByteBuffers.
Yes, when you manually control the ByteBuffers, you can make them precisely as large as they need to be, and generally that'll be far smaller than 1MB, so you can handle more simultaneous connections this way. Or you just toss a 64GB stick of RAM in your server.
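Here is a minimal sketch of what 'managing the state yourself' can look like. ConnectionState is a hypothetical class, and the newline-terminated message format is an assumption made purely for the example. In a selector-based server, an object like this would typically be attached to the connection's SelectionKey with attach() and fetched again with attachment() each time the socket becomes readable.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ConnectionState {
    final String user;                // who this connection belongs to
    private final ByteBuffer pending; // bytes received so far that do not yet form a full message

    ConnectionState(String user, int capacity) {
        this.user = user;
        // Sized to fit the protocol, typically far smaller than a thread stack.
        this.pending = ByteBuffer.allocate(capacity);
    }

    // Called whenever bytes arrive. Returns every complete newline-terminated
    // message and keeps the unfinished tail for the next call.
    List<String> onBytes(byte[] chunk) {
        pending.put(chunk); // throws BufferOverflowException if the chunk does not fit
        pending.flip();     // switch from writing to reading
        List<String> messages = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < pending.limit(); i++) {
            if (pending.get(i) == '\n') {
                byte[] line = new byte[i - start];
                pending.position(start);
                pending.get(line);
                messages.add(new String(line, StandardCharsets.UTF_8));
                start = i + 1; // skip the newline itself
            }
        }
        pending.position(start);
        pending.compact(); // move the leftover bytes to the front, back to writing
        return messages;
    }

    public static void main(String[] args) {
        ConnectionState state = new ConnectionState("alice", 64);
        System.out.println(state.onBytes("hel".getBytes(StandardCharsets.UTF_8)));     // []
        System.out.println(state.onBytes("lo\nwor".getBytes(StandardCharsets.UTF_8))); // [hello]
    }
}
```

In the threaded model, the half-read message would simply be a local variable sitting on the stack while the thread waits; here it has to be stored explicitly between calls.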
The pragmatic upshot, then, is this:
NIO code is extremely difficult to write. Use abstractions like Grizzly or Netty, because it's rocket science.
It's rarely faster.
You can have more simultaneous things going on, if the amount of data that needs to be tracked per connection/file/job/etc is low.
It's a bit like using assembler instead of C, or managing memory by hand instead of letting Java's garbage collector do it for you, because you can technically squeeze more performance out of it. But there is a reason most people do not program in assembler, even though it is theoretically faster. There is a reason the vast majority of web apps are written in Java, or Python, or Node.js, or something else high level, and not in an unmanaged language like C(++) or assembler.