I was investigating how Project Loom works and what kind of benefits it can bring to my company.
So I understand the motivation: in a standard servlet-based backend there is always a thread pool that executes the business logic, and once a thread is blocked on I/O it can't do anything but wait. Let's say I have a backend application with a single endpoint, and the business logic behind this endpoint reads some data using JDBC, which internally uses an InputStream, which in turn uses a blocking system call (read() on Linux). So if I have 200 users hitting this endpoint at the same time, I need 200 threads, each waiting for I/O.
Now let's say I switched a thread pool to use virtual threads instead. According to Ben Evans in the article Going inside Java’s Project Loom and virtual threads:
Instead, virtual threads automatically give up (or yield) their carrier thread when a blocking call (such as I/O) is made.
So as far as I understand, if the number of OS threads equals the number of CPU cores and the number of virtual threads is unbounded, all OS threads will still wait for I/O, and the executor service won't be able to assign new work to virtual threads because there are no free OS threads to run them. How is that different from regular threads? With OS threads I can at least scale to thousands of them to increase throughput. Or did I just misunderstand the use case for Loom? Thanks in advance.
I just read this on the mailing list:
Virtual threads love blocking I/O. If the thread needs to block in say a Socket read then this releases the underlying kernel thread to do other work
I am not sure I understand it. There is no way for the OS to release the thread if it makes a blocking call such as read. For that purpose the kernel offers readiness mechanisms such as epoll, which report the file descriptors that have data available so that a read on them does not block. Does the quote above imply that, under the hood, the JVM will replace a blocking read with non-blocking epoll if the thread that called it is virtual?
Your first excerpt is missing the important point:
Instead, virtual threads automatically give up (or yield) their carrier thread when a blocking call (such as I/O) is made. This is handled by the library and runtime [...]
The implication is this: if your code makes a blocking call into the library (for example NIO), the library detects that you are calling it from a virtual thread, turns the blocking call into a non-blocking call, parks the virtual thread, and continues running some other virtual thread's code.
Only if no virtual thread is ready to execute will a native thread be parked.
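To make that concrete, here is a minimal sketch, assuming JDK 21 or later. Thread.sleep stands in for the blocking I/O call, and the class and method names are made up for illustration. Each task blocks, yet the number of tasks in flight can be far larger than the number of carrier threads, because a blocked virtual thread is parked instead of holding on to its carrier.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; requires JDK 21+ for virtual threads.
class VirtualThreadBlockingDemo {

    // Starts one virtual thread per task, lets each one block for blockTime,
    // and returns how many tasks ran to completion.
    static int runBlockingTasks(int tasks, Duration blockTime) {
        AtomicInteger completed = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    try {
                        // Assumption: sleep is a stand-in for a blocking read.
                        // The virtual thread is parked; its carrier is freed.
                        Thread.sleep(blockTime);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    completed.incrementAndGet();
                });
            }
        } // close() waits until every submitted task has finished
        return completed.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = runBlockingTasks(10_000, Duration.ofMillis(200));
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " blocking tasks finished in " + elapsedMs
                + " ms with " + Runtime.getRuntime().availableProcessors() + " cores");
    }
}
```

If every blocked task pinned an OS thread, 10,000 tasks of 200 ms each on a handful of threads would take minutes. With virtual threads the total time should stay much closer to the duration of a single task, although the exact figure depends on the machine.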
Note that your code never calls a blocking syscall directly; it calls into the Java libraries (which currently execute the blocking syscall). Project Loom replaces the layers between your code and the blocking syscall and can therefore do anything it wants, as long as the result looks the same to your calling code.
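As a rough illustration of that idea (not the JDK's actual implementation), the sketch below builds a read that looks blocking to its caller out of a non-blocking channel plus a readiness wait, using only the public NIO API. The class and method names are hypothetical, and a Pipe stands in for a socket. The difference in Loom is that the waiting is done by a separate poller thread while the virtual thread is parked, whereas here the calling thread waits on the Selector itself.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: a "blocking-looking" read built from non-blocking parts.
class NonBlockingReadSketch {

    // Returns the number of bytes read, or -1 at end of stream.
    // Assumes the buffer has space remaining.
    static int readWhenReady(Pipe.SourceChannel source, ByteBuffer buffer) throws IOException {
        source.configureBlocking(false);
        try (Selector selector = Selector.open()) {
            source.register(selector, SelectionKey.OP_READ);
            while (true) {
                int n = source.read(buffer); // returns 0 at once if nothing is available
                if (n != 0) {
                    return n;
                }
                // Wait for readiness. The default Selector is typically backed by
                // epoll on Linux and kqueue on macOS.
                selector.select();
                selector.selectedKeys().clear();
            }
        }
    }

    // Writes a short, non-empty message from another thread after a delay and reads it back.
    static String roundTrip(String message) {
        try {
            Pipe pipe = Pipe.open();
            Thread writer = new Thread(() -> {
                try {
                    Thread.sleep(100); // make sure the reader has to wait first
                    pipe.sink().write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));
                } catch (IOException | InterruptedException e) {
                    throw new IllegalStateException(e);
                }
            });
            writer.start();
            ByteBuffer buffer = ByteBuffer.allocate(256);
            int n = readWhenReady(pipe.source(), buffer);
            writer.join();
            pipe.sink().close();
            pipe.source().close();
            return new String(buffer.array(), 0, n, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

The caller of readWhenReady sees the same contract as a blocking read, which is the point made above: the layer underneath is free to change how the waiting is done.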
I finally found an answer. As I said, by default the InputStream.read method makes a read() syscall, which according to the Linux man pages will block the underlying OS thread. So how is it possible that Loom won't block it? I found an article that shows the stack trace. If this block of code is executed by a virtual thread:
URLData getURL(URL url) throws IOException {
try (InputStream in = url.openStream()) {//blocking call
return new URLData(url, in.readAllBytes());
}
}
then the JVM runtime turns the call into the following stack trace:
java.base/jdk.internal.misc.VirtualThreads.park(VirtualThreads.java:60)//this line parks the virtual thread
java.base/sun.nio.ch.NioSocketImpl.park(NioSocketImpl.java:184)
java.base/sun.nio.ch.NioSocketImpl.park(NioSocketImpl.java:212)
java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:356)//instead of a blocking read(), the runtime goes through the read in the Java NIO socket implementation
java.base/java.io.InputStream.readAllBytes(InputStream.java:346)
How does the JVM know when to unpark the virtual thread? Here is the stack trace of the poller thread that waits for the socket to become readable while readAllBytes is parked:
"Read-Poller" #16
java.base@17-internal/sun.nio.ch.KQueue.poll(Native Method)
java.base@17-internal/sun.nio.ch.KQueuePoller.poll(KQueuePoller.java:65)
java.base@17-internal/sun.nio.ch.Poller.poll(Poller.java:195)
The author of the article uses macOS, which uses kqueue as its I/O event notification mechanism. If I ran it on Linux, I would see epoll instead.
So basically Loom doesn't introduce anything new at this level: under the hood it's plain epoll with callbacks, which could also be implemented with a framework such as Vert.x, which uses Netty under the hood. In Loom, however, the callback logic is encapsulated in the JVM runtime, which I found counterintuitive: when I call InputStream.read() I expect a corresponding read() syscall, but the JVM replaces it with non-blocking syscalls.
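The difference in programming style can be sketched as follows, assuming JDK 21 or later. This is not Vert.x or Netty code: CompletableFuture stands in for a callback-based API, fetch is a hypothetical slow call that uses Thread.sleep in place of real I/O, and all names are made up for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical comparison of callback style and direct style on a virtual thread.
class CallbackVersusVirtualThread {

    // Assumption: stands in for a slow I/O call such as a JDBC query.
    static String fetch(String key) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "value-of-" + key;
    }

    // Callback style: the follow-up step is registered as a continuation.
    static String callbackStyle(String key) {
        return CompletableFuture.supplyAsync(() -> fetch(key))
                .thenApply(String::toUpperCase) // runs once fetch has completed
                .join();
    }

    // Direct style: plain sequential code; the runtime parks the virtual
    // thread while fetch is waiting and resumes it afterwards.
    static String virtualThreadStyle(String key) {
        AtomicReference<String> result = new AtomicReference<>();
        Thread thread = Thread.startVirtualThread(() -> result.set(fetch(key).toUpperCase()));
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result.get();
    }

    public static void main(String[] args) {
        System.out.println(callbackStyle("user"));
        System.out.println(virtualThreadStyle("user"));
    }
}
```

Both methods compute the same result. What changes is where the continuation lives: in explicit callbacks in the first case, and in the parked virtual thread's own stack in the second.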