
Why do I see a performance degradation in network traffic between Java 8 and Java 21?

We develop a heavy-load, network-traffic-centric application and have run these servers successfully under Java 8 for many years. Network-traffic-centric means that the server quite often has to handle up to 700 MBit/s.

Now we'd like to switch to Java 21.

I can confirm that Java 13 behaves performance-wise like Java 8, while Java 21 behaves like Java 14. So a change obviously took place between Java 13 and Java 14. I ran my tests with Azul Zulu but also tried another implementation to make sure it's not a Zulu-specific problem.

While evaluating, we saw that Java 21 performs worse than Java 8, which surprised us quite a lot.

I created a sample in which you can see the effect:

Main class

package senderreceiverbenchmark;

import java.io.*;
import java.net.*;
import java.util.concurrent.*;

public class SenderReceiverBenchmark
{
    public static void main(String[] args) throws IOException
    {
        ScheduledExecutorService executorService = Executors.newSingleThreadScheduledExecutor();
        Statistics statistics = null;
        
        switch (args.length)
        {
            case 1: //receiver mode
            {
                System.out.println( "Receiver waiting at port " + Integer.valueOf(args[0]));
        
                statistics = new Statistics("Received");
                executorService.scheduleAtFixedRate(statistics, 10, 10, TimeUnit.SECONDS);
        
                ServerSocket serverSocket = new ServerSocket(Integer.parseInt(args[0]));
                
                ExecutorService executorServiceReceiver = Executors.newCachedThreadPool();
                
                Socket socket;
                while((socket = serverSocket.accept()) != null)
                {
                    executorServiceReceiver.submit(new Receiver(socket.getInputStream(), statistics));
                }
                
                break;
            }
            case 4: //sender mode
            {
                System.out.println( "Sending to " + args[0] + ":" + Integer.valueOf(args[1]) + " with [" + Integer.valueOf(args[2]) + "] connections and framesize [" + Integer.valueOf(args[3]) + " KB]");
        
                statistics = new Statistics("Send");
                executorService.scheduleAtFixedRate(statistics, 10, 10, TimeUnit.SECONDS);
        
                ExecutorService executorServiceSender = Executors.newFixedThreadPool(Integer.parseInt(args[2]));
                long SLEEP_TIME_BETWEEN_SENDING = 50;
                for (int i = 0; i < Integer.parseInt(args[2]); i++) // create independent senders
                {
                    executorServiceSender.submit(new Sender(args[0], Integer.parseInt(args[1]), Integer.parseInt(args[3]), SLEEP_TIME_BETWEEN_SENDING, statistics));
                }
                
                break;
            }
            default:
                System.out.println( "For Receiver use: SenderReceiverBenchmark <port>" );
                System.out.println( "For Sender use: SenderReceiverBenchmark <host> <port> <NumberOfConnections> <Framesize KB>" );
                System.exit(-1);
                break;
        }
    }
}

Sender:

package senderreceiverbenchmark;

import java.io.*;
import java.net.Socket;
import java.net.SocketException;
import java.util.concurrent.Callable;

public class Sender implements Callable<Object>
{
    private final OutputStream outputStream;
    private final Statistics statistics;
    // NOTE: the buffer size is fixed; the framesizeKB constructor argument is currently not used.
    private final byte[] preallocatedRandomData = new byte[65535];
    private final long sleepTime;

    public Sender(String host, int port, int framesizeKB, long sleepTimeBetweenSend, Statistics statistics) throws SocketException, IOException
    {
        this.statistics = statistics;
        
        Socket socket = new Socket( host, port );

        outputStream = socket.getOutputStream();
        this.sleepTime = sleepTimeBetweenSend;
    }

    @Override
    public Object call() throws Exception
    {
        statistics.handledConections.addAndGet(1);
        
        while (true)
        {
            this.outputStream.write(preallocatedRandomData);
            statistics.overallData.addAndGet(preallocatedRandomData.length);
            Thread.sleep(sleepTime);
        }
    }
}

Receiver:

package senderreceiverbenchmark;

import java.io.*;
import java.util.concurrent.Callable;

public class Receiver implements Callable<Object>
{
    private final InputStream inputStream;
    private final Statistics statistics;
    private final byte[] buffer = new byte[65535];
    
    public Receiver(InputStream inputStream, Statistics statistics)
    {
        this.inputStream = inputStream;
        this.statistics = statistics;
    }
    
    @Override
    public Object call() throws Exception
    {
        statistics.handledConections.addAndGet(1);
        
        int readBytes;
        // read() returns -1 once the sender closes the connection; stop then instead of spinning.
        while ((readBytes = this.inputStream.read(buffer)) != -1)
        {
            statistics.overallData.addAndGet(readBytes);
        }
        return null;
    }
}

Some statistics:

package senderreceiverbenchmark;

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class Statistics implements Runnable
{
    public final AtomicLong overallData = new AtomicLong(0L);
    public final AtomicLong handledConections = new AtomicLong(0L);
    private final String mode;
    private long previousRun = System.currentTimeMillis();
    
    public Statistics(String tag)
    {
        this.mode = tag;
    }

    @Override
    public void run()
    {
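        long now = System.currentTimeMillis();
        // Guard against a zero divisor when less than one full second has elapsed.
        long elapsedSeconds = Math.max(1L, TimeUnit.MILLISECONDS.toSeconds(now - previousRun));
        // Read and reset in one atomic step so bytes counted in between are not lost.
        long dataSentPerSecond = overallData.getAndSet(0L) / elapsedSeconds;

        System.out.println(mode + ", Connections: " + handledConections.get() + ", Sent overall: " + dataSentPerSecond / (1024*1024) + " MB/s" );

        previousRun = now;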
    }
}

Forgive me that the sample has no (good) error handling, but it should be fine for demonstration purposes.

  1. First start the receiver:

    Benchmark.bat 4711
    
  2. Then start the sender:

    Benchmark.bat 127.0.0.1 4711 300 128
    

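This starts 300 sender threads, each of which sends a packet to the receiver every 50 ms. The last argument requests a frame size of 128 KB, but note that the Sender shown above ignores it and always writes its fixed 65535-byte buffer.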
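To get a feeling for the load this generates, here is a rough back-of-the-envelope sketch. The class and method names (NominalThroughput, bytesPerSecond) are made up for illustration, and the result is only an upper bound because it ignores the time spent inside write():

```java
// Rough estimate of the benchmark's nominal send rate (upper bound only).
final class NominalThroughput
{
    /** Bytes per second if every sender wrote once per sleep interval with zero write time. */
    static long bytesPerSecond(int connections, int bytesPerWrite, long sleepMillis)
    {
        long writesPerSecond = 1000 / sleepMillis; // a 50 ms sleep allows at most 20 writes per second
        return (long) connections * bytesPerWrite * writesPerSecond;
    }

    public static void main(String[] args)
    {
        // Buffer size that is hard-coded in the Sender class of the question.
        long fixedBuffer = bytesPerSecond(300, 65535, 50);
        // What a 128 KB frame would give if the frame size argument were honored.
        long requested = bytesPerSecond(300, 128 * 1024, 50);

        System.out.println("65535-byte writes: " + fixedBuffer / (1024 * 1024) + " MB/s");
        System.out.println("128 KB writes:     " + requested / (1024 * 1024) + " MB/s");
    }
}
```

Comparing these figures with what the Statistics class prints gives a quick sanity check on whether the senders actually keep up with their schedule.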

When you run this first with Java 8 as the runtime and then with Java 21, you will see something like this:

CPU load Java 8 vs Java 21

The first half shows the sample application running on Java 8, the second half on Java 21.
Compared to Java 8, Java 21 needs 10-15% more CPU.

Can someone explain where this comes from and what I can do about it?

Update: As some of the commenters couldn't reproduce it, I asked colleagues to run the sample to get a wider test range.

Besides my own test, 10 other people DO SEE the effect very clearly. On 2 VMs and one physical machine I can't see the effect.

However, I don't see a common denominator for why it is there or not. The CPUs are from Intel/AMD; the operating systems were Win 10, Win 11, Server 2012 and Server 2019.

Besides the Azul Zulu builds, I also tried the builds from MS and from OpenLogic, but changing the build had no effect.

Solution: The hint about JEP 353 pushed me in the right direction. I still don't understand why Java 13 behaves the same as Java 8 even though JEP 353 had already been delivered there, but the hint inspired me anyway.

I changed my sample application above.

Instead of

ExecutorService executorServiceReceiver = Executors.newCachedThreadPool();

I used

ExecutorService executorServiceReceiver = Executors.newVirtualThreadPerTaskExecutor();

I did the same for executorServiceSender.

After that, I see very clearly that Java 21 behaves better than Java 8.

Have a look at the screenshot: the black rectangle is Java 8, the red rectangle is Java 21 with platform threads, and the green rectangle is Java 21 with virtual threads.

Needless to say, the overall number of platform/OS threads used in the system is much lower.
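If the same benchmark jar should run unchanged on both Java 8 and Java 21, one possible approach is to look up the virtual-thread executor reflectively and fall back to a cached thread pool. This is only a sketch; the class name ExecutorChooser and the method newBenchmarkExecutor are hypothetical helpers, not part of the sample above:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ExecutorChooser
{
    /** Returns a virtual-thread-per-task executor where available, otherwise a cached thread pool. */
    static ExecutorService newBenchmarkExecutor()
    {
        try
        {
            // Looked up by name so that this class still compiles and loads on Java 8.
            return (ExecutorService) Executors.class
                    .getMethod("newVirtualThreadPerTaskExecutor")
                    .invoke(null);
        }
        catch (ReflectiveOperationException e)
        {
            // Method missing (older JDK) or not usable: fall back to platform threads.
            return Executors.newCachedThreadPool();
        }
    }

    public static void main(String[] args) throws Exception
    {
        ExecutorService executor = newBenchmarkExecutor();
        Callable<String> task = () -> Thread.currentThread().toString();
        Future<String> threadInfo = executor.submit(task);
        System.out.println("Task ran on: " + threadInfo.get());
        executor.shutdown();
    }
}
```

With such a helper, executorServiceReceiver and executorServiceSender could both be created the same way, so the only variable between two runs is the runtime itself.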

Thanks for all the constructive comments pushing me into the right direction.

Asked by Christoph Weser on Apr 10 '26 at 15:04


1 Answer

I ran your application several times under the IntelliJ profiler but could not reproduce a CPU performance issue consistently enough to make a solid case.

However, your example uses:

  • java.net.Socket
  • java.net.ServerSocket

The following changes could explain performance differences in specific scenarios on JDK 13 and later compared to JDK 8.

Socket and ServerSocket were reimplemented in JDK 13 under JEP 353 to prepare the ground for the virtual threads of Project Loom. If you look closely at JEP 353, you will find the following:

Aside from behavioral differences, the performance of the new implementation may differ to the old when running certain workloads. In the old implementation several threads calling the accept method on a ServerSocket will queue in the kernel. In the new implementation, one thread will block in the accept system call, the others will queue waiting to acquire a java.util.concurrent lock. Performance characteristics may differ in other scenarios too.
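The accept scenario quoted above concerns several threads calling accept on one ServerSocket. A minimal sketch of that pattern follows; the class name MultiAcceptDemo and its run helper are made up for illustration, and the code measures nothing by itself, it only sets up the situation one could profile on both runtimes. Note that the benchmark in the question accepts on a single thread, so this particular case may not be what it exercises.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class MultiAcceptDemo
{
    /** Lets several threads call accept() on one ServerSocket and returns how many connections were accepted. */
    static int run(int acceptorThreads, int clients)
    {
        AtomicInteger accepted = new AtomicInteger();
        CountDownLatch allAccepted = new CountDownLatch(clients);
        ExecutorService acceptors = Executors.newFixedThreadPool(acceptorThreads);

        // Port 0 lets the OS pick a free port on the loopback interface.
        try (ServerSocket serverSocket = new ServerSocket(0, clients, InetAddress.getLoopbackAddress()))
        {
            for (int i = 0; i < acceptorThreads; i++)
            {
                acceptors.execute(() ->
                {
                    while (true)
                    {
                        try (Socket socket = serverSocket.accept())
                        {
                            accepted.incrementAndGet();
                            allAccepted.countDown();
                        }
                        catch (IOException e)
                        {
                            return; // server socket was closed: stop this acceptor
                        }
                    }
                });
            }

            for (int i = 0; i < clients; i++)
            {
                // Connect and close immediately; the connection still has to be accepted.
                new Socket(InetAddress.getLoopbackAddress(), serverSocket.getLocalPort()).close();
            }
            allAccepted.await(10, TimeUnit.SECONDS);
        } // closing the ServerSocket here unblocks the acceptors still waiting in accept()
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        finally
        {
            acceptors.shutdown();
        }
        return accepted.get();
    }

    public static void main(String[] args)
    {
        System.out.println("Accepted " + run(4, 20) + " connections");
    }
}
```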

Answered by Panagiotis Bougioukos on Apr 13 '26 at 05:04
