
Spring boot cold start

I have a spring boot application which I'm running inside docker containers in an openshift cluster. In steady state, there are N instances of the application (say N=5) and requests are load balanced to these N instances. Everything runs fine and response time is low (~5ms with total throughput of ~60k).

Whenever I add a new instance, response time goes up briefly (up to ~70ms) and then comes back to normal.

Is there anything I can do to avoid this type of cold start? I tried pre-warming the app by making ~100 curl calls sequentially before sending traffic, but that did not help.

Do I need a better warm-up script with higher concurrency? Is there a better way to handle this?

Thanks

Vikk asked Feb 19 '19


People also ask

What is JVM cold start?

For Java managed runtimes, a new JVM is started and your application code is loaded. This is called a cold start. Subsequent requests then reuse this execution environment. This means that the Init phase does not need to run again. The JVM will already be started.

Why SpringBootRequestHandler is deprecated?

Please note that the SpringBootRequestHandler handler is deprecated since Spring Cloud Function 3.1.0 in favor of the FunctionInvoker handler. The input type is Void in this case, as we don't care about any incoming value and just produce a random String.

How long is lambda cold start?

The duration of a cold start varies from under 100 ms to over 1 second. Since the Lambda service reuses warmed environments for subsequent invocations, cold starts are typically more common in development and test functions than production workloads.


1 Answer

This problem can be approached from two directions. The first is to warm the application up yourself before it starts serving. The second is to send it less traffic from the outside at the beginning, so that computing resources are left free for the JVM to complete its initialization (such as class loading). Either way, the root cause is that the JVM needs to warm up after startup; this follows from how the JVM works. In the HotSpot VM in particular, the execution engine has two parts: the interpreter and the just-in-time (JIT) compiler. The JIT compiler needs CPU resources to compile bytecode at runtime, and lazy class loading also adds time to the first run of each code path.

  1. JVM warm-up

JVM warm-up mainly addresses two problems: class loading and just-in-time compilation.

  • For class loading, simply exercise the relevant code paths ahead of time.
  • For JIT, tiered compilation (C1/C2) is generally enabled on the server side (JVM server mode). Tiered compilation is enabled by default since JDK 8 (older versions require the JVM flag -XX:+TieredCompilation). C1 and C2 need different amounts of compilation resources, with C2 needing more. The purpose of warming up is to trigger C1/C2 compilation, so that by the time real requests come in, the hot code paths have already been compiled.

Of these two, class loading itself usually consumes the most time, so warming up that part gives the greater return on effort.
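As a minimal sketch of the class-loading side, classes known to sit on the hot request path can be forced to load before traffic arrives (the class names listed here are placeholders; in practice you would list your own application classes):

```java
public class ClassPreloader {
    // Hypothetical list of classes that sit on the hot request path.
    static final String[] HOT_CLASSES = {
        "java.util.HashMap",
        "java.time.LocalDateTime",
    };

    // Load each class eagerly; Class.forName triggers loading,
    // linking, and static initialization. Returns how many loaded.
    public static int preload() {
        int loaded = 0;
        for (String name : HOT_CLASSES) {
            try {
                Class.forName(name);
                loaded++;
            } catch (ClassNotFoundException e) {
                System.err.println("skipping missing class: " + name);
            }
        }
        return loaded;
    }

    public static void main(String[] args) {
        System.out.println(preload() + " classes preloaded");
    }
}
```

Running this once at startup moves the class-loading cost out of the first user requests.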

  2. Network layer warm-up

At the network level, send a certain amount of warm-up traffic first; this can be dedicated synthetic traffic or normal user requests.

This can generally be done with flow control at the nginx layer. When a newly started node joins the upstream, give it a very low weight so that it receives only a small amount of traffic in the initial stage. This leaves enough computing resources free for code warm-up, i.e. class loading and just-in-time compilation. If the service only provides RPC rather than HTTP, the RPC framework layer can do the traffic warm-up instead. For example, RPC frameworks such as Dubbo already have a service warm-up feature; the idea is the same, in that a newly started node receives only a small amount of traffic at first.
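As a sketch of the nginx approach (the hostnames here are hypothetical), a newly started node can join the upstream with a low weight, to be raised once it is warm:

```nginx
upstream app_backend {
    # Established, already-warm instances
    server app-1.internal:8080 weight=10;
    server app-2.internal:8080 weight=10;
    # Newly started instance: receives only a small share of
    # traffic until its weight is raised after warm-up
    server app-3.internal:8080 weight=1;
}
```

NGINX Plus additionally offers a `slow_start` parameter on the upstream `server` directive that ramps the effective weight up automatically over a configured period.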

Both of the above methods need computing resources, i.e. CPU, for warm-up. If your service hosts have capacity to spare, you can allocate more CPU to each node to speed up warm-up and shorten the warm-up window.

If the network layer, hardware resources, and RPC framework cannot be changed, we can warm up inside the Spring Boot microservice itself. ApplicationReadyEvent is often suggested for this, but listening for ContextRefreshedEvent is actually the better choice: by the time ApplicationReadyEvent fires, the HTTP port has already been initialized and exposed, so unexpected requests may arrive before warm-up has completed.

import org.springframework.context.ApplicationListener;
import org.springframework.context.event.ContextRefreshedEvent;
import org.springframework.stereotype.Component;

@Component
public class StartWarmUpListener implements ApplicationListener<ContextRefreshedEvent> {
    /**
     * Handle an application event.
     *
     * @param event the event to respond to
     */
    @Override
    public void onApplicationEvent(ContextRefreshedEvent event) {
        // Do the warm-up work here: pre-load classes, invoke hot
        // service methods with sample input, etc.
    }
}

Note: the above warm-up code cannot warm up all of the code. Some code paths in the Controller layer cannot actually be executed until the HTTP server is ready, so we can only cover the service layer. In short, this is a compromise.
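A service-layer warm-up can be as simple as calling the hot method repeatedly with representative input, so that class loading is done and the JIT has had a chance to compile the path. The OrderService below is a hypothetical stand-in for your own service bean:

```java
// Hypothetical stand-in for an application service whose hot path
// we want compiled before real traffic arrives.
class OrderService {
    int price(int quantity) {
        return quantity * 3; // placeholder business logic
    }
}

public class ServiceWarmUp {
    // Invoke the hot method many times; returns the accumulated
    // result so the loop is not eliminated as dead code.
    static long warmUp(OrderService service, int iterations) {
        long sink = 0;
        for (int i = 0; i < iterations; i++) {
            sink += service.price(i % 100);
        }
        return sink;
    }

    public static void main(String[] args) {
        OrderService service = new OrderService();
        warmUp(service, 20_000);
        System.out.println("warm-up done, sample price: " + service.price(5));
    }
}
```

In the listener above, the body of onApplicationEvent would call a loop like warmUp against the real service beans.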

user3033075 answered Sep 18 '22