I am doing some experiments for my thesis involving the cold start problem that occurs with containers. My test application is a Spring Boot application that is built on the openjdk image. The first thing I want to try, to resolve the cold start problem, is the following:
Have a container ready that contains the OpenJDK and the libraries the Spring Boot app uses. I then start my other container, using the IPC and network namespace of the already existing container, and want it to be able to use the OpenJDK and the libraries of that container to run the jar file.
I am not exactly sure how to achieve this. Can I achieve it by using volumes, or should I be looking for a completely different approach?
On another note, if I want x containers to run, I will make sure there are x pre-existing containers running. This is to make sure that every container has its own specific library container to work with. Would this be okay?
In short, any way that I can speed up the Spring Boot application by using a second container connected through IPC/net would be helpful for my problem.
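For reference, the namespace sharing described above is expressed with Docker's --ipc and --network flags; a minimal sketch (container and image names are placeholders) might be:

```
# start the "library" container first
docker run -d --name libs-container my-openjdk-libs-image sleep infinity

# start the app container inside the other container's IPC and network namespaces
docker run -d --name app-container \
    --ipc=container:libs-container \
    --network=container:libs-container \
    my-app-image
```

Note that sharing these namespaces does not share the filesystem: the JDK and jars inside libs-container are still not visible to app-container, which is why the answers below discuss volumes and image layers instead.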
You can use the Java 11 image from hub.docker.com by typing the command docker pull openjdk:tag in your machine's terminal, where tag is your intended Java version. Or you can simply specify the image in your Dockerfile, where the FROM instruction must name the Java version you want.
-XX:+UseContainerSupport (the default in recent JDKs) makes the JVM size its heap based on the container's memory limit rather than the host's. To prevent the JVM from adjusting the maximum heap size when running in a container, set -XX:-UseContainerSupport.
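For instance, a container-aware launch of the jar could look like this (the 75% figure is only an illustrative choice; -XX:MaxRAMPercentage is available since JDK 8u191/10):

```
java -XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0 -jar app.jar
```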
Spring boot is a purely "runtime" framework.
If I've got your question right, you describe the following situation:
So, say you have a container A with a JDK and some jars. This alone doesn't mean that you have a running process, though. It's more like a volume with files ready to be reused (or maybe a layer in terms of Docker images).
In addition you have another container B with a Spring Boot application that should be started somehow (probably with the OpenJDK from container A, or its own dedicated JDK).
Now what exactly would you like to "speed up"? The size of the image (a smaller image means faster deployment in a CI/CD pipeline, for example)? The Spring Boot application startup time (the interval between spawning the JVM and the Spring Boot application being up and running)? Or maybe you're trying to load fewer classes at runtime?
The techniques that solve these issues are different. All in all, I think you might want to check out the GraalVM integration, which among other things can create native images and speed up startup time. This stuff is fairly new and I haven't tried it myself. I believe it's work-in-progress and Spring will put effort into pushing it forward (this is only my speculation, so take it with a grain of salt).
Anyway, you may be interested in reading this article
However, I doubt it has something to do with your research as you've described it.
Update 1
Based on your comments, let me give some additional information that might help. This update contains information from "real-life" working experience; I post it because it might help you find directions for your thesis.
So, we have a Spring Boot application to begin with.
By default it's a JAR, which is Pivotal's recommendation; there is also the option of WARs, though as Josh Long, their developer advocate, says: "Make JAR not WAR".
This Spring Boot application usually includes some web server: Tomcat by default for traditional Spring Web MVC applications, but you can switch it to Jetty or Undertow. If you're running a "reactive" application (Spring WebFlux, supported since Spring Boot 2), your default choice is Netty.
One side note: not all Spring Boot applications have to include an embedded web server, but I'll put this subtle point aside since you seem to target the case with web servers (you mention Tomcat, a quicker ability to serve requests, etc., hence my assumption).
Ok, now let's try to analyze what happens when you start a Spring Boot application JAR.
First of all the JVM itself starts: the process is started, the heap is allocated, the internal classes are loaded, and so on. This can take some time (around a second, or even slightly more, depending on the server, the parameters, the speed of your disk, etc.). This thread addresses the question of whether the JVM is really slow to start; I probably won't be able to add more to that.
Ok, so now it's time to load Tomcat's internal classes. This again can take a couple of seconds on modern servers. Netty seems to be faster. You can try to download a standalone distribution of Tomcat and start it up on your machine, or create a sample application without Spring Boot but with embedded Tomcat, to see what I'm talking about.
So far so good; now comes our application. As I said in the beginning, Spring Boot is a purely runtime framework, so the classes of Spring/Spring Boot itself must be loaded, and then the classes of the application itself. If the application uses some libraries, they'll also be loaded, and sometimes custom code even runs during startup: Hibernate may validate and/or scan DB schema definitions and even update the underlying schema, Flyway/Liquibase can execute schema migrations, Swagger might scan controllers and generate documentation, and so on.
In "real life" this process can take a minute or even more, but that's not because of Spring Boot itself; rather, it comes from the beans created in the application that have some code in their constructors/post-construct hooks, which runs during Spring application context initialization. Another side note: I won't really dive into the internals of the Spring Boot startup process here. Spring Boot is an extremely powerful framework with a lot happening under the hood; I assume you've worked with it in one way or another. If not, feel free to ask concrete questions about it and I/my colleagues will try to address them.
If you go to start.spring.io you can create a sample demo application; it will load pretty fast. So it all depends on your application beans.
In this light, what exactly should be optimized?
You've mentioned in comments that there might be a Tomcat already running with some JARs, so that they won't have to be loaded when the Spring Boot application starts.
Well, as our colleagues mentioned, this indeed more resembles the "traditional" web servlet container/application server model that we in the industry have used for ages (around 20 years, more or less).
This kind of deployment indeed has an "always up-and-running" JVM process that is "always" ready to accept WAR files, the package archive of your application.
Once it detects a WAR dropped into some folder, it will "deploy" the application by creating a hierarchical class loader and loading the application's JARs/classes. What's interesting in your context is that it was possible to "share" libraries between multiple WARs so that they were loaded only once. For example, if your Tomcat hosts, say, 3 applications (read: 3 WARs) and all of them use the Oracle database driver, you can put this driver's jar into a shared libs folder and it will be loaded only once, by the class loader that is the "parent" of the class loaders created per WAR. This class loader hierarchy is crucial, but I believe it's outside the scope of the question.
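The parent/child class loader idea can be seen in plain Java; a minimal sketch of the delegation hierarchy:

```java
// Classes shared by everything (like java.lang.String) come from the
// bootstrap loader at the top of the hierarchy, while application classes
// are loaded lower down, by the application class loader.
public class ClassLoaderDemo {
    public static void main(String[] args) {
        // This class was loaded by the application class loader...
        ClassLoader appLoader = ClassLoaderDemo.class.getClassLoader();
        System.out.println("app class loaded by: " + appLoader);

        // ...while core classes come from the bootstrap loader (shown as null).
        System.out.println("String loaded by: " + String.class.getClassLoader());

        // Walking up the parent chain ends at the bootstrap loader (null).
        for (ClassLoader cl = appLoader; cl != null; cl = cl.getParent()) {
            System.out.println("loader in chain: " + cl);
        }
    }
}
```

A servlet container exploits exactly this delegation: jars in the shared parent loader are loaded once, while each WAR's own loader only loads what is specific to it.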
I used to work with both models (Spring Boot with an embedded server, an application without Spring Boot but with an embedded Jetty server, and "old-school" Tomcat/JBoss deployments).
From my experience, and as time proves many of our colleagues agree on this point, Spring Boot applications are much more convenient operations-wise, for many reasons (again, these reasons are out of scope for this question IMO; let me know if you need more on this). That's why it's the current "trend", and "traditional" deployments remain in the industry mostly for non-technical reasons: historical ones, the system being "defined to be" in maintenance mode, an existing deployment infrastructure, a team of "sysadmins" who "know" how to deploy, you name it; but bottom line, nothing purely technical.
Now with all this information you probably understand better why I suggested taking a look at GraalVM, which allows faster application startup by means of native images.
One more point that might be relevant: if you're choosing a technology for fast startup, you're probably also interested in AWS Lambda or the alternatives offered by other cloud providers these days.
This model allows virtually infinite scalability of "computational" power (CPU); under the hood they "start" containers and "kill" them immediately once they detect that a container is doing nothing. For this kind of application Spring Boot is simply not a good fit, and so, basically, is Java in general, again because the JVM process is relatively slow to start: once such a container is started, it takes too long before it becomes operational.
You can read here about what the Spring ecosystem has to offer in this field, but it's not really relevant to your question (I'm just trying to provide directions).
Spring Boot shines when you need an application that might take some time to start, but once started does its job pretty fast. And yes, it's possible to stop the application (we use the terms scale out/scale in) if it's not "occupied" with actual work; this approach is also relatively new (~3-4 years) and works best in "managed" deployment environments like Kubernetes, Amazon ECS, etc.
So if speeding up application start is your goal, I think you need a different approach. Here is a summary of why I think so:
docker: a container is a running instance of an image. You can see an image as a filesystem (it is actually more than that, but we are talking about libraries). In a container you have a JDK (and I guess your image is based on Tomcat). The Docker engine has a very well designed cache system, so containers start very quickly: if no changes are made to an image, Docker only needs to retrieve some info from a cache. Containers are isolated, and for good reasons (security, modularity; and, speaking of libraries, isolation lets you have different versions of a library in different containers). Volumes do not do what you think: they are not designed to share libraries. They let you break isolation for specific purposes; for example, you can create a volume for your codebase so you don't have to rebuild the image for each change during the programming phase, but usually you won't see them in a production environment (maybe for some config files).
java/spring: Spring is a framework based on Java, and Java code runs on a VM shipped with the JDK. So to run a Java program you have to start that VM (there is no other way), and of course you cannot eliminate this startup time. The Java environment is very powerful, but this is also why a lot of people prefer Node.js, especially for little services: Java applications are slow to start by comparison. Spring, as said before, is based on Java, servlets and a context. A Spring application lives in that context, so to run a Spring application you have to initialize that context.
You are running a container; on top of that you are running a VM; then you are initializing a Spring context; and finally you are initializing the beans of your application. These steps are sequential, for dependency reasons. You cannot initialize Docker, the VM and a Spring context and then run your application somewhere else. For example, if you add a filter chain to a Spring application, you need to restart the application, because a filter has to be registered with the servlet system. If you want to speed up the startup process you would need to change the Java VM or make changes to the Spring initialization. In summary, you are trying to deal with this problem at a high level instead of a low level.
To answer your first question:
I am not exactly sure how to achieve this. Can I achieve it by using volumes or should I be looking for a completely different approach?
This has to be balanced against the actual capabilities of your infrastructure.
One thing is, if you care about image and layer size, that is good, and it is definitely a practice advised by Docker; but it all depends on your needs. The recommendation about keeping images and layers small is for images that you'll distribute. If this is your own image for your own application, then you should act upon your needs.
Here is a little bit of my own experience: in a company I was working at, we needed the database to be synchronised back from production to the user acceptance test and developer environments. Because of the size of the production environment, importing the data from an SQL file in the entrypoint of the container took around twenty minutes. This might have been alright for the UAT environment, but it was not for the developers'.
So after trying all sorts of minor improvements to the SQL file (like disabling foreign key checks and the like), I came up with a totally different approach: in a nightly build, I created a big fat image that already contained the database. This is, indeed, against all Docker good practice, but the bandwidth at the office allowed the container to start in a matter of 5 minutes at worst, compared to the twenty before.
So I indeed ended up with a humongous build time for my Docker SQL image, but a download time that was acceptable given the available bandwidth, and a run time reduced to a minimum.
This takes advantage of the fact that the build of an image happens only once, while the start time is paid by every container derived from that image.
To answer your second question:
On another note, if I want x containers to run, I will make sure there are x pre-existing containers running. This is to make sure that every container has its own specific library container to work with. Would this be okay?
I would say the answer is: no.
Even in a micro services architecture, each service should be able to do something on its own. As I understand it, your actual non-library containers are unable to do anything because they are tightly coupled to the pre-existence of another container.
This said there are two things that might be of interest to you:
First: remember that you can always build from another pre-existing image, even your own.
Given this would be your library-container Dockerfile:
FROM openjdk:8-jdk-alpine
ARG JAR_FILE=target/*.jar
COPY ${JAR_FILE} app.jar
ENTRYPOINT ["java","-jar","/app.jar"]
Credits: https://spring.io/guides/topicals/spring-boot-docker/#_a_basic_dockerfile
And that you build it via
docker build -t my/spring-boot .
Then you can have another container built on top of that image:
FROM my/spring-boot
COPY some-specific-lib lib.jar
Secondly: there is a nice technique in Docker to deal with libraries that is called multi-stage builds and that can be used exactly for your case.
FROM openjdk:8-jdk-alpine as build
WORKDIR /workspace/app
COPY mvnw .
COPY .mvn .mvn
COPY pom.xml .
COPY src src
RUN ./mvnw install -DskipTests
RUN mkdir -p target/dependency && (cd target/dependency; jar -xf ../*.jar)
FROM openjdk:8-jdk-alpine
VOLUME /tmp
ARG DEPENDENCY=/workspace/app/target/dependency
COPY --from=build ${DEPENDENCY}/BOOT-INF/lib /app/lib
COPY --from=build ${DEPENDENCY}/META-INF /app/META-INF
COPY --from=build ${DEPENDENCY}/BOOT-INF/classes /app
ENTRYPOINT ["java","-cp","app:app/lib/*","hello.Application"]
Credits: https://spring.io/guides/topicals/spring-boot-docker/#_multi_stage_build
And as you can see in the credits of this multi-stage build, there is even a reference to this technique in the guide of the Spring website.
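Assuming hypothetical tags, building and running the multi-stage image would look like:

```
docker build -t my/app-multistage .
docker run -p 8080:8080 my/app-multistage
```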
The manner in which you are attempting to reach your goal defies the entire point of containerisation.
Let's cycle back and focus firmly on the goal: you are aiming to "resolve the problem of cold start" and to "speed up the spring boot application".
Have you considered actually compiling your Java application to a native binary?
The essence of the JVM is to support Java's portability across host environments. Since containers by their nature already resolve portability, another layer of resolution (by the JVM) is arguably redundant.
Native compilation of your application will factor the JVM out of your application runtime, ultimately resolving the cold start issue. GraalVM is a tool you can use to natively compile a Java application, and there are GraalVM Container Images to support the development of your application container.
Below is a sample Dockerfile that demonstrates building a Docker image for a natively compiled Java application.
# Dockerfile
FROM oracle/graalvm-ce AS builder
LABEL maintainer="Igwe Kalu <[email protected]>"
COPY HelloWorld.java /app/HelloWorld.java
RUN \
set -euxo pipefail \
&& gu install native-image \
&& cd /app \
&& javac HelloWorld.java \
&& native-image HelloWorld
FROM debian:10.4-slim
COPY --from=builder /app/helloworld /app/helloworld
CMD [ "/app/helloworld" ]
# .dockerignore
**/*
!HelloWorld.java
// HelloWorld.java
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, Native Java World!");
    }
}
Build the image and run the container:
# Building...
docker build -t graalvm-demo-debian-v0 .
# Running...
docker run graalvm-demo-debian-v0:latest
## Prints
## Hello, Native Java World!
Spring Tips: The GraalVM Native Image Builder Feature is an article that demonstrates building a Spring Boot application with GraalVM.