
Docker Ruby 2.6.6-alpine is twice as big as Ruby-2.6.5-alpine

A Docker image built on ruby:2.6.6-alpine comes out at 498 MB. However, when I downgrade to ruby:2.6.5-alpine, it goes back to 266 MB. The 2.6.6 image is almost twice as big. Why is that?

# RESULTS IN 266MB
FROM ruby:2.6.5-alpine
RUN apk add --update --no-cache \
        build-base \
        postgresql-dev \
        vim \
        tzdata \
        bash \
        less

# RESULTS IN 498MB
FROM ruby:2.6.6-alpine
RUN apk add --update --no-cache \
        build-base \
        postgresql-dev \
        vim \
        tzdata \
        bash \
        less
asked Jul 25 '20 07:07 by Xtrfox



1 Answer

TL;DR: replace postgresql-dev with 'postgresql-dev<12.2-r0'

As in:

RUN apk add --update --no-cache \
...\
'postgresql-dev<12.2-r0'\
...

With ruby:2.6.6-alpine as the base image, you will get a 305 MB image, close to the 265 MB you get when the base image is ruby:2.6.5-alpine.


Details:

I could not get imagelayers.io to work properly, so I fell back to wagoodman/dive to inspect the layers of the two images.

In both cases, the base image is about 50 MB (49.5 MB for 2.6.5-alpine).

So the base image is not the problem here.

The difference is in the packages: when built from 2.6.6, your dependency list installs the following additional packages:

+Installing xz-libs (5.2.5-r0)
+Installing libxml2 (2.9.10-r4)
+Installing llvm10-libs (10.0.0-r2)
+Installing clang-libs (10.0.0-r2)
+Installing clang (10.0.0-r2)
+Installing llvm10 (10.0.0-r2)
+Installing icu-libs (67.1-r0)
+Installing icu (67.1-r0)
+Installing icu-dev (67.1-r0)

That is due to the difference between:

  • docker ruby 2.6.6 alpine based on alpine:3.12
  • docker ruby 2.6.5 alpine based on alpine 3.11 (or 3.9)
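One way to obtain such a list yourself is to capture apk's install log on each Alpine release and compare the package names. This is only a minimal sketch, assuming the log lines have the "Installing NAME (VERSION)" shape shown above; the helper names installed_names and only_in_second, and the log file names in the usage comment, are hypothetical.

```shell
#!/bin/sh
# Extract package names from apk log lines of the form
# "(n/m) Installing NAME (VERSION)" and print them sorted and de-duplicated.
installed_names() {
    sed -n 's/.*Installing \([^ ]*\) (.*/\1/p' "$1" | sort -u
}

# Print the packages that appear only in the second log.
only_in_second() {
    old=$(mktemp)
    new=$(mktemp)
    installed_names "$1" > "$old"
    installed_names "$2" > "$new"
    # comm -13 suppresses lines unique to file 1 and lines common to both.
    comm -13 "$old" "$new"
    rm -f "$old" "$new"
}

# Hypothetical usage (requires Docker; file names are placeholders):
#   docker run --rm alpine:3.11 apk add --simulate --no-cache postgresql-dev > alpine-3.11.log
#   docker run --rm alpine:3.12 apk add --simulate --no-cache postgresql-dev > alpine-3.12.log
#   only_in_second alpine-3.11.log alpine-3.12.log
```

Comparing names rather than full lines keeps version bumps (for example a newer libpq) out of the result, so only genuinely new packages are reported.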

I added this one-liner (taken from a gist) to your Dockerfile to list the installed packages by size:

RUN apk info | xargs -n1 -I{} apk info -s {} | xargs -n4 | awk '{print $4,$1}' | sort -rn

With 2.6.5, I see:

85360640 gcc-9.2.0-r4

vs., with 2.6.6:

109056000 gcc-9.3.0-r2
63430656 llvm10-libs-10.0.0-r2
62976000 clang-libs-10.0.0-r2
  • gcc is bigger (from 85 MB to 109 MB)
  • llvm10-libs and clang-libs add about 125 MB
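To double-check that arithmetic, the byte counts printed by the size listing can be summed with a few lines of awk. This is only a sketch: sum_mb is a hypothetical helper name, and it reports decimal megabytes (1 MB = 1,000,000 bytes), which is an assumption about the unit used in this answer.

```shell
#!/bin/sh
# Sum the byte counts in "SIZE NAME" lines read from stdin and print the
# total in decimal megabytes, rounded to one decimal place.
sum_mb() {
    awk '{ total += $1 } END { printf "%.1f\n", total / 1000000 }'
}

# Figures quoted above for the two packages that only show up with 2.6.6.
sum_mb <<'EOF'
63430656 llvm10-libs-10.0.0-r2
62976000 clang-libs-10.0.0-r2
EOF
```

With the two figures quoted above this prints 126.4, in line with the roughly 125 MB mentioned for llvm10-libs and clang-libs.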

As illustrated in "The Quest for Minimal Docker Images, part 2" by Jérôme Petazzoni, it is best to use a multi-stage build like:

# Stage 0: build the binary with the full toolchain
FROM alpine
RUN apk add build-base
COPY hello.c .
RUN gcc -o hello hello.c

# Final stage: copy only the binary, leaving build-base behind
FROM alpine
COPY --from=0 hello .
CMD ["./hello"]

Notes from looking for the root cause of the size increase:

  • build-base for Alpine 3.9 has 7 dependencies
  • build-base for Alpine 3.11 has 7 dependencies
  • build-base for Alpine 3.12 has 8.

The additional dependency is patch, which itself only depends on musl and is very small.
So the size issue is not related to build-base.

Let's change your Dockerfile to:

RUN apk add --update
RUN apk info build-base
RUN apk add --update --no-cache build-base
RUN apk add --update --no-cache postgresql-dev
RUN apk add --update --no-cache vim
RUN apk add --update --no-cache tzdata
RUN apk add --update --no-cache bash
RUN apk add --update --no-cache less

dive then shows that postgresql-dev is the package pulling in clang on Alpine 3.12.

  • postgresql-dev for Alpine 3.11 has 5 dependencies
  • postgresql-dev for Alpine 3.12 has 8

Those three additional dependencies are:

  • clang (25 MB)
  • icu-dev
  • llvm10 (which itself depends on llvm10-libs, 60 MB)

Between Alpine 3.10+ and 3.12, installing postgresql came to mean pulling in clang, and the difference in size for postgresql-dev between the two is massive:

  • Alpine 3.11 or lower: 15 MB for postgresql-dev
  • Alpine 3.12: 215 MB for postgresql-dev

That is because of a recent change in dependencies for the Alpine postgresql APKBUILD.
See:

  • commit f040d3a from Jakub Jirutka adding clang and llvm.
  • commit 2bf48a5 from Ariadne Conill adding icu-dev.

The comment of the first commit mentions:

Since we build PostgreSQL with JIT support enabled, clang and llvm-lto are required for building extensions with PGXS.

Probably because of docker-library/postgres issue 475: "JIT --with-llvm"

... which leads to docker-library/postgres issue 651: "postgres:12.0-alpine upgrade to postgres:12.1-alpine double size".

See "Postgresql 12: What Is JIT compilation?"

Just-in-Time (JIT) compilation is the process of turning some form of interpreted program evaluation into a native program, and doing so at run time.

For example, instead of using general-purpose code that can evaluate arbitrary SQL expressions to evaluate a particular SQL predicate like WHERE a.col = 3, it is possible to generate a function that is specific to that expression and can be natively executed by the CPU, yielding a speedup.

PostgreSQL has builtin support to perform JIT compilation using LLVM when PostgreSQL is built with --with-llvm.

All that because now:

For PostgreSQL 12 systems that support LLVM, just-in-time compilation, aka "JIT," is enabled by default.

This is reflected in how postgresql is built for Docker: see commit c8bf23b from issue 643, and merged in docker-library/official-images PR 7042.

So... possible workaround: limit the version of postgresql-dev, as explained in "How to install a specific package version in Alpine?":

RUN apk add --update --no-cache 'postgresql-dev<12.2-r0'

That makes sure you get the 15 MB postgresql-dev instead of the 215 MB one.

answered Nov 03 '22 10:11 by VonC