Find the source code for computing size of a docker image

I have heard that the reported size of a Docker image is not equal to the sum of the sizes of all the layers inside it, and that it is also not the amount of disk space the image occupies.

Now I want to check the logic in the source code (in this repo: https://github.com/docker/docker-ce), because seeing is believing! But after spending a lot of time navigating the code, I was not able to find the code that actually computes the image size.

So which function or file does Docker use to compute the size?

asked Apr 23 '20 15:04

Wallace

1 Answer

Before digging too deep, you may find it useful to understand how Linux implements the overlay filesystem. I include a bit on this in the first exercise of the build section of my intro presentation. The demo notes include each of the commands I'm running and give you an idea of how layers are merged and what happens when you add, modify, or delete files in a layer.

The size calculation is implementation dependent, based on your host OS and the graph driver being used. I'm taking the example of a Linux OS and overlay2, since that's the most common use case.

It starts by looking at the image layer storage size:

// GetContainerLayerSize returns the real size & virtual size of the container.
func (i *ImageService) GetContainerLayerSize(containerID string) (int64, int64) {
    var (
        sizeRw, sizeRootfs int64
        err                error
    )

    // Safe to index by runtime.GOOS as Unix hosts don't support multiple
    // container operating systems.
    rwlayer, err := i.layerStores[runtime.GOOS].GetRWLayer(containerID)
    if err != nil {
        logrus.Errorf("Failed to compute size of container rootfs %v: %v", containerID, err)
        return sizeRw, sizeRootfs
    }
    defer i.layerStores[runtime.GOOS].ReleaseRWLayer(rwlayer)

    sizeRw, err = rwlayer.Size()
    if err != nil {
        logrus.Errorf("Driver %s couldn't return diff size of container %s: %s",
            i.layerStores[runtime.GOOS].DriverName(), containerID, err)
        // FIXME: GetSize should return an error. Not changing it now in case
        // there is a side-effect.
        sizeRw = -1
    }

    if parent := rwlayer.Parent(); parent != nil {
        sizeRootfs, err = parent.Size()
        if err != nil {
            sizeRootfs = -1
        } else if sizeRw != -1 {
            sizeRootfs += sizeRw
        }
    }
    return sizeRw, sizeRootfs
}
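
To make the arithmetic at the end of that function easier to follow, here is a minimal standalone sketch of how the two return values are combined. The helper name combineSizes and its parameters are hypothetical and not part of Docker; a negative input stands in for "the driver returned an error".

```go
package main

import "fmt"

// combineSizes is a hypothetical helper that mirrors the arithmetic in
// GetContainerLayerSize: rw is the size of the container's writable layer,
// parent is the size of the image layers below it. A negative input stands
// for "the driver could not compute this size".
func combineSizes(rw, parent int64, hasParent bool) (sizeRw, sizeRootfs int64) {
	sizeRw = rw
	if sizeRw < 0 {
		sizeRw = -1 // same sentinel the excerpt uses on error
	}
	if !hasParent {
		return sizeRw, 0 // no parent layer: sizeRootfs keeps its zero value
	}
	if parent < 0 {
		return sizeRw, -1
	}
	sizeRootfs = parent
	if sizeRw != -1 {
		sizeRootfs += sizeRw // virtual size = image layers + writable layer
	}
	return sizeRw, sizeRootfs
}

func main() {
	rw, rootfs := combineSizes(10, 100, true)
	fmt.Println(rw, rootfs) // 10 110

	rw, rootfs = combineSizes(-1, 100, true)
	fmt.Println(rw, rootfs) // -1 100
}
```

So the second value is the size of the parent (image) layers plus the writable layer, and -1 acts as an error sentinel rather than a real size.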

That function indexes into layerStores, which is a map from operating system name (the runtime.GOOS key above) to layer.Store:

// ImageServiceConfig is the configuration used to create a new ImageService
type ImageServiceConfig struct {
    ContainerStore            containerStore
    DistributionMetadataStore metadata.Store
    EventsService             *daemonevents.Events
    ImageStore                image.Store
    LayerStores               map[string]layer.Store
    MaxConcurrentDownloads    int
    MaxConcurrentUploads      int
    MaxDownloadAttempts       int
    ReferenceStore            dockerreference.Store
    RegistryService           registry.Service
    TrustKey                  libtrust.PrivateKey
}

Digging into the layer.Store implementation for GetRWLayer, there is the following definition:

func (ls *layerStore) GetRWLayer(id string) (RWLayer, error) {
    ls.locker.Lock(id)
    defer ls.locker.Unlock(id)

    ls.mountL.Lock()
    mount := ls.mounts[id]
    ls.mountL.Unlock()
    if mount == nil {
        return nil, ErrMountDoesNotExist
    }

    return mount.getReference(), nil
}
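
The lookup pattern there (take the lock, read the map, release the lock, then check for nil) can be tried in isolation. This is only a sketch under simplified assumptions: the store and mount types and the get method are hypothetical, and it leaves out the per-ID locker and the reference handling that the real layerStore does.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errMountDoesNotExist plays the role of ErrMountDoesNotExist in the excerpt.
var errMountDoesNotExist = errors.New("mount does not exist")

// mount and store are hypothetical, simplified stand-ins for mountedLayer
// and layerStore.
type mount struct{ name string }

type store struct {
	mu     sync.Mutex
	mounts map[string]*mount
}

// get holds the mutex only while reading the map, then checks the result.
func (s *store) get(id string) (*mount, error) {
	s.mu.Lock()
	m := s.mounts[id]
	s.mu.Unlock()
	if m == nil {
		return nil, errMountDoesNotExist
	}
	return m, nil
}

func main() {
	s := &store{mounts: map[string]*mount{"c1": {name: "rw-layer-of-c1"}}}

	m, err := s.get("c1")
	fmt.Println(m.name, err) // rw-layer-of-c1 <nil>

	_, err = s.get("missing")
	fmt.Println(err) // mount does not exist
}
```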

Following that to find the Size implementation for the mount reference, there is this function that gets into the specific graph driver:

func (ml *mountedLayer) Size() (int64, error) {
    return ml.layerStore.driver.DiffSize(ml.mountID, ml.cacheParent())
}

Looking at the overlay2 graph driver to find the DiffSize function:

func (d *Driver) DiffSize(id, parent string) (size int64, err error) {
    if useNaiveDiff(d.home) || !d.isParent(id, parent) {
        return d.naiveDiff.DiffSize(id, parent)
    }
    return directory.Size(context.TODO(), d.getDiffPath(id))
}

When it falls back to naiveDiff, that calls the DiffSize method of NaiveDiffDriver in the graphdriver package:

func (gdw *NaiveDiffDriver) DiffSize(id, parent string) (size int64, err error) {
    driver := gdw.ProtoDriver

    changes, err := gdw.Changes(id, parent)
    if err != nil {
        return
    }

    layerFs, err := driver.Get(id, "")
    if err != nil {
        return
    }
    defer driver.Put(id)

    return archive.ChangesSize(layerFs.Path(), changes), nil
}

Following archive.ChangesSize, we can see this implementation:

// ChangesSize calculates the size in bytes of the provided changes, based on newDir.
func ChangesSize(newDir string, changes []Change) int64 {
    var (
        size int64
        sf   = make(map[uint64]struct{})
    )
    for _, change := range changes {
        if change.Kind == ChangeModify || change.Kind == ChangeAdd {
            file := filepath.Join(newDir, change.Path)
            fileInfo, err := os.Lstat(file)
            if err != nil {
                logrus.Errorf("Can not stat %q: %s", file, err)
                continue
            }

            if fileInfo != nil && !fileInfo.IsDir() {
                if hasHardlinks(fileInfo) {
                    inode := getIno(fileInfo)
                    if _, ok := sf[inode]; !ok {
                        size += fileInfo.Size()
                        sf[inode] = struct{}{}
                    }
                } else {
                    size += fileInfo.Size()
                }
            }
        }
    }
    return size
}

At this point the code uses os.Lstat to get a FileInfo struct, which includes Size, for every entry that was added or modified in the layer. Directories are skipped, and a file with hard links is counted only once per inode. Note that this is one of several possible paths the code takes, but I believe it's one of the more common ones for this scenario.
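
To see the counting rule without touching a real filesystem, here is an in-memory sketch of the same loop. The entry type and its fields are hypothetical stand-ins for a Change plus its os.Lstat result, and it assumes that "has hard links" means a link count greater than one.

```go
package main

import "fmt"

type changeKind int

const (
	changeModify changeKind = iota
	changeAdd
	changeDelete
)

// entry is a hypothetical stand-in for one change together with the file
// metadata that os.Lstat would report for it.
type entry struct {
	kind  changeKind
	isDir bool
	inode uint64
	nlink uint64 // link count; assumed > 1 means the file has hard links
	size  int64
}

// changesSize follows the same rules as the excerpt: only adds and modifies
// count, directories are skipped, hard-linked inodes are counted once.
func changesSize(entries []entry) int64 {
	var size int64
	seen := make(map[uint64]struct{})
	for _, e := range entries {
		if e.kind != changeModify && e.kind != changeAdd {
			continue
		}
		if e.isDir {
			continue
		}
		if e.nlink > 1 {
			if _, ok := seen[e.inode]; ok {
				continue // this inode was already counted via another path
			}
			seen[e.inode] = struct{}{}
		}
		size += e.size
	}
	return size
}

func main() {
	entries := []entry{
		{kind: changeAdd, inode: 1, nlink: 1, size: 100},
		{kind: changeAdd, inode: 2, nlink: 2, size: 50},    // first path to inode 2
		{kind: changeModify, inode: 2, nlink: 2, size: 50}, // second path, skipped
		{kind: changeAdd, isDir: true, inode: 3, nlink: 2, size: 4096},
		{kind: changeDelete, inode: 4, nlink: 1, size: 999},
	}
	fmt.Println(changesSize(entries)) // 150
}
```

The deleted file and the directory contribute nothing, and the two paths that share inode 2 add 50 bytes only once.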

answered Oct 21 '22 17:10

BMitch

Tags: docker, go, docker-ce
