Is it possible to have ephemeral, on-disk pod storage on Google Kubernetes Engine?

I would like the containers in my pod to share a volume for temporary (cached) data. I don't mind if the data is lost when the pod terminates (in fact, I want the data deleted and space reclaimed).

The Kubernetes docs make an emptyDir sound like what I want:

An emptyDir volume is first created when a Pod is assigned to a Node, and exists as long as that Pod is running on that node

.. and

By default, emptyDir volumes are stored on whatever medium is backing the node - that might be disk or SSD or network storage, depending on your environment. However, you can set the emptyDir.medium field to "Memory" to tell Kubernetes to mount a tmpfs (RAM-backed filesystem) for you instead

That sounds like the default behaviour is to store the volume on disk, unless I explicitly request in-memory.
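To make that reading of the docs concrete, here is a minimal sketch that writes the two variants side by side. The volume names `foo-disk` and `foo-ram` and the output path are made up for illustration; only `emptyDir: {}` and `emptyDir.medium: Memory` come from the quoted documentation.

```shell
#!/bin/sh
# Write two emptyDir volume definitions to a file for comparison.
# OUT_DIR and the volume names are hypothetical, chosen for this sketch.
out=${OUT_DIR:-/tmp}/emptydir-volumes.yaml

cat > "$out" <<'EOF'
volumes:
# Default: stored on whatever medium backs the node (disk, SSD, ...).
- name: foo-disk
  emptyDir: {}
# Explicit opt-in: tmpfs, a RAM-backed filesystem.
- name: foo-ram
  emptyDir:
    medium: Memory
EOF

# Show the fragment; it would go under spec: in a Pod manifest.
cat "$out"
```

If the docs are right, only the second form should ever consume RAM for file contents.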

However, if I create the following pod on my GKE cluster:

apiVersion: v1
kind: Pod
metadata:
  name: alpine
spec:
  containers:
  - name: alpine
    image: alpine:3.7
    command: ["/bin/sh", "-c", "sleep 60m"]
    volumeMounts:
      - name: foo
        mountPath: /foo
  volumes:
  - name: foo
    emptyDir: {}

.. and then open a shell on the pod and write a 2 GB file to the volume:

kubectl exec -it alpine -- /bin/sh
$ cd foo/
$ dd if=/dev/zero of=file.txt count=2048 bs=1048576

Then I can see in the GKE web console that the RAM usage of the container has increased by 2 GB:

[Image: memory increase in the alpine container]

It looks to me like GKE stores emptyDir volumes in memory by default. The workload I plan to run needs plenty of memory, so I'd like the emptyDir volume to be backed by disk - is that possible? The GKE storage docs don't have much to say on the issue.

An alternative approach might be to use a local SSD for my cached data. However, if I mount local SSDs as recommended in the GKE docs, they're shared by all pods running on the same node and the data isn't cleaned up on pod termination, which doesn't meet my goal of automatically managed resources.

Mounts

Here's the output of df -h inside the container:

# df -h
Filesystem                Size      Used Available Use% Mounted on
overlay                  96.9G     26.2G     70.7G  27% /
overlay                  96.9G     26.2G     70.7G  27% /
tmpfs                     7.3G         0      7.3G   0% /dev
tmpfs                     7.3G         0      7.3G   0% /sys/fs/cgroup
/dev/sda1                96.9G     26.2G     70.7G  27% /foo
/dev/sda1                96.9G     26.2G     70.7G  27% /dev/termination-log
/dev/sda1                96.9G     26.2G     70.7G  27% /etc/resolv.conf
/dev/sda1                96.9G     26.2G     70.7G  27% /etc/hostname
/dev/sda1                96.9G     26.2G     70.7G  27% /etc/hosts
shm                      64.0M         0     64.0M   0% /dev/shm
tmpfs                     7.3G     12.0K      7.3G   0% /run/secrets/kubernetes.io/serviceaccount
tmpfs                     7.3G         0      7.3G   0% /proc/kcore
tmpfs                     7.3G         0      7.3G   0% /proc/timer_list
tmpfs                     7.3G         0      7.3G   0% /proc/sched_debug
tmpfs                     7.3G         0      7.3G   0% /sys/firmware

The View from the Node

I discovered it's possible to SSH into the node instance, and I was able to find the 2 GB file on the node filesystem:

root@gke-cluster-foo-pool-b-22bb9925-xs5p:/# find . -name file.txt
./var/lib/kubelet/pods/79ad1aa4-4441-11e8-af32-42010a980039/volumes/kubernetes.io~empty-dir/foo/file.txt

Now that I can see it is being written to the underlying filesystem, I'm wondering whether the RAM usage I'm seeing in the GKE web UI is the Linux filesystem cache or similar, rather than the file being stored in a RAM disk?

asked Apr 20 '18 00:04

James Healy

1 Answer

From the mount information you've supplied, the emptyDir volume is mounted on a disk partition (/dev/sda1 at /foo), so it's working as intended and isn't mounted in memory. The memory usage you see is most likely the filesystem buffer cache: the kernel keeps recently written data in RAM, writes it out to disk in the background, and reclaims the cached pages once there is memory pressure. Given that you have so much free memory, the system saw no need to release them immediately.
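The same check can be scripted rather than read off the df output. The following is a minimal sketch, assuming a Linux container whose `stat` supports `-f -c %T` (GNU coreutils does; BusyBox builds usually do). The helper names `fs_type` and `is_ram_backed` are made up for this example.

```shell
#!/bin/sh
# Print the filesystem type backing a path.
# A RAM-backed emptyDir (medium: Memory) reports "tmpfs"; a disk-backed
# one reports the node's filesystem type instead.
fs_type() {
    stat -f -c %T "$1"
}

# Hypothetical helper: succeeds only if the path lives on tmpfs.
is_ram_backed() {
    [ "$(fs_type "$1")" = "tmpfs" ]
}

# /foo is the mount path from the question; fall back to / elsewhere.
target=/foo
[ -d "$target" ] || target=/

if is_ram_backed "$target"; then
    echo "$target is on tmpfs (RAM-backed)"
else
    echo "$target is on $(fs_type "$target") (not tmpfs)"
fi
```

Run inside the pod, this should report a non-tmpfs type for /foo if the volume is disk-backed, as the df output suggests.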

If you still have doubts, run sync followed by echo 3 > /proc/sys/vm/drop_caches on the node: sync flushes dirty data to disk, and drop_caches then releases the clean cached pages. You should see memory usage drop.
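A rough way to watch the cache effect is to compare the `Cached` figure in /proc/meminfo around a write. This is only a sketch: the `meminfo_kb` helper, the `DEMO_DIR` variable, and the 64 MiB size are assumptions for illustration, and /proc/meminfo normally reports node-wide numbers rather than per-container ones, so other activity on the node will add noise.

```shell
#!/bin/sh
# Read one field (in kB) from /proc/meminfo, e.g. Cached or Dirty.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' /proc/meminfo
}

# DEMO_DIR is a hypothetical knob: set it to /foo inside the pod.
dir=${DEMO_DIR:-/tmp}
file="$dir/cache-demo.bin"

before=$(meminfo_kb Cached)
# Write 64 MiB of zeros; smaller than the 2 GB in the question, but
# enough to show up in the page cache figure.
dd if=/dev/zero of="$file" bs=1048576 count=64 2>/dev/null
after=$(meminfo_kb Cached)
echo "Cached before write: ${before} kB, after write: ${after} kB"

# Flush dirty pages to disk. The data stays cached until it is dropped.
sync
# Dropping the clean cache needs root on the node:
#   echo 3 > /proc/sys/vm/drop_caches
echo "Cached after sync: $(meminfo_kb Cached) kB"

rm -f "$file"
```

If the volume is disk-backed, `Cached` should rise by roughly the size of the file after the write and fall again only once the caches are dropped or the file is removed.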

answered Oct 04 '22 20:10

hexacyanide

Tags: docker, google-cloud-platform, kubernetes, google-kubernetes-engine