
Is read/write performance better with docker volumes on windows (inside of a docker container only) or a mounted / shared volume with host OS?

I have read that there is a significant performance hit when mounting shared volumes on Windows. How does this compare to keeping, say, the Postgres DB inside a Docker volume (not shared with the host OS), and how does it affect the rate of reading from and writing to flat files?

Has anyone found any concrete numbers around this? I think even a 4x slowdown would be acceptable for my use case if it only affects disk I/O performance. I get the impression that mounted, shared volumes are significantly slower on Windows, so I want to know whether forgoing the sharing component would bring performance into an acceptable range.

Also, if I leave Postgres on bare metal, can all of my Docker apps still access it that way? That would probably be preferred, since I have seen reports of 4x faster reads and writes when staying on bare metal. I still need to know about volumes, though, because my apps also do a lot of copying, reading, and moving of flat files, so I need to know what is best for that.

For example, if shared volumes are really bad vs keeping it only on the container, then I have options to push files over the network to avoid the need for a shared mounted volume as a bottleneck...

Thanks for any insights

AustEcon asked Jun 21 '20 01:06




1 Answer

You only pay this performance cost for bind-mounted host directories. Named Docker volumes or the Docker container filesystem will be much faster. The standard Docker Hub database images are configured to always use a volume for storage, so you should use a named volume for this case.

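# Create a named volume managed by Docker (not a bind-mounted host directory)
docker volume create pgdata
# The official image needs POSTGRES_PASSWORD on first start; "example" is a placeholder
docker run -v pgdata:/var/lib/postgresql/data -e POSTGRES_PASSWORD=example -p 5432:5432 postgres:12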
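If you want numbers for your own machine, a rough way to get them is to time the same write against both mount types. The following is only a sketch: the volume name benchvol, the directory benchdir, and the 256 MiB size are assumptions made for this example, and a single sequential dd write is a crude proxy for real database or many-small-files workloads.

```shell
#!/bin/sh
# Sketch: time the same sequential write into a named volume and a bind mount.
# "benchvol" and "benchdir" are hypothetical names chosen for this example.

# Build the in-container command: write N MiB of zeros, flush, report elapsed time.
bench_cmd() {
    printf 'time sh -c "dd if=/dev/zero of=/data/bench.bin bs=1M count=%s && sync"' "$1"
}

# Run the write test with the given mount source (volume name or host path).
run_bench() {
    docker run --rm -v "$1:/data" alpine sh -c "$(bench_cmd "$2")"
}

if command -v docker >/dev/null 2>&1; then
    mkdir -p benchdir
    docker volume create benchvol >/dev/null
    echo "named volume:"; run_bench benchvol 256
    echo "bind mount:";   run_bench "$PWD/benchdir" 256
    # Clean up the temporary volume and directory.
    docker volume rm benchvol >/dev/null
    rm -rf benchdir
else
    echo "docker not found; skipping benchmark" >&2
fi
```

Run it from a POSIX-style shell in a directory that Docker Desktop is allowed to share, and repeat it a few times, since a single run can vary a lot.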

You can also run PostgreSQL directly on the host. On systems using the Docker Desktop application, containers can reach it via the special hostname host.docker.internal. This is discussed at length in "From inside of a Docker container, how do I connect to the localhost of the machine?".
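As a quick check that a container can reach a host-side server, something like the following may help. It is a sketch under assumptions: Docker Desktop is in use, the host PostgreSQL accepts TCP connections on port 5432, and the user and database names (both "postgres" here) are placeholders. The helper host_pg_uri is made up for this example.

```shell
#!/bin/sh
# Sketch: reach a PostgreSQL server running on the host from inside a container.
# Assumes Docker Desktop (which provides host.docker.internal); the user and
# database names passed below are placeholders.

# Build a libpq connection URI that points at the host machine.
host_pg_uri() {
    printf 'postgresql://%s@host.docker.internal:5432/%s' "$1" "$2"
}

if command -v docker >/dev/null 2>&1; then
    # "-e PGPASSWORD" forwards PGPASSWORD from the calling shell, if it is set.
    docker run --rm -e PGPASSWORD postgres:12 \
        psql "$(host_pg_uri postgres postgres)" -c 'SELECT version();' \
        || echo "query failed; check that the host server accepts this connection" >&2
fi
```

If the query is refused, the host server's listen address and client authentication settings are the first things to look at.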

If you're using the Docker Desktop application, and you're using volumes for:

  • Opaque database storage, like the PostgreSQL data: use a named volume; it will be faster, and you couldn't usefully access the data directly even if it were on the host
  • Injecting individual config files: use a bind mount; these are usually only read once at startup so there's not much of a performance cost
  • Exporting log files: use a bind mount; if there is enough log I/O to be a performance problem you're probably actively debugging
  • Your application source code: don't use a volume at all, run the code that's in the image, or use a native host development environment
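
Put together, the list above might translate into a single docker run like the one this sketch prints. Everything specific in it is hypothetical: the image myapp:latest, the volume appdata, and the container paths are stand-ins, not names from any real project.

```shell
#!/bin/sh
# Sketch: one "docker run" that follows the list above.
# "myapp:latest", "appdata" and all paths are hypothetical.

# Print the docker command so it can be inspected before running it.
# $1 is the host directory holding app.conf and logs/ (assumed to have no spaces).
app_run_cmd() {
    printf 'docker run -d'
    printf ' -v appdata:/var/lib/app'                   # opaque data: named volume
    printf ' -v %s/app.conf:/etc/app/app.conf:ro' "$1"  # config file: read-only bind mount
    printf ' -v %s/logs:/var/log/app' "$1"              # log files: bind mount
    printf ' myapp:latest\n'                            # source code: already in the image
}

app_run_cmd "$PWD"
```

Only the config file and the log directory cross the host boundary here; the data that is read and written heavily stays in the named volume.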

David Maze answered Nov 03 '22 00:11