Is read/write performance better with Docker volumes on Windows (inside of a Docker container only) or with a mounted/shared volume from the host OS?

Tags: performance, docker, windows, mount, volumes

I have read that there is a significant performance hit when mounting shared volumes on Windows. How does this compare to keeping, say, the Postgres DB inside a Docker volume (not shared with the host OS), or to the rate of reading and writing flat files?

Has anyone found any concrete numbers around this? I think even a 4x slowdown would be acceptable for my use case if it only affects disk I/O performance. I get the impression that mounted, shared volumes are significantly slower on Windows, so I want to know whether forgoing the sharing component would bring performance into an acceptable range.

Also, if I left Postgres on bare metal, could all of my Docker apps still access it that way? (I imagine that's probably preferred; I have seen reports of 4x faster reads and writes when staying on bare metal.) I still need to know, though, because my apps also do a lot of copying, reading, and moving of flat files, so I need to know what is best for that.

For example, if shared volumes are really bad compared with keeping files only in the container, then I have the option to push files over the network and avoid a shared mounted volume as a bottleneck.

Thanks for any insights

523

asked Jun 21 '20 01:06

AustEcon

1 Answer

You only pay this performance cost for bind-mounted host directories. Named Docker volumes or the Docker container filesystem will be much faster. The standard Docker Hub database images are configured to always use a volume for storage, so you should use a named volume for this case.

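# Create a named volume and mount it on the image's data directory.
# POSTGRES_PASSWORD is required by the image on first start; "change-me" is a placeholder.
docker volume create pgdata
docker run -v pgdata:/var/lib/postgresql/data -p 5432:5432 -e POSTGRES_PASSWORD=change-me postgres:12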
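
For concrete numbers on a particular machine, a rough measurement is easy to script. The following is only a sketch and not part of the original answer: the `alpine` image, the volume name `benchvol`, the `./bench-bind` directory, and the 256 MiB file size are all assumptions chosen for illustration. It writes the same file once to a named volume and once to a bind-mounted host directory, and `dd` reports the time and rate for each.

```shell
#!/bin/sh
# Rough write-throughput comparison: named volume vs. bind mount.
# Assumed for illustration: alpine image, volume "benchvol",
# host directory ./bench-bind, 256 MiB test file.

# Build the dd command that runs inside the container.
# conv=fsync makes dd flush to storage before it reports its timing.
dd_cmd() {
    echo "dd if=/dev/zero of=/data/testfile bs=1M count=${1:-256} conv=fsync"
}

# Run the write test against whatever is mounted at /data.
# $1 is the -v argument, e.g. "benchvol:/data" or "/some/host/path:/data".
bench() {
    docker run --rm -v "$1" alpine sh -c "$(dd_cmd) && rm /data/testfile"
}

if command -v docker >/dev/null 2>&1; then
    docker volume create benchvol >/dev/null
    mkdir -p ./bench-bind
    echo "named volume:"; bench benchvol:/data
    echo "bind mount:";   bench "$(pwd)/bench-bind:/data"
    docker volume rm benchvol >/dev/null
else
    echo "docker not found; nothing measured"
fi
```

This assumes a POSIX shell (for example inside WSL). A single sequential write is a crude proxy; a workload with many small files may behave quite differently, so treat the result as a first indication only.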

You can also run PostgreSQL directly on the host. On systems using the Docker Desktop application, containers can reach it via the special hostname host.docker.internal. This is discussed at length in the question "From inside of a Docker container, how do I connect to the localhost of the machine?".
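
As a minimal sketch of what that looks like in practice (the user `app` and database `appdb` are hypothetical, and 5432 is simply the PostgreSQL default port), an application container would be given a connection URI that names `host.docker.internal` as the server, and `pg_isready` from the `postgres` image can check whether anything answers there:

```shell
#!/bin/sh
# Reaching a host-side PostgreSQL from a container on Docker Desktop.
# Hypothetical names: user "app", database "appdb".

# Build a libpq-style connection URI that points at the host.
# $1 = user, $2 = database, $3 = port (defaults to 5432).
pg_host_uri() {
    printf 'postgresql://%s@host.docker.internal:%s/%s\n' "$1" "${3:-5432}" "$2"
}

# The URI an application container would be configured with.
pg_host_uri app appdb

# pg_isready ships in the postgres image and reports whether a server
# accepts connections at the given host and port.
if command -v docker >/dev/null 2>&1; then
    docker run --rm postgres:12 pg_isready -h host.docker.internal -p 5432
else
    echo "docker not found; skipping the reachability check"
fi
```

The host's PostgreSQL still has to accept the connection, so its own listen and authentication settings apply as usual.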

If you're using the Docker Desktop application, and you're using volumes for:

- Opaque database storage, like the PostgreSQL data: use a named volume; it will be faster, and you can't usefully access the data directly even if you did have it on the host.
- Injecting individual config files: use a bind mount; these are usually only read once at startup, so there's not much of a performance cost.
- Exporting log files: use a bind mount; if there is enough log I/O to be a performance problem, you're probably actively debugging.
- Your application source code: don't use a volume at all; run the code that's in the image, or use a native host development environment.
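
Put together, the config, log, and source-code choices above might look like this for a single application container. This is a sketch with placeholder names only: the image `my-app:latest` and every path are hypothetical, so the function prints the command rather than running it.

```shell
#!/bin/sh
# Mount choices for one hypothetical application container.
# The image name (my-app:latest) and all paths are placeholders.

# $1 = host base directory (defaults to the current directory).
app_run_cmd() {
    base=${1:-$(pwd)}
    # Config file: read-only bind mount, read once at startup.
    # Logs: bind mount, so they can be inspected from the host.
    # Source code: no mount at all; the container runs what is in the image.
    printf 'docker run -d -v %s/config/app.yml:/etc/my-app/app.yml:ro -v %s/logs:/var/log/my-app my-app:latest\n' \
        "$base" "$base"
}

# Print the command instead of executing it, because the image is hypothetical.
app_run_cmd
```

The database itself stays on a named volume in its own container, so none of the bind mounts here sit on the I/O-heavy path.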

105

answered Nov 03 '22 00:11

David Maze
