This question has been bugging me a bit because I want to follow best practices, and by not knowing what each command is meant for I might make an architectural mistake. So, can anyone explain to me when to use a RUN command in a Dockerfile vs the command directive in a docker-compose.yml file?
As I understand it a Dockerfile should contain instructions to build your image so by that logic commands which for instance migrate a database should be in here. A docker-compose.yml file contains instructions to build the environment. What I'm confused about though is why there seem to be options to put instructions for running commands in both files.
The Dockerfile is a recipe for building a new image with e.g. docker build, while docker-compose is used to orchestrate starting (multiple) containers.
The RUN directive in a Dockerfile is executed during the build phase of an image, and its result gets committed to the image. The command attribute in a docker-compose.yml corresponds to the CMD directive in a Dockerfile and to the optional command parameter of docker run; it specifies the command to be executed when starting a new container based on the given image.
See also: Difference between RUN and CMD in a Dockerfile
Imagine you wanted to send a copy of your application to a colleague.  They don't need the entire source tree, since they're just trying to run the application.  You can imagine running some commands to build and install the application, creating a tar file of the result, and sending that tar file to them.  Those "build and install" commands should be Dockerfile RUN commands.
I'm confused about [...] why there seem to be options to put instructions for running commands in both [Dockerfile and docker-compose.yml].
By way of a very typical example, imagine you have a Python Django application that also happens to be able to run some tasks in the background using the Celery framework. Both the main Web server and the background worker have the exact same source code; it's just a question of what command you're running when you launch the container.
In this case, your Dockerfile would declare some useful default CMD, say to run the Django application:
FROM python:3.10
...
CMD ./manage.py runserver 0.0.0.0:8000
In your docker-compose.yml file you can run this image twice, but for the second one, you'd override the command:.
version: '3.8'
services:
  app:
    build: .
    ports: ['8000:8000']
    # using the default Dockerfile CMD
  worker:
    build: .
    command: celery -A proj worker
Particularly if there's only a single command your application wants to run, prefer declaring it in the Dockerfile.  I'd guess 90%+ of cases don't need a Compose command:.  There's no reason to repeat an identical command in the Compose file.
I'd avoid writing an extended multi-line command in either CMD or command:.  Instead, write a shell script that does all of the things you need to do, COPY the script into the image, and set that script as your CMD.  (There is also a pattern I like of using an ENTRYPOINT to do setup tasks and ending with exec "$@" to run the CMD.)
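As a rough illustration of the ENTRYPOINT pattern mentioned above, here is a minimal sketch of such a wrapper script. The file name entrypoint.sh and the APP_ENV variable are assumptions made up for this example, not anything from the question; the setup step is a placeholder for whatever your application actually needs.

```shell
#!/bin/sh
# entrypoint.sh (hypothetical name): do first-time setup, then hand off to the CMD.
set -e

# Placeholder setup task (an assumption for this sketch): give APP_ENV a
# default value if the container was started without one.
: "${APP_ENV:=development}"
export APP_ENV
echo "entrypoint: APP_ENV=$APP_ENV" >&2

# Replace this shell with whatever command was passed as arguments, i.e.
# the image's CMD or a Compose command: override.
exec "$@"
```

In the Dockerfile you would COPY this script in and declare it with the JSON-array form, e.g. ENTRYPOINT ["./entrypoint.sh"], keeping the existing CMD. The array form matters: with the shell form of ENTRYPOINT, the CMD is not passed to the script as arguments. Because the script ends in exec "$@", the app and worker services from the Compose example would both get the same setup and then run their own command.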
... a Dockerfile should contain [...] commands which for instance migrate a database ...
This actually doesn't work. Remember the hypothetical I started this answer with: if part of the build instructions runs database migrations, and then you send just a tar file of the filesystem to your colleague, their database won't have run the migration. Also, for a couple of reasons, if your database is itself in a container, the image build sequence won't be able to contact the database (if it's running at all).
There's much more discussion on this topic in, for example, How do you perform Django database migrations when using Docker-Compose?.  Note that the answers there have multi-line Compose command:; as discussed above, I'd probably write this as a script and make the script be the image's CMD instead.