
Discrepancy between two hosts running the same docker commands

A colleague and I have a big Docker puzzle.

When we run the following commands we get different results.

docker run -it python:3.8.6 /bin/bash
pip install fbprophet

For me, it installs perfectly, while for him it produces an error and fails to install. I thought the whole point of Docker was to prevent this kind of issue, so I'm really puzzled.

I'm giving more details below, but my main question is:

  • How is it possible that we get different results?

More details:

We are both running Docker on new MacBook Pros with similar specs, on macOS Catalina. His Docker Engine version (20.x.x) is slightly newer than mine (19.X.X). Also:

  • He tried all the commands he could think of to clean up things in Docker.
  • We verified that the hashes of the image IDs were the same.
  • Our resource settings were also the same.
  • He tried reinstalling Docker and changing to other versions of python (3.7).
  • We tried simultaneously on multiple occasions during the last three days.

The result was always the same: He gets the error and I don't.

The error he gets is the following.

Error:
Installing collected packages: six, pytz, python-dateutil, pymeeus, numpy, pyparsing, pillow, pandas, korean-lunar-calendar, kiwisolver, ephem, Cython, cycler, convertdate, tqdm, setuptools-git, pystan, matplotlib, LunarCalendar, holidays, cmdstanpy, fbprophet
    Running setup.py install for fbprophet ... error
    ERROR: Command errored out with exit status 1:
     command: /usr/local/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-l516b8ts/fbprophet_80d5f400081541a2bf6ee26d2785e363/setup.py'"'"'; __file__='"'"'/tmp/pip-install-l516b8ts/fbprophet_80d5f400081541a2bf6ee26d2785e363/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-7n8tvfkb/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.8/fbprophet
         cwd: /tmp/pip-install-l516b8ts/fbprophet_80d5f400081541a2bf6ee26d2785e363/
    Complete output (10 lines):
    running install
    running build
    running build_py
    creating build
    creating build/lib
    creating build/lib/fbprophet
    creating build/lib/fbprophet/stan_model
    Importing plotly failed. Interactive plots will not work.
    INFO:pystan:COMPILING THE C++ CODE FOR MODEL anon_model_dfdaf2b8ece8a02eb11f050ec701c0ec NOW.
    error: command 'gcc' failed with exit status 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: /usr/local/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-l516b8ts/fbprophet_80d5f400081541a2bf6ee26d2785e363/setup.py'"'"'; __file__='"'"'/tmp/pip-install-l516b8ts/fbprophet_80d5f400081541a2bf6ee26d2785e363/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-7n8tvfkb/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.8/fbprophet Check the logs for full command output.

Note that running the two commands I provided always produces some errors, but they are not critical: upgrading setuptools and installing the dependencies before fbprophet resolves them. The error shown above is different. It is related to gcc and only happens to some people.

Optional additional questions:

  • How do we fix it?
  • How do we prevent non-reproducible results like this one?
  • Can upgrading the docker engine version break a container?
German Capuano asked Dec 19 '20 03:12



1 Answer

How do we fix it?

Your error reports a GCC compilation problem.
A quick search turns up mostly problems related to Python / GCC versions (one, two, three).
But you are right, that doesn't look like something that could differ inside the same container image.

What it does look like is an out-of-memory (OOM) problem.

Also, is this a VM? Stan requires a significant amount of memory to compile the models, and this error can occur if you run out of RAM while it is compiling.

I did a bit of testing.
On my machine the compilation process consumed up to 2.4 GB of RAM.

cat /etc/redhat-release
CentOS Linux release 7.9.2009 (Core)

uname -r
3.10.0-1160.6.1.el7.x86_64

docker --version
Docker version 20.10.1, build 831ebea

# works fine
docker run --rm -it -m 3G python:3.8.6 /bin/bash

# fails with error: command 'gcc' failed with exit status 1
# actually it was killed by the OOM killer
docker run --rm -it -m 2G python:3.8.6 /bin/bash

# yes, here it is in the kernel log
tail -f /var/log/messages | grep -i 'killed process'
Dec 22 08:34:09 cent7-1 kernel: Killed process 5631 (cc1plus), UID 0, total-vm:2073600kB, anon-rss:1962404kB, file-rss:15332kB, shmem-rss:0kB
Dec 22 08:35:56 cent7-1 kernel: Killed process 5640 (cc1plus), UID 0, total-vm:2056816kB, anon-rss:1947392kB, file-rss:15308kB, shmem-rss:0kB

Check the OOM killer log on the problematic machine.
Is there enough RAM available to Docker?
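
To make the check concrete, here is a minimal sketch that summarizes an OOM killer line like the ones in the log above. The sample line is copied from that log; `oom_summary` is a hypothetical helper name, and the line format is assumed to match the CentOS 7 kernel message shown (other kernels may word it differently). On Docker Desktop for Mac the kernel log lives inside Docker's Linux VM rather than in the host's /var/log/messages, so where to look will likely differ.

```shell
#!/bin/sh
# Sample line copied from the kernel log shown above.
line='Dec 22 08:34:09 cent7-1 kernel: Killed process 5631 (cc1plus), UID 0, total-vm:2073600kB, anon-rss:1962404kB, file-rss:15332kB, shmem-rss:0kB'

# oom_summary (hypothetical helper): print "<process> <anon-rss in MiB> MiB"
# for a "Killed process" line; prints nothing if the line does not match.
oom_summary() {
  printf '%s\n' "$1" |
    sed -n 's/.*Killed process [0-9]* (\([^)]*\)).*anon-rss:\([0-9]*\)kB.*/\1 \2/p' |
    while read -r name kib; do
      # Integer division: kB reported by the kernel -> whole MiB.
      echo "$name $(( kib / 1024 )) MiB"
    done
}

oom_summary "$line"   # prints: cc1plus 1916 MiB
```

A resident size of about 1916 MiB for cc1plus sits just under the 2G (2048 MiB) limit used in the failing run, which is consistent with the compiler being killed for memory rather than failing on its own.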


Can upgrading the docker engine version break a container?

Generally, it shouldn't.
But Docker 20.10.0 introduced a large set of changes related to memory and cgroups (cgroup v2 support, for example).

After you rule out the obvious causes (such as your friend's Docker simply not having enough RAM), you might need to dig into the Docker daemon settings related to memory, cgroups, and so on.
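
A quick way to rule out the memory explanation is to check, from inside the container, how much RAM the kernel reports. This is a minimal sketch, assuming a Linux /proc filesystem. The 3072 MiB threshold is an assumption taken from the 3G limit that worked in the test above, and `has_enough` is a hypothetical helper name. Note that a limit set with `-m` is normally not reflected in /proc/meminfo, so this only checks the total of the host (or of the Linux VM that Docker runs in).

```shell
#!/bin/sh
# Assumption: 3G (3072 MiB) was enough in the test above; 2G was not.
REQUIRED_MIB=3072

# has_enough <available_mib> <required_mib>: succeeds if available >= required.
has_enough() { [ "$1" -ge "$2" ]; }

if [ -r /proc/meminfo ]; then
  # MemTotal is reported in kB; default to 0 if the field is missing.
  total_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  total_mib=$(( ${total_kib:-0} / 1024 ))
  if has_enough "$total_mib" "$REQUIRED_MIB"; then
    echo "MemTotal ${total_mib} MiB: probably enough for the build"
  else
    echo "MemTotal ${total_mib} MiB: below ${REQUIRED_MIB} MiB, gcc may get OOM-killed"
  fi
fi
```

Running it in the same `python:3.8.6` container on both machines gives two numbers that can be compared directly.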


How can the same container produce different results on two computers?

Well, technically it's quite possible.
Containerized programs still use the host OS kernel.
Not all kernel settings are "namespaced", i.e. settable for one particular container only.
A lot of them (actually, most) are still global and can affect your program's behavior.

I don't think that is related to your problem, though.
But for complicated programs that rely on specific kernel settings, it must be taken into account.
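
To see what the two hosts actually share with the container, a small sketch like the following can be run inside the container on each machine and the outputs compared. `env_fingerprint` is a hypothetical helper name, and the cgroup check assumes the usual /sys/fs/cgroup mount point.

```shell
#!/bin/sh
# env_fingerprint (hypothetical helper): print facts that come from the host
# (or VM) kernel rather than from the image, one key=value per line.
env_fingerprint() {
  echo "kernel=$(uname -r)"
  echo "arch=$(uname -m)"
  echo "cpus=$(getconf _NPROCESSORS_ONLN)"
  # cgroup v2 exposes cgroup.controllers at the root of the unified hierarchy.
  if [ -e /sys/fs/cgroup/cgroup.controllers ]; then
    echo "cgroup=v2"
  else
    echo "cgroup=v1-or-unknown"
  fi
}

env_fingerprint
```

For example, saving the output to a file on each machine and running `diff` on the two files shows which of these values differ even though the image ID is identical.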

Olesya Bolobova answered Nov 10 '22 08:11