I am trying to create a Docker image. The Dockerfile is the following:
# Use the official Python 3.6.5 image
FROM python:3.6.5-alpine3.7
# Set the working directory to /app
WORKDIR /app
# Get the
COPY requirements.txt /app
RUN pip3 install --no-cache-dir -r requirements.txt
# Configuring access to Jupyter
RUN mkdir /notebooks
RUN jupyter notebook --no-browser --ip 0.0.0.0 --port 8888 /notebooks
The requirements.txt file is:
jupyter
numpy==1.14.3
pandas==0.23.0rc2
scipy==1.0.1
scikit-learn==0.19.1
pillow==5.1.1
matplotlib==2.2.2
seaborn==0.8.1
Running the command docker build -t standard .
gives me an error when Docker is trying to install pandas.
The error is the following:
Collecting pandas==0.23.0rc2 (from -r requirements.txt (line 3))
Downloading https://files.pythonhosted.org/packages/46/5c/a883712dad8484ef907a2f42992b122acf2bcecbb5c2aa751d1033908502/pandas-0.23.0rc2.tar.gz (12.5MB)
Complete output from command python setup.py egg_info:
/bin/sh: svnversion: not found
/bin/sh: svnversion: not found
non-existing path in 'numpy/distutils': 'site.cfg'
Could not locate executable gfortran
... (loads of other stuff)
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-xb6f6a5o/pandas/
The command '/bin/sh -c pip3 install --no-cache-dir -r requirements.txt' returned a non-zero code: 1
When I try to install a lower version of pandas==0.22.0, I get this error:
Step 5/7 : RUN pip3 install --no-cache-dir -r requirements.txt
---> Running in 5810ea896689
Collecting jupyter (from -r requirements.txt (line 1))
Downloading https://files.pythonhosted.org/packages/83/df/0f5dd132200728a86190397e1ea87cd76244e42d39ec5e88efd25b2abd7e/jupyter-1.0.0-py2.py3-none-any.whl
Collecting numpy==1.14.3 (from -r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/b0/2b/497c2bb7c660b2606d4a96e2035e92554429e139c6c71cdff67af66b58d2/numpy-1.14.3.zip (4.9MB)
Collecting pandas==0.22.0 (from -r requirements.txt (line 3))
Downloading https://files.pythonhosted.org/packages/08/01/803834bc8a4e708aedebb133095a88a4dad9f45bbaf5ad777d2bea543c7e/pandas-0.22.0.tar.gz (11.3MB)
Could not find a version that satisfies the requirement Cython (from versions: )
No matching distribution found for Cython
The command '/bin/sh -c pip3 install --no-cache-dir -r requirements.txt' returned a non-zero code: 1
I also tried to install Cython and setuptools before pandas, but it gave the same No matching distribution found for Cython
error at the pip3 install pandas line.
How can I get pandas installed?
I realize this question has been answered, but I recently had a similar issue with numpy and pandas dependencies in a dockerized project, so I hope this will be of benefit to someone in the future.
My solution:
As pointed out by Aviv Sela, Alpine does not contain build tools by default, so they need to be added through the Dockerfile. Below is my Dockerfile with the build packages required for numpy and pandas to be successfully installed on Alpine.
FROM python:3.6-alpine3.7
RUN apk add --no-cache --update \
python3 python3-dev gcc \
gfortran musl-dev g++ \
libffi-dev openssl-dev \
libxml2 libxml2-dev \
libxslt libxslt-dev \
libjpeg-turbo-dev zlib-dev
RUN pip install --upgrade pip
ADD requirements.txt .
RUN pip install -r requirements.txt
The requirements.txt file:
numpy==1.17.1
pandas==0.25.1
EDIT:
Add the following line to the Dockerfile, before the RUN command that upgrades pip. It is critical to the successful installation of pandas, as pointed out by Bishwas Mishra in a comment.
RUN pip install --upgrade cython
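Putting the edit together with the Dockerfile above gives roughly the following. This is only a sketch assembled from the lines in this answer. It assumes the Cython step goes directly before the pip upgrade, and it has not been re-tested against the current package indexes.

```dockerfile
FROM python:3.6-alpine3.7

# Build tools and headers that numpy and pandas need to compile on Alpine
RUN apk add --no-cache --update \
    python3 python3-dev gcc \
    gfortran musl-dev g++ \
    libffi-dev openssl-dev \
    libxml2 libxml2-dev \
    libxslt libxslt-dev \
    libjpeg-turbo-dev zlib-dev

# Install Cython first so the pandas source build can find it
RUN pip install --upgrade cython
RUN pip install --upgrade pip

ADD requirements.txt .
RUN pip install -r requirements.txt
```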
Alpine doesn't contain build tools by default. Install the build tools and create a symbolic link for the locale header:
$ apk add --update curl gcc g++
$ ln -s /usr/include/locale.h /usr/include/xlocale.h
$ pip install numpy
Based on https://wired-world.com/?p=100
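In a Dockerfile, the same steps might look like the sketch below. The base image is the one from the question, and combining the commands into a single RUN is an assumption, not something taken from the linked post.

```dockerfile
FROM python:3.6.5-alpine3.7

# Alpine ships without a compiler, so add one before building numpy.
# The symlink makes the existing locale.h header available under the
# xlocale.h name as well.
RUN apk add --update curl gcc g++ \
    && ln -s /usr/include/locale.h /usr/include/xlocale.h \
    && pip install numpy
```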
Using a new version of Python that pandas does not yet support will cause problems.
I found that it does not work with a development version of Python:
FROM python:3.9.0a6-buster
RUN apt-get update && \
apt-get -y install python3-pandas
COPY requirements.txt ./
RUN pip3 install --no-cache-dir -r requirements.txt
requirements.txt:
numpy==1.18
pandas
I found it DOES work with an officially released version of Python:
FROM python:3.8-buster
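For reference, the full working variant would presumably be the same Dockerfile with only the base image swapped. The sketch below assumes the remaining lines stay as in the failing example and that the -r argument is requirements.txt. Only the FROM line is confirmed by the test described here.

```dockerfile
# Released Python version instead of the 3.9 alpha
FROM python:3.8-buster

RUN apt-get update && \
    apt-get -y install python3-pandas

COPY requirements.txt ./
RUN pip3 install --no-cache-dir -r requirements.txt
```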
You're probably going to be better off building from a pandas image instead of base Python. This will make iteration much faster and easier, because you won't ever have to reinstall pandas. I like amancevice/pandas ( https://hub.docker.com/r/amancevice/pandas/tags ). There are Alpine and Debian images available for every pandas tag, although I think they may all be Python 3.7 now.
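A rough sketch of what building on such an image could look like is below. PANDAS_TAG is a hypothetical build argument introduced only for illustration, so pick a real tag from the tags page linked above. The sketch also assumes the image provides pip3, which I have not verified for every tag.

```dockerfile
# PANDAS_TAG is a hypothetical build argument; pass a real tag, e.g.
#   docker build --build-arg PANDAS_TAG=<tag> -t standard .
ARG PANDAS_TAG
FROM amancevice/pandas:${PANDAS_TAG}

WORKDIR /app

# pandas is already in the base image, so only the remaining
# requirements need to be installed here.
COPY requirements.txt /app
RUN pip3 install --no-cache-dir -r requirements.txt
```

If requirements.txt pins a pandas version different from the one in the base image, pip will try to install that version anyway, so it is probably simplest to leave pandas out of the file or match the pin to the tag.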