I want to containerise a pipeline of code that was predominantly developed in Python but has a dependency on a model that was trained in R. There are some additional dependencies on the requirements and packages needed for both codebases. How can I create a Docker image that allows me to build a container that will run this Python and R code together?
For context, I have an R script that runs a model (random forest), but it needs to be part of a data pipeline that was built in Python. The Python pipeline first does some processing and generates the input for the model, then executes the R script with that input, and finally passes the output to the next stage of the Python pipeline.
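To make that hand-off concrete, here is a minimal sketch of one way the three stages could exchange data through temporary CSV files. It is only an illustration under assumptions: `run_model_stage` is a made-up helper, and `run_model.R` is a hypothetical script that is assumed to take an input path and an output path as its two command-line arguments.

```python
import csv
import subprocess
import tempfile
from pathlib import Path


def run_model_stage(rows, command_prefix=("Rscript", "run_model.R")):
    """Write rows to a CSV, run an external script on it, and read back its CSV output.

    Assumption: the external script accepts <input path> <output path> as
    its last two arguments. "run_model.R" is a hypothetical name.
    """
    with tempfile.TemporaryDirectory() as tmp:
        in_path = Path(tmp) / "model_input.csv"
        out_path = Path(tmp) / "model_output.csv"

        # Stage 1 (Python): persist the model input.
        with in_path.open("w", newline="") as f:
            csv.writer(f).writerows(rows)

        # Stage 2 (R): check=True raises CalledProcessError on a non-zero exit.
        subprocess.run([*command_prefix, str(in_path), str(out_path)], check=True)

        # Stage 3 (Python): load the model output for the next stage.
        with out_path.open(newline="") as f:
            return list(csv.reader(f))
```

Making the command prefix a parameter means the function can be exercised without R installed, by substituting any program that reads the first path and writes the second.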
I've created a template for this process by writing a simple test Python script, "test_call_r.py", which uses the subprocess module to call an R script. I now need to put this in a Docker container with the necessary requirements and packages for both Python and R.
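For readers who want a picture of what such a test script might contain, the following is a rough sketch rather than the actual contents of test_call_r.py. The helper name `call_r` and its `interpreter` parameter are my own assumptions.

```python
import subprocess


def call_r(script_path, *args, interpreter="Rscript"):
    """Run a script with the given interpreter and return its standard output."""
    result = subprocess.run(
        [interpreter, script_path, *args],
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the captured bytes to str
        check=True,           # raise CalledProcessError if the script fails
    )
    return result.stdout
```

With R installed, `print(call_r("myscript.R"))` would run the script and print whatever it writes to standard output. If the script exits with an error, the raised `CalledProcessError` carries the captured `stderr`.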
I have been able to build the Docker container for the Python pipeline itself, but cannot successfully install R and the associated packages alongside the Python requirements. I want to rewrite the Dockerfile to create an image to do this.
From the Docker Hub documentation, I can create an image for the Python pipeline using, e.g.,
FROM python:3
WORKDIR /app
COPY requirements.txt /app/
RUN pip install --no-cache-dir -r requirements.txt
COPY . /app
CMD [ "python", "./test_call_r.py" ]
Similarly, from Docker Hub I can use a base R image (or Rocker) to create a Docker container that can run a randomForest model, e.g.,
FROM r-base
WORKDIR /app
COPY myscripts /app/
RUN Rscript -e "install.packages('randomForest')"
CMD ["Rscript", "myscript.R"]
But what I need is to create an image that can install the requirements and packages for both Python and R, and execute the codebase to run R from a subprocess in Python. How can I do this?
The first command uses the FROM keyword. It tells Docker to build an image that inherits from the base image python:3.8-slim-buster. This is an official Python image, and it has all of the packages required to run a Python application.
Here’s how to create a new Docker image from an existing container. You’ll then be able to start another container from that image which will be populated with the filesystem from the first one. The docker commit command is used to take a container and produce a new image from it. It works with either stopped or running containers.
A Dockerfile is a text document that contains the instructions to assemble a Docker image. When we tell Docker to build our image by executing the docker build command, Docker reads these instructions, executes them, and creates a Docker image as a result. Let’s walk through the process of creating a Dockerfile for our application.
Now, run the docker images command to see a list of our local images. You can see that we have two images that start with python-docker. We know they are the same image because if you take a look at the IMAGE ID column, you can see that the values are the same for the two images. Let’s remove the tag that we just created.
The Dockerfile I built for Python and R to run together with their dependencies in this manner is:
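# Start from Ubuntu so that both interpreters can be installed with apt
FROM ubuntu:latest
ENV DEBIAN_FRONTEND=noninteractive
# python3 installs the release's default interpreter (the original pinned python3.6, which newer Ubuntu releases do not package)
RUN apt-get update && apt-get install -y --no-install-recommends build-essential r-base r-cran-randomforest python3 python3-pip python3-setuptools python3-dev
WORKDIR /app
COPY requirements.txt /app/requirements.txt
RUN pip3 install -r requirements.txt
# Install any further R packages from CRAN
RUN Rscript -e "install.packages('data.table')"
COPY . /app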
The commands to build the image, run the container (naming it SnakeR here), and execute the code are:
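# Build the image from the Dockerfile in the current directory
docker build -t my_image .
# Start a container with an interactive shell and leave it running
docker run -it --name SnakeR my_image
# From a second terminal, run the Python script inside the running container
docker exec SnakeR /bin/sh -c "python3 test_call_r.py"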
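If the script fails inside the container, a quick check that both interpreters are actually on the PATH can help narrow things down. The helpers below are hypothetical names of my own and only a sketch. They rely on the standard-library `shutil.which`.

```python
import shutil


def missing_executables(names=("python3", "Rscript")):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def require_executables(names=("python3", "Rscript")):
    """Raise a RuntimeError listing any executables that are not installed."""
    missing = missing_executables(names)
    if missing:
        raise RuntimeError("Not found on PATH: " + ", ".join(missing))
```

Calling `require_executables()` at the top of test_call_r.py would make a missing R installation fail early with a clear message, rather than as a `FileNotFoundError` from the subprocess call.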
I treated the base image like an Ubuntu OS and built the image on top of it.
This is replicated from my blog post at https://datascienceunicorn.tumblr.com/post/182297983466/building-a-docker-to-run-python-r
I made an image for my personal projects; you could use it if you want: https://github.com/dipayan90/docker-python-r