 

conda environment to AWS Lambda

I would like to set up a Python function I've written on AWS Lambda, a function that depends on a bunch of Python libraries I have already collected in a conda environment.

To set this up on Lambda, I'm supposed to zip this environment up, but the Lambda docs only give instructions for how to do this using pip/VirtualEnv. Does anyone have experience with this?

asked Sep 28 '16 by RoyalTS



2 Answers

You should use the serverless framework in combination with the serverless-python-requirements plugin. You just need a requirements.txt, and the plugin automatically packages your code and the dependencies into a zip file, uploads everything to S3, and deploys your function. Bonus: because it can do the packaging step inside Docker, it also helps with packages that need binary dependencies.

Have a look here (https://serverless.com/blog/serverless-python-packaging/) for a how-to.
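As a rough sketch (assuming the plugin's standard workflow; the exact steps are spelled out in the linked post), the setup looks roughly like this:

    # Sketch of the serverless + serverless-python-requirements workflow
    npm install -g serverless
    npm install --save serverless-python-requirements   # then list it under "plugins:" in serverless.yml
    # setting custom.pythonRequirements.dockerizePip: true in serverless.yml builds the
    # requirements inside a Lambda-like Docker image, so binary dependencies work too
    serverless deploy   # packages requirements.txt + your code, uploads to S3, deploys the function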

From experience, I strongly recommend you look into that. Every bit of manual labour spent on deployment and the like is something that keeps you from developing your logic.

Edit 2017-12-17:

Your comment makes sense, @eelco-hoogendoorn.

However, in my mind a conda environment is just an encapsulated place where a bunch of Python packages live. So, if you put all these dependencies (from your conda env) into a requirements.txt (and use serverless + plugin), that would solve your problem, no?
IMHO it would essentially be the same as zipping all the packages you installed in your env into your deployment package. That being said, here is a snippet that does essentially this:

conda env export --name Name_of_your_Conda_env | yq -r '.dependencies[] | .. | select(type == "string")' | sed -E "s/(^[^=]*)(=+)([0-9.]+)(=.*|$)/\1==\3/" > requirements.txt

Unfortunately, conda env export only exports the environment in YAML format. The --json flag doesn't work right now but is supposed to be fixed in the next release. That is why I had to use yq instead of jq. You can install yq using pip install yq; it is just a wrapper around jq that allows it to also work with YAML files.
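To illustrate what the sed step does, here is a single, hypothetical conda-style pin being rewritten into pip syntax (the package name and version are just placeholders):

    # conda pins look like "name=version=build"; pip wants "name==version"
    echo "numpy=1.11.3=py36_0" | sed -E "s/(^[^=]*)(=+)([0-9.]+)(=.*|$)/\1==\3/"
    # prints: numpy==1.11.3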

KEEP IN MIND

A Lambda deployment package can only be 50 MB (zipped), so your environment shouldn't be too big.

I have not tried deploying a Lambda with serverless + serverless-python-requirements and a requirements.txt created like that, and I don't know if it will work.

answered Oct 20 '22 by DrEigelb


The main reason why I use conda is the option not to compile different binary packages myself (like numpy, matplotlib, pyqt, etc.), or to compile them less frequently. When you do need to compile something yourself for a specific version of Python (like uwsgi), you should compile the binaries with the same GCC version that the Python in your conda environment was compiled with. Most probably it is not the same GCC your OS is using, since conda now ships recent GCC versions that should be installed with conda install gxx_linux-64.
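For example, something along these lines installs the conda toolchain and shows which compiler the environment then exposes (a sketch, assuming a typical setup where conda's compiler packages export $CC/$CXX on activation; the environment name is a placeholder):

    # install conda's GCC/G++ toolchain into the environment
    conda install -n Name_of_your_Conda_env gxx_linux-64
    # re-activate so the toolchain's activation scripts can export $CC / $CXX
    source activate Name_of_your_Conda_env
    echo "$CC" && "$CC" --version   # should point at the conda compiler, not /usr/bin/gcc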

This leads us to two situations:

  1. All your dependencies are pure Python, and you can simply save a list of them using pip freeze and bundle them as described for virtualenv (see the sketch after this list).

  2. You have some binary extensions. In that case, the binaries from your conda environment will not work with the Python used by AWS Lambda. Unfortunately, you will need to visit the page describing the execution environment (AMI: amzn-ami-hvm-2017.03.1.20170812-x86_64-gp2), set up the environment, build the binaries for the specific version of the built-in Python in a separate directory (as well as the pure Python packages), and then bundle them into a zip archive.
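For case 1, a minimal sketch of that virtualenv-style packaging (the file and handler names are placeholders, not anything prescribed by Lambda):

    # pin the pure-Python dependencies and install them into a build directory
    pip freeze > requirements.txt
    pip install -r requirements.txt -t build/
    cp lambda_function.py build/            # your handler module
    cd build && zip -r ../deployment.zip .  # the zip contents must sit at the archive root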

This is a general answer to your question, but the main idea is that you cannot reuse your binary packages, only a list of them.

answered Oct 20 '22 by newtover