Outline: I need to use scikit-image inside some AWS lambda functions, so I'm looking to build a custom AWS lambda layer containing scikit-image.
My questions should apply generally to any python module (notably scikit-learn), or really to any custom layer.
Background: After much googling and reading, it seems the best way to do this is to use docker to run the AWS lambda runtime locally, install/compile scikit-image (or whichever module you need) inside that container, and then upload/install the result to AWS as a custom layer.
This is conceptually pretty simple, but I'm struggling a bit with the best-practices way to do it. I've got it working, but I'm not sure I'm doing it the best/right/optimal/secure way ... there are a million all-slightly-different blog posts about this, and the AWS docs themselves are (IMHO) too detailed while skipping over some of the basic questions.
I've been trying to basically follow two good Medium posts, here and here ... kudos to those guys.
My main questions are:
There are multiple locations/versions (even on Amazon itself) for what is supposedly the latest image, e.g. https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html or https://cdn.amazonlinux.com/os-images/2.0.20190823.1/.
This is ignoring the multitude of non-Amazon github-hosted possibilities, such as lambci/lambda:build-python3.6 from the Medium posts above, or onema/amazonlinux4lambda from here.
I'd prefer to use an Amazon-provided docker image, for both security and up-to-date-ness.
Basically, my concern here is stability and performance: I'd like to ensure that the compiled libraries (for scikit-image, in this case) are as optimized as possible for the AMI container.
...thanks for any advice, thoughts and comments!
Interesting couple of days figuring this out. ... Hopefully the answer below will be of some help to anyone struggling to figure out how to make a custom layer (for python, but also for other languages).
Where is the best place to find the latest AWS AMI docker image?
The answer, as Greg points out above, for the "right" docker image to use to build layers is lambci/lambda:build-python3.7. That is the official repo for the docker images SAM uses.
The full list, covering all AWS lambda runtime environments (not just python), is here.
What's the best way to build your own AWS lambda layer? ...What's the best way to build a custom python module layer?
The best way I found, to date, is to use AWS's SAM in combination with some tweaks I used from a great blog here.
The tweaks are needed because (at the time I'm writing this) AWS SAM lets you define your layers, but won't actually build them for you. ...See this request from the SAM group's github.
I'm not going to try to explain this in huge detail here - instead please check out the bryson3gps blog. He explains it well, and all credit to him.
At present, AWS SAM won't build your layer for you.
Meaning, if you define a requirements.txt for a set of modules to install in a layer, it won't actually install/build them into a local directory ready to upload to AWS (as it does when you use it to define a lambda function).
But, if you define a layer in SAM, it will package (zip everything and upload to S3) and deploy (define it within AWS Cloud with an ARN etc so it can be used) that layer for you.
The hack, at present, to "fool" SAM into also building your layer for you, from the bryson3gps blog here, is to:
1. Define a dummy lambda function in a SAM template.yaml file, along with a requirements.txt that SAM will use during the build to load the modules you want into your layer. You won't actually use this function for anything. Check out the SAM tutorial, then look at bryson3gps' blog. It's pretty easy.
2. Define an AWS layer in the same template.yaml file. Again not too hard - check out the blog.
3. In the SAM spec for your layer definition, set ContentUri (ie where it looks for the files/directories to zip and upload to AWS) to the build location of the function you defined in (1).
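A minimal template.yaml sketch of this hack might look like the following. The resource names, the CodeUri/ContentUri paths, and the runtime version here are illustrative, not taken from the original posts:

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  # Dummy function: exists only so 'sam build' processes its requirements.txt
  DummyBuilderFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: python3.7
      CodeUri: layer_src/        # contains requirements.txt (and a stub app.py)

  # The layer itself: ContentUri points at the dummy function's build output
  SciKitImageLayer:
    Type: AWS::Serverless::LayerVersion
    Properties:
      ContentUri: .aws-sam/build/DummyBuilderFunction
      CompatibleRuntimes:
        - python3.7
```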
So, when you use sam build, it will build the function for you (ie process requirements.txt for the function) and put the resulting function package in a directory to later zip up and send to AWS.
But (and this is the key) the layer you defined has its ContentUri pointing to the same directory sam build used for the (dummy) function.
So then, when you tell SAM to package (send to S3) and deploy (configure with AWS) the template as a whole, it will upload/create the layer you defined, and its contents will be exactly what got built for the (dummy) function.
It works well.
1. In bryson3gps' blog, he points out that this method doesn't put the layer package in the correct location in the lambda AMI directory for it to be found by default (for python, that is /opt/python); instead it is placed in /opt.
His way around this is to add /opt to sys.path in your lambda scripts prior to importing:
import sys
sys.path.append('/opt')
import <a module in your layer>
Instead of doing that, prior to sam package uploading to S3 (after sam build), you can go into the appropriate .aws-sam/<your package subdir> directory and move everything into a new python/ directory within that package directory. This results in the layer modules being placed correctly in /opt/python, instead of just /opt.
cd .aws-sam/<wherever your package is>/
mkdir .python
mv * .python
mv .python python
(The hidden .python name is used as an intermediate so that mv * doesn't try to move the new directory into itself.)
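As a quick local sanity check of that repackaging step (the directory and file names below are made up for illustration), the same sequence can be simulated like this:

```shell
# Simulate the build output with a stub module file
mkdir -p demo-package
touch demo-package/skimage_stub.py

cd demo-package
mkdir .python       # hidden name, so the glob below doesn't match it
mv * .python        # move the "built" modules aside
mv .python python   # rename to python/, which Lambda unpacks to /opt/python
cd ..

ls demo-package/python/   # -> skimage_stub.py
```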
2. If you're making a python layer with compiled code (eg scikit-image, which I'm using), make sure you use sam build -u (with the -u, ie --use-container, flag).
That will make sure the build (pip-installing requirements.txt) happens inside a docker container matching the AWS lambda runtime, and so will download the correct libraries for that runtime.
3. If you're including any modules that depend on numpy or scipy, then after sam build -u, but before package/deploy, make sure you go into the appropriate .aws-sam/<your package> directory that was built and remove the numpy and scipy packages that pip installed as dependencies:
cd .aws-sam/<wherever your package is>/
rm -r numpy*
rm -r scipy*
Instead, you should specify the AWS-supplied numpy/scipy layer in your lambda function. I couldn't find a way to tell SAM to run pip with --no-deps, so this has to be done manually.
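For reference, attaching both your custom layer and the AWS-published SciPy layer to a function looks roughly like this in the template. The logical name MySciKitImageLayer is illustrative, and the AWS SciPy layer's ARN is region- and version-specific, so look up the current one for your region in the Lambda console:

```yaml
MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Handler: app.handler
    Runtime: python3.7
    Layers:
      - !Ref MySciKitImageLayer   # your custom layer, defined in the same template
      - arn:aws:lambda:<region>:<account>:layer:AWSLambda-Python37-SciPy1x:<version>
```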
As of v0.50.0, the sam cli has direct support for building layers. You decorate your AWS::Serverless::LayerVersion resource with metadata about which runtime strategy to use:
MyLayer:
  Type: AWS::Serverless::LayerVersion
  Properties:
    Description: Layer description
    ContentUri: 'my_layer/'
    CompatibleRuntimes:
      - python3.8
  Metadata:
    BuildMethod: python3.8
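With BuildMethod set, sam build installs the layer's dependencies itself; it expects a requirements.txt inside the ContentUri directory (my_layer/ in the template above). A minimal setup, with scikit-image as the example package:

```shell
# Create the layer source directory that ContentUri points at
mkdir -p my_layer
printf 'scikit-image\n' > my_layer/requirements.txt

cat my_layer/requirements.txt   # -> scikit-image
# sam build --use-container     # would then pip-install into the layer's python/ dir
```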
I'm not an expert at this, but I happened to have the very same set of questions on the same day. However, I can answer questions #1 and #2. Taking them out of order:
2) An AMI is not a docker image; it's for use in an EC2 instance.
1) Here is how I got the appropriate docker image:
I installed SAM cli and executed the following commands:
sam init --runtime python3.7 (sets up hello world example)
sam build -u (builds app, -u means use a container)
Output from sam build -u:
Fetching lambci/lambda:build-python3.7 Docker container image
So there you go. You can either get the image from Docker Hub directly or, if you have the SAM cli installed, execute "sam build -u". Now that you have the image, you don't have to follow the full SAM workflow if you don't want the overhead.