Build custom AWS Lambda layer for Scikit-image

Tags:

python-3.x

aws-lambda

scikit-learn

scikit-image

Outline: I need to use scikit-image inside some AWS lambda functions, so I'm looking to build a custom AWS lambda layer containing scikit-image.

My questions in general should apply to any python module, notably scikit-learn, or any custom layer in general I think.

Background: After much googling and reading it seems the best way to do that is to use docker to run the AWS lambda runtime locally, and then inside there install/compile scikit-image (or whichever module you're looking for). After that's done, you can upload/install it to AWS as a custom layer.

This is conceptually pretty simple, but I'm struggling a bit with the best-practice way to do this. I've got this working, but I'm not sure I'm doing it the best/right/optimal/secure way ... there are a million all-slightly-different blog posts about this, and the AWS docs themselves are (IMHO) too detailed but skip over some of the basic questions.

I've been trying to basically follow two good medium posts, here and here ...kudos to those guys.

My main questions are:

1. Where is the best place to find the latest AWS AMI docker image?

There are multiple locations/versions (even on Amazon itself) for what is supposedly the latest image, e.g. https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html or https://cdn.amazonlinux.com/os-images/2.0.20190823.1/.

This is ignoring the multitude of non-Amazon, GitHub-hosted possibilities, such as lambci/lambda:build-python3.6 from the medium posts here, or onema/amazonlinux4lambda from here.

I'd prefer to use an amazon provided docker image, for both security and up-to-date'ness.

2. Is the AWS lambda runtime here, which links to this AMI, a docker image? If so (or not), how do you download it to run it locally?
3. How do you know when you might need to rebuild a layer because Amazon has changed the AWS lambda runtime and that breaks your layer, which was built against an older runtime?
4. Is it better to build (compile, in the case of scikit-image) the pip-installed module inside the docker AMI container, or simply to tell pip to download the pre-built version and hope/trust it will get the compiled libs that are best for the AMI you're running?

Basically here I'm concerned about stability and performance. I'd like to ensure that the compiled libraries for scikit-image in this case are as optimized as possible for the AMI container.

5. Is it better to just download and use AWS's SAM to do all of this? (It looks like overkill and complicated, but it does look like it takes care of ensuring you're using the 'correct' AMI docker container all the time.)
6. Are there any (good, trustworthy) repos of pre-built lambda layers around (that might make all this a moot point)? I looked but couldn't find any.

...thanks for any advice, thoughts and comments!

asked Oct 14 '19 01:10

Richard

3 Answers

Interesting couple of days figuring this out. ...hopefully the answer below will be some help to anyone struggling to figure out how to make a custom layer (for python but also other languages).

Where is the best place to find the latest AWS AMI docker image?

The answer, as Gregas points out in another answer, is that the "right" docker image to use to build layers is lambci/lambda:build-python3.7. That is the image the SAM CLI itself pulls when it builds inside a container.

The full list for all AWS lambda runtime environments, not just python, is here

What's the best way to build your own AWS lambda layer? ...What's the best way to build a custom python module layer?

The best way I found, to date, is to use AWS's SAM in combination with some tweaks I used from a great blog here.

The tweaks are needed because (at the time I'm writing this) AWS SAM lets you define your layers, but won't actually build them for you. ...See this request from the SAM group's github.

I'm not going to try to explain this in huge detail here; instead, please check out the bryson3gps blog. He explains it well, and all the credit goes to him.

OK, a quick background on the process to use:

At present, AWS SAM won't build your layer for you.

Meaning, if you define a requirements.txt for a set of modules to install in a layer, it won't actually install/build them into a local directory ready to upload to AWS (as it does if you use it to define a lambda function).

But, if you define a layer in SAM, it will package (zip everything and upload to S3) and deploy (define it within AWS Cloud with ARN etc etc so it can be used) that layer for you.

The way to get SAM to build your layers too

The hack, at present, to "fool" SAM into also building your layer for you, from the bryson3gps blog here, is to:

1. Define a dummy AWS lambda function template in SAM. Then, for that function, make a pip requirements.txt that SAM will use during the build to load the modules you want into your layer. You won't actually use this function for anything.

This entails making a SAM template.yaml file that defines a basic function. Check out the SAM tutorial, then look at bryson3gps' blog. It's pretty easy.

2. Define an AWS layer in the same template.yaml file. Again, not too hard; check out the blog.
3. In the SAM spec for your layer definition, set ContentUri (i.e. where it looks for the files/directories to zip and upload to AWS) to the build location for the function you defined in (1).

So, when you use sam build, it will build the function for you (i.e. process requirements.txt for the function) and put the resulting function packages in a directory to later zip up and send to AWS.

But (this is the key) the layer you defined has its ContentUri pointing to the same directory that sam build created for the (dummy) function.

So then, when you tell SAM to package (send to S3) and deploy (configure with AWS) the template as a whole, it will upload/create the layer that you defined, and the layer's contents will be the packages that got built for the (dummy) function.

It works well.
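To make the shape of that template concrete, here is a rough sketch that builds it as a Python dict and prints it as JSON (CloudFormation templates can be written as JSON as well as YAML). The resource names, the `layer_src/` folder, the handler name, and the `.aws-sam/build/<function id>` build location are all assumptions for illustration; check them against the blog and your own `sam build` output.

```python
import json

# Hypothetical logical IDs, chosen only for illustration.
FUNCTION_ID = "LayerBuilderFunction"
LAYER_ID = "SkimageLayer"
RUNTIME = "python3.7"


def build_template() -> dict:
    """Return a SAM template: a dummy function plus a layer that reuses its build output."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Transform": "AWS::Serverless-2016-10-31",
        "Resources": {
            # (1) Dummy function; its requirements.txt lists the layer's modules.
            FUNCTION_ID: {
                "Type": "AWS::Serverless::Function",
                "Properties": {
                    "CodeUri": "layer_src/",  # assumed folder holding requirements.txt
                    "Handler": "app.handler",  # placeholder, never invoked
                    "Runtime": RUNTIME,
                },
            },
            # (2) + (3) Layer whose ContentUri points at the function's build directory.
            LAYER_ID: {
                "Type": "AWS::Serverless::LayerVersion",
                "Properties": {
                    # Assumed location where `sam build` writes the function's artifacts.
                    "ContentUri": f".aws-sam/build/{FUNCTION_ID}",
                    "CompatibleRuntimes": [RUNTIME],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_template(), indent=2))
```

The only part that matters for the trick is that the layer's `ContentUri` and the function's build directory are the same path.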

A couple of extra tips

1. In bryson3gps' blog, he points out that this method doesn't put the layer's packages in the correct location in the lambda AMI directory for them to be found by default (for python that is /opt/python). Instead they are placed in /opt.

His way around this is to add /opt to the sys.path in your lambda scripts prior to importing:

import sys
sys.path.append('/opt')
import <a module in your layer>
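If you go the sys.path route, a small guard keeps the entry from being appended more than once when the same module is imported repeatedly. This is only a sketch; `ensure_on_path` is a hypothetical helper name, not something from the blog.

```python
import sys

LAYER_ROOT = "/opt"  # where Lambda unpacks layer contents


def ensure_on_path(path: str = LAYER_ROOT) -> bool:
    """Append path to sys.path once; return True only if it was added."""
    if path in sys.path:
        return False
    sys.path.append(path)
    return True
```

Call `ensure_on_path()` at the top of the handler module, before importing anything that lives in the layer.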

Instead of doing that, prior to sam package uploading to S3 (after sam build), you can go into the appropriate .aws-sam/<your package subdir> directory and move everything into a new /python directory within that package directory. This results in the layer modules being placed in /opt/python correctly, instead of just /opt.

cd .aws-sam/<wherever your package is>/
mkdir .python        # hidden name, so the * below does not match it
mv * .python
mv .python python
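The same shuffle can be scripted if you'd rather not do it by hand after every build. A minimal sketch, assuming you pass in the build directory yourself; `nest_under_python` is a made-up name. Unlike the shell version, it also moves hidden files.

```python
import shutil
import tempfile
from pathlib import Path


def nest_under_python(build_dir: Path) -> Path:
    """Move every entry in build_dir into build_dir/python and return that folder."""
    target = build_dir / "python"
    # Snapshot the entries first so the new folder is never moved into itself.
    entries = [p for p in build_dir.iterdir() if p.name != "python"]
    target.mkdir(exist_ok=True)
    for entry in entries:
        shutil.move(str(entry), str(target / entry.name))
    return target


if __name__ == "__main__":
    # Demo on a throwaway directory rather than a real .aws-sam build folder.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "skimage").mkdir()
        nest_under_python(root)
        print(sorted(p.name for p in root.iterdir()))
```

Run it after `sam build` and before `sam package`, pointing it at the directory your layer's ContentUri refers to.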

2. If you're making a python layer with compiled code (e.g. scikit-image, which I'm using), make sure you use sam build -u (with the -u flag).

That makes sure the build (pip'ing requirements.txt) happens inside a docker container matching the AWS lambda runtime, and so downloads the correct libs for the runtime.

3. If you're including any modules that depend on numpy or scipy, then after sam build -u, but before package/deploy, go into the appropriate .aws-sam/<your package> directory that was built and remove the numpy and scipy modules that were installed as dependencies:

cd .aws-sam/<wherever your package is>/
rm -r numpy*
rm -r scipy*

Instead, specify the AWS-supplied numpy/scipy layer in your lambda function.

I couldn't find a way to tell SAM to run pip with --no-deps, so this has to be done manually.
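Since this cleanup has to be repeated after every build, it may be worth scripting. A rough sketch, assuming the build directory is passed in; `prune_packages` is a hypothetical name. Like the `numpy*` glob, it matches by prefix, so it would also delete an unrelated package whose name merely starts with "numpy" or "scipy".

```python
import shutil
import tempfile
from pathlib import Path
from typing import List, Tuple

PREFIXES = ("numpy", "scipy")


def prune_packages(build_dir: Path, prefixes: Tuple[str, ...] = PREFIXES) -> List[str]:
    """Delete top-level entries whose names start with any prefix; return their names."""
    removed = []
    for entry in sorted(build_dir.iterdir()):
        if entry.name.startswith(prefixes):
            if entry.is_dir():
                shutil.rmtree(entry)  # package folders, *.dist-info, *.libs
            else:
                entry.unlink()  # stray single files
            removed.append(entry.name)
    return removed


if __name__ == "__main__":
    # Demo on a throwaway directory rather than a real build folder.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("numpy", "scipy", "skimage"):
            (root / name).mkdir()
        print(prune_packages(root))
```

Check the returned list before packaging so you can see exactly what was dropped from the layer.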

answered Oct 18 '22 21:10

Richard

As of v0.50.0, the SAM CLI has direct support for building layers. You decorate your AWS::Serverless::LayerVersion resource with metadata about which runtime's build strategy to use.

MyLayer:
  Type: AWS::Serverless::LayerVersion
  Properties:
    Description: Layer description
    ContentUri: 'my_layer/'
    CompatibleRuntimes:
      - python3.8
  Metadata:
    BuildMethod: python3.8

answered Oct 18 '22 22:10

speshak

I'm not an expert at this, but I happened to have the very same set of questions on the same day. I can, however, answer questions #1 and #2. Taking them out of order:

2) An AMI is not a docker image; it's for use in an EC2 instance.

1) Here is how I got the appropriate docker image:

I installed SAM cli and executed the following commands:

sam init --runtime python3.7   # sets up the hello world example
sam build -u                   # builds the app; -u means use a container

Output from sam build -u:

Fetching lambci/lambda:build-python3.7 Docker container image

So there you go. You can either get the image from Docker Hub directly or, if you have the SAM CLI installed, run "sam build -u". Once you have the image, you don't have to follow the full SAM workflow if you don't want the overhead.
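For the no-SAM route, something along these lines should do the pip install inside that image. It is a sketch, not a tested recipe: the /var/task mount point and the python/ target folder are my assumptions, and the script only assembles the command so you can inspect it before running it.

```python
import shlex
from pathlib import Path
from typing import List

IMAGE = "lambci/lambda:build-python3.7"  # image reported by `sam build -u` above


def pip_in_container_cmd(project_dir: Path, image: str = IMAGE) -> List[str]:
    """Build a `docker run` command that pip-installs requirements.txt into ./python."""
    return [
        "docker", "run", "--rm",
        "-v", f"{project_dir.resolve()}:/var/task",  # assumed mount point
        "-w", "/var/task",
        image,
        "pip", "install", "-r", "requirements.txt", "-t", "python/",
    ]


if __name__ == "__main__":
    cmd = pip_in_container_cmd(Path("."))
    # Print for review; to execute, use subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Installing into python/ means the folder can be zipped as-is and its contents end up under /opt/python once the layer is attached.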

answered Oct 18 '22 22:10

Gregas
