Packaging Python dependencies in subdirectory for AWS Lambda

I came across an article on serverlesscode.com about building Python 3 apps for AWS Lambda that recommends using pip (or pip3) to install dependencies in a /vendored subdirectory. I like this idea as it keeps the file structure clean, but I'm having some issues achieving it.

I'm using the Serverless Framework, and my code imports modules in the normal way, e.g. from pynamodb.models import Model

I've used the command pip install -t vendored/ -r requirements.txt to install my various dependencies (per requirements.txt) in the subdirectory, which seems to work as expected - I can see all modules installed in the subdirectory.

When the function is called, however, I get the error Unable to import module 'handler': No module named 'pynamodb' (where pynamodb is one of the installed modules).

I can resolve this error by installing into the project root instead of the /vendored folder (pip install -t ./ -r requirements.txt). This installs exactly the same files.

There must be a configuration that I'm missing that points to the subfolder, but Googling hasn't revealed whether I need to import my modules in a different way, or if there is some other global config I need to change.

To summarise: how can I use pip to install my dependencies in a subfolder within my project and still have them importable at runtime?

Edit: noting tkwargs' good suggestion on the use of the serverless plugin for packaging, it would still be good to understand how this might be done without venv, for example. The primary purpose is not specifically to make packaging easier (it's pretty easy as-is with pip), but to keep my file structure cleaner by avoiding additional folders in the root.

Harry asked Apr 15 '18 08:04


1 Answer

I've seen some people use the sys module in their Lambda function's code to add the subdirectory (vendored in this case) to the Python path. I'm not a fan of that solution because it has to be repeated in every single Lambda function and adds extra boilerplate code. The solution I ended up using is to modify the PYTHONPATH runtime environment variable to include my subdirectories. For example, in my serverless.yml I have:

provider:
  environment:
    PYTHONPATH: '/var/task/vendored:/var/runtime'

Lambda extracts the deployment package to /var/task, so the vendored folder ends up at /var/task/vendored. Setting the environment variable at the provider level applies it to every Lambda function deployed from your serverless.yml. You could also specify it per function if for some reason you didn't want it applied to all of them.
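If you want to check locally what a PYTHONPATH value does before deploying, something like the following rough sketch may help. It starts a fresh Python interpreter with PYTHONPATH set and prints the resulting sys.path. The helper name child_sys_path is made up for this illustration, and it only imitates the effect of the environment variable on a local interpreter, not the Lambda runtime itself.

```python
import json
import os
import subprocess
import sys


def child_sys_path(pythonpath):
    """Start a fresh interpreter with PYTHONPATH set and return its sys.path."""
    # Copy the current environment and override only PYTHONPATH.
    env = dict(os.environ, PYTHONPATH=pythonpath)
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Same entries as in the serverless.yml above (os.pathsep is ':' on Linux).
    value = os.pathsep.join(["/var/task/vendored", "/var/runtime"])
    for entry in child_sys_path(value):
        print(entry)
```

The PYTHONPATH entries should appear near the front of the printed list, ahead of the interpreter's default locations, which is why modules under vendored become importable without any change to the handler code.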

I wasn't sure how to reference the existing value of PYTHONPATH from serverless.yml, so that my custom path "/var/task/vendored" is added to it rather than overwriting it. I'd love to know if anyone else has worked that out.
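For comparison, the sys-module approach mentioned at the start of this answer could look roughly like the sketch below. The helper name add_vendored_to_path is hypothetical, and it assumes the dependencies were installed with pip install -t vendored/ -r requirements.txt so that the vendored folder sits next to handler.py.

```python
import os
import sys


def add_vendored_to_path(base_dir, subdir="vendored"):
    """Prepend base_dir/subdir to sys.path (once) and return the path used."""
    vendored = os.path.join(base_dir, subdir)
    if vendored not in sys.path:
        # Put it first so the vendored copies win over anything else on the path.
        sys.path.insert(0, vendored)
    return vendored


# In handler.py this would have to run before any third-party import, e.g.:
#   add_vendored_to_path(os.path.dirname(os.path.abspath(__file__)))
#   from pynamodb.models import Model
```

This is the boilerplate that would need repeating at the top of every handler module, which is the drawback described above.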

kinzleb answered Oct 22 '22 17:10