
How can I use an external python library in AWS Glue?

First Stack Overflow question here. I hope I'm doing this correctly:

I need to use an external Python library in AWS Glue. The library is called openpyxl.

I followed these directions: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-libraries.html

However, after saving my zip file in the correct S3 location and pointing my Glue job to that location, I'm not sure what to actually write in the script.

I tried the usual import openpyxl, but that just returns the following error:

ImportError: No module named openpyxl

Obviously I don't know what to do here - also relatively new to programming so I'm not sure if this is a noob question or what. Thanks in advance!

Marlon Holland asked Dec 14 '22 10:12



1 Answer

It depends on whether the job is a Spark job or a Python Shell job. For Spark, you just need to zip the library and point the job to the zip's S3 path, and the job will be able to import it. Just make sure that the package folder inside the zip contains an __init__.py file.

For example, for the library you are trying to import, you can download openpyxl-3.0.0.tar.gz from https://pypi.org/project/openpyxl/#files, extract it, zip the openpyxl folder found inside, and store that zip in S3.
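
As a rough sketch of the zipping step, the helper below packs a package folder so that the folder itself (for example openpyxl/) sits at the root of the archive and refuses to continue if __init__.py is missing. The function name zip_package and the paths in the usage note are hypothetical, chosen only for illustration; only the standard library is used.

```python
import zipfile
from pathlib import Path


def zip_package(package_dir, zip_path):
    """Zip a package folder so the folder itself sits at the archive root."""
    package_dir = Path(package_dir).resolve()
    if not (package_dir / "__init__.py").is_file():
        raise ValueError(f"{package_dir} has no __init__.py, so it is not an importable package")

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(package_dir.rglob("*")):
            # Entry names are relative to the parent folder, e.g. "openpyxl/cell/cell.py".
            relative = path.relative_to(package_dir.parent)
            if path.is_file() and "__pycache__" not in relative.parts:
                archive.write(path, relative.as_posix())
    return Path(zip_path)
```

Assuming the tarball was extracted into the current directory, a call such as zip_package("openpyxl-3.0.0/openpyxl", "openpyxl.zip") would produce the file to upload to S3. Write the zip outside the package folder so it does not end up inside itself.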


On the other hand, if it is a Python Shell job, a zip file will not work. You will need to create an egg file from the library. If you are using openpyxl-3.0.0, you can download it from the same page, extract everything, and run the command python setup.py bdist_egg (use python3 instead of python if you are on Python 3).

This will generate an egg file inside a dist folder, which the command also creates. You just need to put that egg file in S3 and point the Glue job's Python library path to it.
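
Once the library path is set, the script itself needs nothing beyond a normal import openpyxl. If the ImportError persists, a small check like the one below, placed at the top of the job script, may help show whether the library is visible to the interpreter. This is only a diagnostic sketch: it assumes a Python 3 job (importlib.util.find_spec is not available in Python 2), and the helper name is_importable is made up for this example.

```python
import importlib.util
import sys


def is_importable(name):
    """Return True if a top-level module or package called `name` can be found."""
    return importlib.util.find_spec(name) is not None


if is_importable("openpyxl"):
    print("openpyxl was found and can be imported")
else:
    # Listing sys.path shows whether the zip or egg from S3 was actually added.
    print("openpyxl was not found; the interpreter searched:")
    for entry in sys.path:
        print("  ", entry)
```

If the zip or egg does not appear anywhere in the printed search path, the job is most likely not pointing at the right S3 object.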

If you already have the library but for some reason don't have its setup.py, you must create one in order to run the command that generates the egg file. See http://www.blog.pythonlibrary.org/2012/07/12/python-101-easy_install-or-how-to-create-eggs/ for an example.
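
For illustration, a bare-bones setup.py can be as short as a single setuptools.setup call. The sketch below writes such a file next to the package folder. The helper write_minimal_setup_py and the name and version values passed to it are placeholders, not something taken from the linked article, and a real project may need more metadata.

```python
from pathlib import Path

# Contents of the generated setup.py; setup() and find_packages() come from setuptools.
SETUP_TEMPLATE = '''\
from setuptools import setup, find_packages

setup(
    name={name!r},
    version={version!r},
    packages=find_packages(),
)
'''


def write_minimal_setup_py(project_dir, name, version):
    """Create a minimal setup.py in project_dir, refusing to overwrite an existing one."""
    target = Path(project_dir) / "setup.py"
    if target.exists():
        raise FileExistsError(f"{target} already exists; not overwriting it")
    target.write_text(SETUP_TEMPLATE.format(name=name, version=version))
    return target
```

After calling, say, write_minimal_setup_py(".", "mylib", "0.1.0") in the folder that contains the package directory, running python setup.py bdist_egg there should produce the egg under dist, provided setuptools is installed.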

Pedro Pimenta answered Dec 30 '22 06:12
