 

What is the best way to run Python scripts in AWS?

I have three Python scripts, 1.py, 2.py, and 3.py, each taking 3 runtime arguments.

All three Python programs are independent of each other. All three may run sequentially as a batch, or any two of them may run, depending on some configuration.

Manual approach:

  1. Create an EC2 instance, run the Python script, shut the instance down.
  2. Repeat the above step for the next Python script.

The automated way would be to trigger the above process through Lambda and replicate it using some combination of services.
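Roughly, this is what I mean by triggering it through Lambda (just a sketch; the AMI ID, instance type, and script path are placeholders):

import boto3

ec2 = boto3.client("ec2")

def lambda_handler(event, context):
    # user data runs one script with its 3 arguments, then powers the instance off
    user_data = (
        "#!/bin/bash\n"
        "python3 /opt/scripts/1.py arg1 arg2 arg3\n"
        "shutdown -h now\n"
    )
    ec2.run_instances(
        ImageId="ami-00000000",      # placeholder AMI
        InstanceType="t3.micro",
        MinCount=1,
        MaxCount=1,
        UserData=user_data,
        InstanceInitiatedShutdownBehavior="terminate",  # shutdown ends the instance
    )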

What is the best way to implement this in AWS?

Parijat Bose asked May 06 '19 13:05




1 Answer

AWS Batch has a DAG scheduler: technically you could define job1, job2, and job3 and tell AWS Batch to run them in that order, but I wouldn't recommend that route.

For the above to work you would basically need to create three Docker images (image1, image2, image3) and then put them in ECR (Docker Hub can also work if you're not using the Fargate launch type).
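If you did go that route, wiring up the dependency chain with boto3 would look roughly like this (the queue and job definition names here are placeholders; each job definition would point at one of those images):

import boto3

batch = boto3.client("batch")

# submit job1, then make job2 and job3 wait on the previous job (placeholder names)
job1 = batch.submit_job(
    jobName="job1",
    jobQueue="my-queue",
    jobDefinition="job1-def",
    containerOverrides={"command": ["python3", "1.py", "arg1", "arg2", "arg3"]},
)
job2 = batch.submit_job(
    jobName="job2",
    jobQueue="my-queue",
    jobDefinition="job2-def",
    dependsOn=[{"jobId": job1["jobId"]}],
)
batch.submit_job(
    jobName="job3",
    jobQueue="my-queue",
    jobDefinition="job3-def",
    dependsOn=[{"jobId": job2["jobId"]}],
)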

I don't think that makes sense unless each job is bulky and has its own runtime that's different from the others.

Instead, I would write a Python program that calls 1.py, 2.py, and 3.py, put that in a Docker image, and run it as an AWS Batch job or just an ECS Fargate task.

main.py:

import subprocess

# run 1.py; the exit code tells you whether it succeeded
exit_code = subprocess.call("python3 /path/to/1.py", shell=True)

# decide if you want to call 2.py and so on ...
# 1.py will see the same stdout, stderr as main.py
# with Batch and Fargate you can retrieve these from CloudWatch Logs ...

Now you have a Docker image that just needs to run somewhere. Fargate is fast to start up, a bit pricey, and has a 10 GB cap on temporary storage. AWS Batch is slow to start up on a cold start, but it can use Spot Instances in your account. You might need to build a custom AMI for AWS Batch to work, e.g. if you want more storage.
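To give a rough idea of the Fargate side, kicking off that image as a one-off task with boto3 looks something like this (the cluster, task definition, container name, and network values are all placeholders):

import boto3

ecs = boto3.client("ecs")

# run the main.py image once as a Fargate task
ecs.run_task(
    cluster="my-cluster",
    launchType="FARGATE",
    taskDefinition="main-py-task",
    count=1,
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-00000000"],
            "securityGroups": ["sg-00000000"],
            "assignPublicIp": "ENABLED",
        }
    },
    overrides={
        "containerOverrides": [
            {"name": "main", "command": ["python3", "main.py", "arg1", "arg2", "arg3"]}
        ]
    },
)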

Note: for anyone who wants to scream at shell=True: both main.py and 1.py come from the same codebase. It's a batch job, not an internet-facing API that takes that string from a user request.
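If it still bothers you, the list form avoids the shell entirely and makes passing each script's 3 runtime arguments explicit (the argument values are placeholders):

import subprocess

# no shell involved; each argument is passed to 1.py as-is
exit_code = subprocess.call(["python3", "/path/to/1.py", "arg1", "arg2", "arg3"])
if exit_code == 0:
    subprocess.call(["python3", "/path/to/2.py", "arg1", "arg2", "arg3"])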

au kk answered Oct 09 '22 04:10