
Is there any way to trigger an AWS Lambda function at the end of an AWS Glue job?

Currently I'm using an AWS Glue job to load data into Redshift, but after that load I need to run some data cleansing tasks, probably using an AWS Lambda function. Is there any way to trigger a Lambda function at the end of a Glue job? Lambda functions can be triggered by SNS messages, but I couldn't find a way to publish an SNS message at the end of the Glue job.

dd. asked Feb 28 '18 16:02




3 Answers

@oreoluwa is right: this can be done using CloudWatch Events.

From the CloudWatch dashboard:

  • Click on 'Rules' from the left menu
  • For 'Event Source', choose 'Event Pattern' and in 'Service Name' choose 'Glue'
  • For 'Event Type' choose 'Glue Job State Change'
  • On the right side of the page, in the 'Targets' section, click 'Add Target' -> 'Lambda Function' and then choose your function.
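The console steps above can also be scripted. The following is a minimal sketch, assuming boto3 is installed and credentials are configured. The rule name and function ARN passed in are hypothetical placeholders, and `build_glue_event_pattern` and `create_rule` are helper names made up for this example. It uses the boto3 calls `put_rule`, `put_targets`, and `add_permission`.

```python
import json


def build_glue_event_pattern(job_names=None, states=("SUCCEEDED",)):
    """Build an event pattern that matches Glue job state changes."""
    detail = {"state": list(states)}
    if job_names:
        # Restrict the rule to specific jobs; omit to match every job.
        detail["jobName"] = list(job_names)
    return {
        "source": ["aws.glue"],
        "detail-type": ["Glue Job State Change"],
        "detail": detail,
    }


def create_rule(events_client, lambda_client, rule_name, function_arn, pattern):
    """Create the rule, attach the Lambda target, and allow the rule to invoke it."""
    rule = events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(pattern),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "glue-job-finished", "Arn": function_arn}],
    )
    # Without this resource-based permission the rule cannot call the function.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule["RuleArn"],
    )
    return rule["RuleArn"]
```

To use it, you would pass `boto3.client('events')` and `boto3.client('lambda')` as the two clients. The console adds the invoke permission for you, which is why it does not appear in the manual steps.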

The event you'll get in Lambda will be of the format:

{
    'version': '0',
    'id': 'a9bc90be-xx00-03e0-9bc5-a0a0a0a0a0a0',
    'detail-type': 'Glue Job State Change',
    'source': 'aws.glue',
    'account': 'xxxxxxxxxx',
    'time': '2018-05-10T16:17:03Z',
    'region': 'us-east-2',
    'resources': [],
    'detail': {
        'jobName': 'xxxx_myjobname_yyyy',
        'severity': 'INFO',
        'state': 'SUCCEEDED',
        'jobRunId': 'jr_565465465446788dfdsdf546545454654546546465454654',
        'message': 'Job run succeeded'
    }
}
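
Because the rule fires on every state change, the function usually needs to check the state before doing any work. Here is a minimal sketch of such a handler, assuming the event shape shown above. `is_successful_run` and `run_cleansing` are hypothetical helpers, and the cleansing step is only a placeholder.

```python
def is_successful_run(event, job_name=None):
    """Return True if the event reports a successful Glue job run."""
    if event.get("source") != "aws.glue":
        return False
    detail = event.get("detail", {})
    if detail.get("state") != "SUCCEEDED":
        return False
    # Optionally restrict to a single job.
    return job_name is None or detail.get("jobName") == job_name


def run_cleansing(job_name, job_run_id):
    """Hypothetical placeholder for the real data cleansing logic."""
    return f"cleansing after {job_name} ({job_run_id})"


def lambda_handler(event, context):
    if not is_successful_run(event):
        # Ignore other states and events from other sources.
        return {"skipped": True}
    detail = event["detail"]
    result = run_cleansing(detail["jobName"], detail["jobRunId"])
    return {"skipped": False, "result": result}
```

Filtering on the state in the rule's event pattern instead would avoid invoking the function for runs you do not care about.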
ace answered Nov 09 '22 19:11


Since AWS Glue supports Python, you can invoke the Lambda function directly from the job script once the ETL finishes. The sample script below shows how; the Step-N comments refer to the numbered steps listed after the script.

import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
import boto3   ## Step-2

## @params: [JOB_NAME]
args = getResolvedOptions(sys.argv, ['JOB_NAME'])

sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

## Do all ETL stuff here

## Once the ETL completes
lambda_client = boto3.client('lambda')  ## Step-3
## 'string' is a placeholder: pass the name or ARN of your Lambda function
response = lambda_client.invoke(FunctionName='string')  ## Step-4

  1. Create a Python-based Glue job (to perform the ETL on Redshift).
  2. In the job script, import boto3 (if it is not already available in your Glue environment, supply it as a script library).
  3. Create a Lambda client using boto3.
  4. Once the ETL completes, invoke the Lambda function with the client's invoke() method.
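
By default `invoke()` is synchronous, so the Glue job waits for the function to finish. If the job only needs to hand off, an asynchronous call may fit better. The sketch below assumes that; `notify_lambda` is a hypothetical helper, and the function name and payload are placeholders.

```python
import json


def notify_lambda(lambda_client, function_name, payload):
    """Invoke a Lambda function asynchronously and return the HTTP status code."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="Event",  # asynchronous: do not wait for the result
        Payload=json.dumps(payload).encode("utf-8"),
    )
    # Lambda answers an accepted asynchronous invocation with status 202.
    return response["StatusCode"]
```

In the job script this would be called after the ETL section, for example `notify_lambda(boto3.client('lambda'), 'my-cleansing-function', {'jobName': args['JOB_NAME']})`, where the function name is an assumption.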

Make sure that the IAM role assigned to the Glue job has permission to invoke the Lambda function (the lambda:InvokeFunction action).

Refer to the Boto3 documentation for the Lambda client for details.
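
As an illustration, a policy statement granting that permission could look like the one built below. This is only a sketch: `invoke_policy` is a hypothetical helper, and the account ID, region, and function name in the ARN are made up.

```python
import json


def invoke_policy(function_arn):
    """Build an IAM policy document that allows invoking one Lambda function."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "lambda:InvokeFunction",
                "Resource": function_arn,
            }
        ],
    }


# Hypothetical ARN, shown only to illustrate the expected format.
example_arn = "arn:aws:lambda:us-east-2:123456789012:function:my-cleansing-function"
print(json.dumps(invoke_policy(example_arn), indent=2))
```

The printed JSON can be attached to the Glue job's role as an inline policy.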

Kamlesh Gallani answered Nov 09 '22 19:11


No. Currently you can't trigger a Lambda function directly at the end of a Glue job, because AWS does not offer Glue as a Lambda trigger. If you look at the list of triggers available after you create a Lambda function, you will see most AWS services but not AWS Glue. So for now it is not possible, though it may be in the future.

That said, you can control the flow of Glue jobs from a Lambda function. (I did it in Python; other supported languages should work too.) In my use case, uploading an object to an S3 bucket triggered a Lambda function, which read the object and started my Glue job. Once the Glue job's status was complete, I wrote the file back to the S3 bucket linked to this Lambda function.
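
The flow described above might be sketched roughly as follows. This is not the answerer's actual code: the job name `my_glue_job` and the `--source_bucket` / `--source_key` argument names are assumptions, and the `glue_client` parameter exists only so the handler can be exercised without AWS.

```python
from urllib.parse import unquote_plus


def s3_objects(event):
    """Yield (bucket, key) pairs from an S3 event notification."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 notifications.
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def start_glue_job(glue_client, job_name, bucket, key):
    """Start a Glue job run for one uploaded object and return its run ID."""
    response = glue_client.start_job_run(
        JobName=job_name,
        Arguments={"--source_bucket": bucket, "--source_key": key},
    )
    return response["JobRunId"]


def lambda_handler(event, context, glue_client=None, job_name="my_glue_job"):
    if glue_client is None:
        import boto3  # imported lazily so the helpers can be tested without it

        glue_client = boto3.client("glue")
    return [
        start_glue_job(glue_client, job_name, bucket, key)
        for bucket, key in s3_objects(event)
    ]
```

The returned run IDs can later be passed to the Glue client's `get_job_run` call to check the status of each run.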

CodeHunter answered Nov 09 '22 19:11