Creating a Glue job with AWS CDK (python) fails

Tags: python, aws-cdk

I'm using the Python wrappers for CDK to create a Glue job. The command attribute requires an object of type IResolvable | JobCommandProperty. I tried passing a JobCommandProperty object, but I'm getting an exception.

I created a JobCommandProperty object directly. I was looking for a .builder() function somewhere (similar to the one in the Java API), but couldn't find one.

from aws_cdk import (
    aws_glue as glue,
    aws_iam as iam,
    core
)

class ScheduledGlueJob (core.Stack):

    def __init__(self, scope: core.Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        policy_statement = iam.PolicyStatement(
                actions=['logs:*','s3:*','ec2:*','iam:*','cloudwatch:*','dynamodb:*','glue:*']
            )

        policy_statement.add_all_resources()

        glue_job_role = iam.Role(
            self,
            'Glue-Job-Role',
            assumed_by=iam.ServicePrincipal('glue.amazonaws.com')
        ).add_to_policy(
            policy_statement
        )

        job = glue.CfnJob(
            self,
            'glue-test-job',
            role=glue_job_role,
            allocated_capacity=10,
            command=glue.CfnJob.JobCommandProperty(
                name='glueetl',
                script_location='s3://my-bucket/glue-scripts/job.scala'
            ))

The error message is this:

$ cdk synth
Traceback (most recent call last):
  File "app.py", line 30, in <module>
    glue_job = ScheduledGlueJob(app, 'Cronned-Glue-Job')
  File "/Users/d439087/IdeaProjects/ds/test_cdk/.env/lib/python3.7/site-packages/jsii/_runtime.py", line 66, in __call__
    inst = super().__call__(*args, **kwargs)
  File "/Users/d439087/IdeaProjects/ds/test_cdk/glue/scheduled_job.py", line 33, in __init__
    script_location='s3://my-bucket/glue-scripts/job.scala'
  File "/Users/d439087/IdeaProjects/ds/test_cdk/.env/lib/python3.7/site-packages/jsii/_runtime.py", line 66, in __call__
    inst = super().__call__(*args, **kwargs)
  File "/Users/d439087/IdeaProjects/ds/test_cdk/.env/lib/python3.7/site-packages/aws_cdk/aws_glue/__init__.py", line 2040, in __init__
    jsii.create(CfnJob, self, [scope, id, props])
  File "/Users/d439087/IdeaProjects/ds/test_cdk/.env/lib/python3.7/site-packages/jsii/_kernel/__init__.py", line 208, in create
    overrides=overrides,
  File "/Users/d439087/IdeaProjects/ds/test_cdk/.env/lib/python3.7/site-packages/jsii/_kernel/providers/process.py", line 331, in create
    return self._process.send(request, CreateResponse)
  File "/Users/d439087/IdeaProjects/ds/test_cdk/.env/lib/python3.7/site-packages/jsii/_kernel/providers/process.py", line 316, in send
    raise JSIIError(resp.error) from JavaScriptError(resp.stack)
jsii.errors.JSIIError: Expected 'string', got true (boolean)

Does anyone have a working CDK (Python) example that creates a CfnJob object?

David asked Jul 23 '19 13:07



2 Answers

Never mind: the role attribute has to be of type string. I got confused by the JSII error message.

David answered Oct 03 '22 09:10


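The glue_job_role variable is no longer a Role, because the chained .add_to_policy(...) call assigns that method's return value to it (a boolean, judging by the error message). Create the role first, call add_to_policy on it in a separate statement, and pass the role's ARN string to CfnJob. The code below should work.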

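# Create the role first and keep a reference to the Role object itself.
glue_job_role = iam.Role(
    self,
    'Glue-Job-Role',
    assumed_by=iam.ServicePrincipal('glue.amazonaws.com')
)
# Attach the policy in a separate statement so glue_job_role stays a Role.
glue_job_role.add_to_policy(policy_statement)

job = glue.CfnJob(
    self,
    'glue-test-job',
    # CfnJob expects the role as a string (name or ARN), not a Role object.
    role=glue_job_role.role_arn,
    allocated_capacity=10,
    command=glue.CfnJob.JobCommandProperty(
        name='glueetl',
        script_location='s3://my-bucket/glue-scripts/job.scala'
    )
)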
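The chaining pitfall itself can be reproduced without CDK. The sketch below is only an illustration: FakeRole is a hypothetical stand-in, not the real aws_cdk class, its add_to_policy is assumed to return a boolean (as the error message in the question suggests), and the account ID in the ARN is a placeholder.

```python
class FakeRole:
    """Hypothetical stand-in for a CDK Role; not the real aws_cdk class."""

    def __init__(self, name: str) -> None:
        # Placeholder ARN built from a dummy account ID.
        self.role_arn = f"arn:aws:iam::123456789012:role/{name}"
        self.statements = []

    def add_to_policy(self, statement: str) -> bool:
        # Assumed behaviour: records the statement and reports success,
        # rather than returning the role itself.
        self.statements.append(statement)
        return True


# Chained: the variable receives the method's return value, not the role.
chained = FakeRole("Glue-Job-Role").add_to_policy("glue:*")

# Separate statements: the variable keeps pointing at the role object.
role = FakeRole("Glue-Job-Role")
role.add_to_policy("glue:*")

print(type(chained).__name__)  # prints: bool
print(role.role_arn)           # prints: arn:aws:iam::123456789012:role/Glue-Job-Role
```

With the chained form, anything that later expects a role (or its ARN string) receives True instead, which matches the "Expected 'string', got true (boolean)" message.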
Ranvijay Singh answered Oct 03 '22 09:10
