I have just been playing around with Glue but have yet to get it to successfully create a new table in an existing S3 bucket. The job executes without error, but no output ever appears in S3.
Here's the auto-generated code:
glueContext.write_dynamic_frame.from_options(
    frame = applymapping1,
    connection_type = "s3",
    connection_options = {"path": "s3://glueoutput/output/"},
    format = "json",
    transformation_ctx = "datasink2")
I have tried all variations of this: with the name of a file (that doesn't exist yet), in the root folder of the bucket, with and without a trailing slash. The role being used has full access to S3, and I have tried creating buckets in different regions. No file is ever created, yet the console says the job succeeded.
The role can read from and write to the S3 bucket. Job type: Spark. Glue version: Spark 2.4, Python 3. "This job runs": a new script to be authored by you.
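For what it's worth, the path variations described above should all be equivalent to Glue as long as they resolve to the same prefix. A small sketch (the `normalize_s3_prefix` helper is hypothetical, not part of AWS Glue) showing how the tried variants collapse to one prefix:

```python
# Hypothetical helper: normalizes an S3 output path so that
# write_dynamic_frame.from_options always receives the same prefix shape,
# regardless of which variation was typed into connection_options.
def normalize_s3_prefix(path: str) -> str:
    """Ensure the path has an s3:// scheme and a trailing slash."""
    if not path.startswith("s3://"):
        raise ValueError(f"expected an s3:// URL, got {path!r}")
    return path if path.endswith("/") else path + "/"

# The variations tried above all normalize to the same prefix:
print(normalize_s3_prefix("s3://glueoutput/output"))   # s3://glueoutput/output/
print(normalize_s3_prefix("s3://glueoutput/output/"))  # s3://glueoutput/output/
```

So if the job reports success and the prefix is well-formed, the problem is more likely in what is being written (an empty frame, or a bookmark skipping the input) than in how the path is spelled.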
As @Drellgor suggests in his comment on the previous answer, make sure you have disabled "Job Bookmarks" unless you definitely don't want to process old files.
From the documentation:
"AWS Glue tracks data that has already been processed during a previous run of an ETL job by persisting state information from the job run. This persisted state information is called a job bookmark. Job bookmarks help AWS Glue maintain state information and prevent the reprocessing of old data."
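If bookmarks turn out to be the culprit, they can be turned off with the job parameter below (a sketch of the job-argument form; set it in the job's configuration or in the `--default-arguments` map when creating the job):

```json
"--job-bookmark-option": "job-bookmark-disable"
```

Alternatively, the existing bookmark state can be cleared with the AWS CLI command `aws glue reset-job-bookmark --job-name <your-job-name>`, which makes the next run reprocess all input files.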
Your code is correct; just verify whether there is any data at all in the applymapping1 DynamicFrame. You can check with this command: applymapping1.toDF().show(), or get the record count with applymapping1.count().