
Is there a temporary folder that I can access while using AWS Glue?

Is there a temporary folder that I can use to hold files while a job is running in AWS Glue? For example, in Lambda we have access to a /tmp directory for as long as the process is executing. Is there something similar in AWS Glue where we can store files while the job is executing?

Leyth G asked Jan 12 '18 18:01


2 Answers

Is this what you are looking for? AWS Glue recognizes a number of argument names that you can use to set up the script environment for your Jobs and JobRuns:

  • --TempDir — Specifies an S3 path to a bucket that can be used as a temporary directory for the Job.

Here is a link that you can refer to.

Hope this helps.
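To make the --TempDir idea concrete, here is a minimal sketch of reading that argument and turning it into a bucket and key for a scratch object. Inside a real Glue script the value is normally read with getResolvedOptions from awsglue.utils; the sketch uses argparse from the standard library instead so that it runs anywhere. The function names (read_temp_dir, split_s3_path, temp_key) and the example bucket are hypothetical, not part of any AWS API.

```python
import argparse
from urllib.parse import urlparse


def read_temp_dir(argv):
    """Return the value passed as --TempDir, ignoring other job arguments."""
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument("--TempDir", required=True)
    # Glue passes other arguments too (e.g. --JOB_NAME); skip the ones we do not declare.
    args, _unknown = parser.parse_known_args(argv)
    return args.TempDir


def split_s3_path(s3_path):
    """Split 's3://bucket/some/prefix' into (bucket, prefix)."""
    parsed = urlparse(s3_path)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 path: {s3_path!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def temp_key(s3_path, file_name):
    """Build (bucket, key) for a scratch object stored under the temp path."""
    bucket, prefix = split_s3_path(s3_path)
    key = "/".join(part for part in (prefix.rstrip("/"), file_name) if part)
    return bucket, key
```

For example, temp_key("s3://my-bucket/glue/tmp/", "part.csv") yields the bucket "my-bucket" and the key "glue/tmp/part.csv", which can then be passed to an S3 client for upload or download.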

Gourav Dutta answered Nov 15 '22 03:11


Yes, there is a tmp directory that you can use to move files to and from S3.

import boto3

s3 = boto3.resource('s3')

# Download the file from S3 into the job's local tmp directory
s3.Bucket(bucket_name).download_file(DATA_DIR + file, 'tmp/' + file)

You can also upload files from 'tmp/' to S3.
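The download, process, upload round trip described here could be sketched as follows. This is only an illustration under some assumptions: it uses tempfile.TemporaryDirectory for the scratch location rather than a hard-coded 'tmp/' path, it assumes the Glue worker has a writable local temp location (as this answer states), and the helper name process_via_scratch and its transform callback are hypothetical. The bucket argument is expected to behave like a boto3 Bucket resource, of which only download_file(Key, Filename) and upload_file(Filename, Key) are used.

```python
import os
import tempfile


def process_via_scratch(bucket, in_key, out_key, transform):
    """Download in_key, run transform on a local copy, upload the result.

    bucket    -- object with boto3-style download_file / upload_file methods
    transform -- callable(src_path, dst_path) that writes dst_path
    """
    # The scratch directory and everything in it is removed on exit.
    with tempfile.TemporaryDirectory() as scratch:
        local_in = os.path.join(scratch, os.path.basename(in_key))
        local_out = local_in + ".out"

        bucket.download_file(in_key, local_in)   # S3 -> local scratch
        transform(local_in, local_out)           # work on the local copy
        bucket.upload_file(local_out, out_key)   # local scratch -> S3
    return out_key


# In a job, the bucket would typically come from boto3, for example:
#   import boto3
#   bucket = boto3.resource("s3").Bucket(bucket_name)
#   process_via_scratch(bucket, "in/data.csv", "out/data.csv", my_transform)
```

Letting tempfile choose and clean up the directory avoids leaving files behind between runs, at the cost of not being able to inspect them after the job finishes.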

Kishore answered Nov 15 '22 03:11