Get an entire bucket or more than one object from an AWS S3 bucket through Ansible

As far as I know, the Ansible S3 module can only get one object at a time.

What if I want to download an entire bucket, or more than one object, from an S3 bucket at once? Is there any hack?

Asked by Prem Sompura, Sep 17 '15


3 Answers

I was able to achieve it like so:

  - name: get s3_bucket_items
    s3:
      mode: list
      bucket: MY_BUCKET
      prefix: MY_PREFIX/
    register: s3_bucket_items

  - name: download s3_bucket_items
    s3:
      mode: get
      bucket: MY_BUCKET
      object: "{{ item }}"
      dest: /tmp/
    with_items: "{{ s3_bucket_items.s3_keys }}"

Notes:

  • Your prefix should not have a leading slash.
  • The {{ item }} value will already include the prefix, as in the sketch below.
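
For reference, the variable registered by the list task holds a flat list of keys. With a hypothetical bucket MY_BUCKET containing two objects under MY_PREFIX/, it would look roughly like this (key names are made up for illustration):

  # Hypothetical shape of the registered variable (relevant part only)
  s3_bucket_items:
    s3_keys:
      - MY_PREFIX/file1.txt
      - MY_PREFIX/subdir/file2.txt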

Answered by ThorSummoner


You first have to list the files into a variable, then copy the files using that variable.

- name: List files
  aws_s3:
    aws_access_key: 'YOUR_KEY'
    aws_secret_key: 'YOUR_SECRET'
    mode: list
    bucket: 'YOUR_BUCKET'
    prefix: 'YOUR_BUCKET_FOLDER' # remember to add a trailing slash
    marker: 'YOUR_BUCKET_FOLDER' # remember to add a trailing slash
  register: 's3BucketItems'

- name: Copy files
  aws_s3:
    aws_access_key: 'YOUR_KEY'
    aws_secret_key: 'YOUR_SECRET'
    bucket: 'YOUR_BUCKET'
    object: '{{ item }}'
    dest: 'YOUR_DESTINATION_FOLDER/{{ item | basename }}'
    mode: get
  with_items: '{{ s3BucketItems.s3_keys }}'
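
On newer Ansible releases the same download step should also work with the collection-qualified module name and loop in place of with_items. A rough sketch, assuming the amazon.aws collection is installed (credentials omitted here; they can come from the environment or an instance profile):

- name: Copy files
  amazon.aws.s3_object:
    bucket: 'YOUR_BUCKET'
    object: '{{ item }}'
    dest: 'YOUR_DESTINATION_FOLDER/{{ item | basename }}'
    mode: get
  loop: '{{ s3BucketItems.s3_keys }}'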

Answered by Nissanka


The Ansible S3 module currently has no built-in way to synchronize a bucket to disk recursively.

In theory, you could try to collect the keys to download with a list task and then fetch each of them:

- name: register keys for synchronization
  s3:
    mode: list
    bucket: hosts
    prefix: data/
  register: s3_bucket_items

- name: sync s3 bucket to disk
  s3:
    mode: get
    bucket: hosts
    object: "{{ item }}"
    dest: /etc/data/conf/
  with_items: "{{ s3_bucket_items.s3_keys }}"

While I often see this solution suggested, it does not seem to work with current Ansible/boto versions, due to a bug with nested S3 'directories' (see this bug report for more information) and the Ansible S3 module not creating subdirectories for keys. I believe you could also run into memory issues with this method when syncing very large buckets.

I would also like to add that you most likely do not want to hard-code credentials in your playbooks; I suggest you use IAM EC2 instance profiles instead, which are much more secure and more convenient.
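
For example, the listing task from the answer above works without the aws_access_key / aws_secret_key parameters when the play runs on an EC2 instance that has a suitable role attached; a minimal sketch with placeholder names:

- name: List files using the instance profile credentials
  aws_s3:
    mode: list
    bucket: 'YOUR_BUCKET'
    prefix: 'YOUR_BUCKET_FOLDER/'
  register: s3BucketItems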

A solution that works for me is this:

- name: Sync directory from S3 to disk
  command: "s3cmd sync -q --no-preserve s3://hosts/{{ item }}/ /etc/data/conf/"
  with_items:
    - data
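
Note that this assumes s3cmd is already installed and configured (for example via ~/.s3cfg or the instance profile) on the host running the task. If it is not, a task along these lines could install it first (package name assumed for a distribution that ships s3cmd):

- name: Ensure s3cmd is installed
  package:
    name: s3cmd
    state: present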

Answered by M. Glatki