
How would you read S3 as a hierarchical directory structure in Ruby?

Tags: ruby, amazon-s3

Has anyone had any success reading S3 buckets as subfolders?

folder1
-- subfolder2
---- file3
---- file4
-- file1
-- file2
folder2
-- subfolder3
-- file5
-- file6

My task is to read folder1. I expect to see subfolder2, file1 and file2, but NOT file3 or file4. Right now, because I restrict the bucket keys with prefix => 'folder1/', I still get file3 and file4, since they technically have the folder1/ prefix.

It seems the only way to really do this is to pull in all the keys under folder1 and then use string searching to exclude file3 and file4 from the results array.

Has anyone had experience doing this? I know FTP-style S3 clients like Transmit and Cyberduck must be doing this but it's not apparent from the S3 API itself.

Thanks ahead, Conrad

I've looked into both AWS::S3 and right_aws.

Asked by chuboy, Jan 24 '11 23:01



1 Answer

The S3 API has no notion of a folder. It does, however, allow filenames with "/" in them, and it allows you to query with a prefix. You seem to be familiar with that already, but I just wanted to be clear.

When you query with a prefix of folder1/, S3 is going to return everything under that "folder". In order to manipulate only direct descendants, you are going to have to filter the results yourself in Ruby (pick your poison: reject or select). This isn't going to help performance (a common reason to use "folders" in S3), but it gets the job done.
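
A minimal sketch of that filtering step, assuming you already have the full list of keys returned for the prefix as plain strings. The helper name direct_children and the sample keys are made up for illustration; no S3 client library is involved, so how you fetch the keys is left to whichever gem you use.

```ruby
# Hypothetical helper (not part of any S3 gem): given every key returned for
# a prefix, keep only the direct children of that "folder".
def direct_children(keys, prefix)
  keys
    .select { |key| key.start_with?(prefix) && key != prefix }
    .map do |key|
      rest = key[prefix.length..-1]                 # path relative to the prefix
      name, separator, _deeper = rest.partition("/")
      name + separator                              # "subfolder2/" or "file1"
    end
    .uniq                                           # collapse file3/file4 into one subfolder entry
end

# Sample keys mirroring the layout in the question.
keys = [
  "folder1/subfolder2/file3",
  "folder1/subfolder2/file4",
  "folder1/file1",
  "folder1/file2"
]

puts direct_children(keys, "folder1/")
```

Entries ending in "/" stand for subfolders; everything else is a file directly under the prefix. Note that the S3 list operation also accepts a delimiter parameter (for example "/") that groups deeper keys into common prefixes on the server side, which may avoid fetching the nested keys at all; whether and how your client gem exposes it is something to check in its documentation.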

Answered by coreyward, Nov 15 '22 20:11