
How to find a file in Amazon S3 bucket without knowing the containing folder

My Amazon S3 bucket has a folder structure that looks like this:

  • bucket-name\00001\file1.txt
  • bucket-name\00001\file2.jpg
  • bucket-name\00002\file3.doc
  • bucket-name\00001\file4.ppt

If I only know the file name file3.doc and the bucket name bucket-name, how can I search for file3.doc in bucket-name? If I knew it was in folder 00002, I could simply go to that folder and start typing the file name, but I have no way of knowing which folder the file is in.

Fred Rogers asked Oct 01 '15 05:10



3 Answers

You can easily do this with the AWS CLI.

aws s3 ls s3://BUCKET-NAME/ --recursive | grep FILE-NAME.TXT
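Note that grep treats the name as a regular expression and matches substrings, so file3.doc would also match file3.doc.bak or file3xdoc. A minimal sketch of an exact-name filter follows. It assumes the usual `aws s3 ls --recursive` line layout of date, time, size, then key; the function name `find_key_by_name` is hypothetical and not part of the AWS CLI.

```shell
# Hypothetical helper (not part of the AWS CLI): reads `aws s3 ls --recursive`
# output on stdin and prints only the keys whose last path component is
# exactly the given file name.
find_key_by_name() {
  awk -v name="$1" '{
    key = $0
    # Assumed line layout: DATE TIME SIZE KEY. Strip the first three
    # columns and keep the rest, so spaces inside the key survive.
    sub(/^[^ ]+ +[^ ]+ +[0-9]+ /, "", key)
    n = split(key, parts, "/")
    # Concatenating "" forces a string comparison even for numeric-looking names.
    if ((parts[n] "") == name) print key
  }'
}

# Usage (needs AWS credentials and network access):
#   aws s3 ls s3://BUCKET-NAME/ --recursive | find_key_by_name file3.doc
```

Because the comparison is on the last path component only, the same name is found in any folder, and every matching key is printed if it occurs more than once.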
hfranco answered Nov 10 '22 14:11


Using only the AWS CLI, you can run list-objects against the bucket with the --query parameter. This is not a fast operation: the CLI first fetches the full object list and then applies the --query filter locally, rather than having S3's API do the filtering.

$ aws s3api list-objects --bucket bucket-name --query "Contents[?contains(Key, 'file3')]"

[
    {
        "LastModified": "2017-05-31T20:36:28.000Z",
        "ETag": "\"b861daa5cc3775f38519f5de6566cbe7\"",
        "StorageClass": "STANDARD",
        "Key": "00002/file3.doc",
        "Owner": {
            "DisplayName": "owner",
            "ID": "123"
        },
        "Size": 27032
    }
]

The benefit of using --query over piping to grep is that you get the full response, including all the metadata that list-objects normally returns, without having to fiddle with grep's before- and after-context arguments (-B and -A).
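If you want an exact file-name match rather than a substring match, the same idea can be sketched with the JMESPath `ends_with` function and a trimmed-down set of fields. The wrapper name `find_in_bucket` is hypothetical, and using `list-objects-v2` instead of `list-objects` is a substitution on my part. The filtering still happens client-side after the listing is fetched.

```shell
# Hypothetical wrapper around the AWS CLI: lists every object in the bucket
# and keeps only keys named exactly NAME, either at the bucket root or at
# the end of any "folder" prefix.
# Assumption: NAME contains no single quote, which would break the query.
find_in_bucket() {
  bucket=$1
  name=$2
  query="Contents[?ends_with(Key, '/${name}') || Key == '${name}'].{Key: Key, Size: Size, LastModified: LastModified}"
  aws s3api list-objects-v2 --bucket "$bucket" --query "$query" --output json
}

# Usage (needs AWS credentials and network access):
#   find_in_bucket bucket-name file3.doc
```

Matching on '/file3.doc' rather than 'file3.doc' keeps keys such as 00002/myfile3.doc out of the result, and the extra `Key == ...` clause covers an object stored at the bucket root.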

See this post on Finding Files in S3 for further information including a similar example which shows the benefit of having metadata, when files of the same name end up in different directories.

Luke Waite answered Nov 10 '22 15:11


You'll probably need to use a command line tool like s3cmd if you don't know where it is at all:

s3cmd --recursive ls s3://mybucket | grep "file3"

but some limited search is possible:

https://stackoverflow.com/a/21836343/562557

thebenedict answered Nov 10 '22 15:11