
How to check if file exists in Google Cloud Storage?

I have a script that needs to check whether a file exists in a bucket and, if it doesn't, create it.

I tried using os.path.exists(file_path) where file_path = "/gs/testbucket", but I got a file not found error.

I know that I can use the files.listdir() API function to list all the files located at a path and then check if the file I want is one of them. But I was wondering whether there is another way to check whether the file exists.

Tanvir Shaikh asked Nov 23 '12 08:11




3 Answers

This post is old, but you can now check whether a file exists on GCP using the Blob class. It took me a while to find an answer, so I'm adding it here for others who are looking for a solution.

from google.cloud import storage

name = 'file_i_want_to_check.txt'  # object name inside the bucket
storage_client = storage.Client()
bucket_name = 'my_bucket_name'
bucket = storage_client.bucket(bucket_name)
# exists() returns True if the object is in the bucket, False otherwise
stats = storage.Blob(bucket=bucket, name=name).exists(storage_client)

Documentation is here

Hope this helps!

Edit

As per the comment by @om-prakash, if the file is in a folder, then the name should include the path to the file:

name = "folder/path_to/file_i_want_to_check.txt"
nickthefreak answered Sep 24 '22 01:09


It's as easy as using the exists method of a blob object:

from google.cloud import storage

def blob_exists(projectname, credentials, bucket_name, filename):
    client = storage.Client(projectname, credentials=credentials)
    bucket = client.get_bucket(bucket_name)
    blob = bucket.blob(filename)
    return blob.exists()
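
The original question also asks to create the file when it is missing. A minimal sketch of that step follows, assuming a bucket object like the one returned by client.bucket(bucket_name) or client.get_bucket(bucket_name). The helper name ensure_blob and its data parameter are made up for illustration; exists() and upload_from_string() are the Blob methods it relies on.

```python
def ensure_blob(bucket, name, data=b""):
    """Upload `data` under `name` only if no object with that name exists.

    `bucket` is assumed to behave like google.cloud.storage.Bucket,
    e.g. storage.Client().bucket("my_bucket_name").
    Returns True if the object was created, False if it was already there.
    """
    blob = bucket.blob(name)
    if blob.exists():
        return False
    # The check and the upload are two separate requests, so another
    # writer could still create the object in between them.
    blob.upload_from_string(data)
    return True
```

Calling ensure_blob(bucket, "folder/path_to/file_i_want_to_check.txt", b"") would then create an empty object only when the name is not already taken.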
javinievas answered Sep 27 '22 01:09


The answer provided by @nickthefreak is correct, and so is the comment by Om Prakash. One other note is that the bucket_name should not include gs:// in front or a / at the end.

Piggybacking off @nickthefreak's example and Om Prakash's comment:

from google.cloud import storage

name = 'folder1/another_folder/file_i_want_to_check.txt'   

storage_client = storage.Client()
bucket_name = 'my_bucket_name'  # Do not put 'gs://my_bucket_name'
bucket = storage_client.bucket(bucket_name)
stats = storage.Blob(bucket=bucket, name=name).exists(storage_client)

stats will be a Boolean (True or False) depending on whether the file exists in the Storage Bucket.
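
If what you have is a full gs:// path, one way to avoid the prefix mistake is to split it before calling the client. This is only a sketch; split_gcs_uri is a hypothetical helper written for this answer, not part of google-cloud-storage.

```python
def split_gcs_uri(uri):
    """Split 'gs://bucket/path/to/file' into (bucket_name, blob_name)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a gs:// URI: {uri!r}")
    # Everything up to the first '/' is the bucket; the rest is the object name.
    bucket_name, _, blob_name = uri[len(prefix):].partition("/")
    if not bucket_name:
        raise ValueError(f"missing bucket name in {uri!r}")
    return bucket_name, blob_name
```

The two returned values can be passed as bucket_name and name in the snippet above.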

(I don't have enough reputation points to comment, but I wanted to save other people some time because I wasted way too much time with this).

TalkDataToMe answered Sep 23 '22 01:09