
PermissionError: Forbidden when reading files from aws s3

I am working in Python in a Jupyter notebook, and I am trying to read Parquet files from an AWS S3 bucket and combine them into a single pandas DataFrame.

The bucket and folders are arranged like:

The bucket name: mybucket
   First Folder: 123
      Second Folder: Parquets.parquet
        file1.snappy.parquet
        file2.snappy.parquet
        ....

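I am building the full paths with:

import boto3

s3 = boto3.resource('s3')  # assumed; the original snippet did not show how s3 was created
bucket = s3.Bucket(name='mybucket')
keys = []
for key in bucket.objects.all():
    keys.append("s3://mybucket/" + key.key)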

And then reading them with:

import pandas as pd

df_list = []
count = 0
keys = keys[2:]  # drop the first two keys
for obj in bucket.objects.all():
    subsrc = obj.Object()
    key = obj.key
    path = keys[count]
    obj_df = pd.read_parquet(path)
    df_list.append(obj_df)
    count += 1

df = pd.concat(df_list)

But that is giving me:

PermissionError: Forbidden 

pointing to the line 'obj_df = pd.read_parquet(path)'. I know I have full S3 access, so that should not be the issue. Thank you so much!


1 Answer

This is probably because the path to the data is incorrect.

(In the code above, you call pd.read_parquet(path) with path = keys[count], but I'm pretty sure those are only the keys, which do not include the bucket name.)
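
One way to rule out a malformed path is to build each URI explicitly from the bucket name and the object key, and to keep only the keys that look like Parquet files. The following is a minimal sketch, not a confirmed fix. It assumes boto3, pandas and s3fs are installed and that AWS credentials are already configured. The helper names parquet_uris and read_bucket_parquets are hypothetical and do not appear in the question.

```python
def parquet_uris(bucket_name, keys):
    """Return full s3:// URIs for the keys that end in '.parquet'.

    "Folder" placeholder keys such as '123/' end with a slash, so they
    are filtered out along with any other non-Parquet objects.
    """
    return [
        f"s3://{bucket_name}/{key}"
        for key in keys
        if key.endswith(".parquet")
    ]


def read_bucket_parquets(bucket_name, prefix=""):
    """Read every Parquet object under `prefix` into one DataFrame.

    Hypothetical helper: assumes boto3, pandas and s3fs are installed
    and that credentials are available in the environment.
    """
    # Imported lazily so parquet_uris can be used without these packages.
    import boto3
    import pandas as pd

    bucket = boto3.resource("s3").Bucket(bucket_name)
    keys = [obj.key for obj in bucket.objects.filter(Prefix=prefix)]
    uris = parquet_uris(bucket_name, keys)

    # Print the URIs so a wrong path is easy to spot before reading.
    for uri in uris:
        print(uri)

    frames = [pd.read_parquet(uri) for uri in uris]
    # pd.concat raises ValueError if no Parquet objects were found.
    return pd.concat(frames, ignore_index=True)
```

With the layout from the question, a call such as read_bucket_parquets("mybucket", prefix="123/") would list and read only the file*.snappy.parquet objects. If the printed URIs look right and the error persists, the cause is more likely credentials or bucket policy than the path.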

Marco answered Sep 19 '25 05:09
