How to read image file from S3 bucket directly into memory?

I have the following code:

    import matplotlib.pyplot as plt
    import matplotlib.image as mpimg
    import numpy as np
    import boto3
    s3 = boto3.resource('s3', region_name='us-east-2')
    bucket = s3.Bucket('sentinel-s2-l1c')
    object = bucket.Object('tiles/10/S/DG/2015/12/7/0/B01.jp2')
    object.download_file('B01.jp2')
    img=mpimg.imread('B01.jp2')
    imgplot = plt.imshow(img)
    plt.show(imgplot)

and it works. The problem is that it downloads the file into the current directory first. Is it possible to read the file and decode it as an image directly in RAM?

Dims asked May 18 '17 08:05


2 Answers

I would suggest using the io module to read the file directly into memory, without having to use a temporary file at all.

For example:

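    import io

    import boto3
    import matplotlib.image as mpimg
    import matplotlib.pyplot as plt
    import numpy as np

    s3 = boto3.resource('s3', region_name='us-east-2')
    bucket = s3.Bucket('sentinel-s2-l1c')
    obj = bucket.Object('tiles/10/S/DG/2015/12/7/0/B01.jp2')

    # download_fileobj writes bytes, so the buffer must be BytesIO, not StringIO
    file_stream = io.BytesIO()
    obj.download_fileobj(file_stream)
    file_stream.seek(0)  # rewind to the start before decoding

    # a file-like object has no extension to infer the format from, so name it
    # (decoding JPEG 2000 relies on Pillow being built with OpenJPEG support)
    img = mpimg.imread(file_stream, format='jp2')
    # whatever you need to do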

Image data is binary, so the buffer has to be io.BytesIO; io.StringIO only accepts text, and download_fileobj writes bytes.
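
To see why the buffer type matters, here is a minimal sketch that needs no AWS access. `fake_download_fileobj` is a hypothetical stand-in for `download_fileobj`, assumed only to write raw bytes into whatever file-like object it receives, and the payload bytes are made up for illustration.

```python
import io


def fake_download_fileobj(fileobj):
    """Hypothetical stand-in for Object.download_fileobj: writes raw bytes."""
    fileobj.write(b"\x00\x00\x00\x0cjP  ")  # made-up binary payload


# A text buffer rejects bytes with a TypeError.
try:
    fake_download_fileobj(io.StringIO())
    text_buffer_ok = True
except TypeError:
    text_buffer_ok = False

# A bytes buffer accepts them, but the cursor is left after the written data.
stream = io.BytesIO()
fake_download_fileobj(stream)
at_end = stream.read()  # nothing left to read from the current position
stream.seek(0)          # rewind before handing the buffer to a decoder
payload = stream.read()

print(text_buffer_ok, at_end, payload)
```

The rewind step is easy to forget: without `seek(0)`, a decoder that reads from the current position sees an empty stream.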

Greg Merritt answered Sep 28 '22 01:09


This builds on Greg Merritt's answer and addresses the errors reported in its comment section: it uses BytesIO for the binary data and PIL's Image instead of matplotlib.image.

The following function works with Python 3 and boto3. The write_image_to_s3 function is included as a bonus.

    from io import BytesIO

    import boto3
    import numpy as np
    from PIL import Image


    def read_image_from_s3(bucket, key, region_name='ap-southeast-1'):
        """Load image file from s3.

        Parameters
        ----------
        bucket: string
            Bucket name
        key : string
            Path in s3
        region_name : string
            AWS region of the bucket

        Returns
        -------
        np array
            Image array
        """
        s3 = boto3.resource('s3', region_name=region_name)
        obj = s3.Bucket(bucket).Object(key)
        response = obj.get()
        # read the whole body into a seekable in-memory buffer
        file_stream = BytesIO(response['Body'].read())
        im = Image.open(file_stream)
        return np.array(im)


    def write_image_to_s3(img_array, bucket, key, region_name='ap-southeast-1'):
        """Write an image array into S3 bucket

        Parameters
        ----------
        img_array : np array
            Image array
        bucket: string
            Bucket name
        key : string
            Path in s3
        region_name : string
            AWS region of the bucket

        Returns
        -------
        None
        """
        s3 = boto3.resource('s3', region_name=region_name)
        obj = s3.Bucket(bucket).Object(key)
        # encode the array as JPEG into memory, then upload the bytes
        file_stream = BytesIO()
        im = Image.fromarray(img_array)
        im.save(file_stream, format='jpeg')
        obj.put(Body=file_stream.getvalue())
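
The encode and decode halves can be checked without touching S3. The sketch below is only an illustration under a few assumptions: the test array is made up, and it uses PNG rather than the JPEG format above, because PNG is lossless and the round trip can then be compared exactly.

```python
from io import BytesIO

import numpy as np
from PIL import Image

# Made-up 4x6 RGB test image.
original = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)

# Encode into an in-memory buffer, the step write_image_to_s3 performs before put().
buffer = BytesIO()
Image.fromarray(original).save(buffer, format="png")
encoded = buffer.getvalue()  # the bytes that would be sent as Body

# Decode from raw bytes, the step read_image_from_s3 performs on the response body.
decoded = np.array(Image.open(BytesIO(encoded)))

print(decoded.shape, np.array_equal(original, decoded))
```

With JPEG the decoded array would generally differ slightly from the input, since that format is lossy.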

beahacker answered Sep 28 '22 00:09