What's the fastest way to write an S3 object (of which I have the key) to a file? I'm using Java.
Since Java 7 (released back in July 2011), there's a better way: the Files.copy() utility from java.nio.file:
"Copies all bytes from an input stream to a file."
So you need neither an external library nor your own byte-array loop. Two examples below, both of which use the input stream from S3Object.getObjectContent():
InputStream in = s3Client.getObject("bucketName", "key").getObjectContent();
1) Write to a new file at a specified path:
Files.copy(in, Paths.get("/my/path/file.jpg"));
2) Write to a temp file in system's default tmp location:
File tmp = File.createTempFile("s3test", "");
Files.copy(in, tmp.toPath(), StandardCopyOption.REPLACE_EXISTING);
(Without specifying the option to replace an existing file, you'll get a FileAlreadyExistsException.)
Also note that the getObjectContent() Javadocs urge you to close the input stream:
If you retrieve an S3Object, you should close this input stream as soon as possible, because the object contents aren't buffered in memory and stream directly from Amazon S3. Further, failure to close this stream can cause the request pool to become blocked.
So it's safest to wrap everything in try-catch-finally and call in.close() in the finally block.
The above assumes that you use the official SDK from Amazon (aws-java-sdk-s3).
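Putting those pieces together, here's a minimal sketch of the whole flow. The bucket, key, and target path are placeholders, and the ByteArrayInputStream in main stands in for the stream returned by getObjectContent(), so the copy-and-close logic itself runs without the SDK:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class S3CopyDemo {
    // With the real SDK the stream would come from:
    //   InputStream in = s3Client.getObject("bucketName", "key").getObjectContent();
    static Path copyToFile(InputStream in, Path target) throws Exception {
        try (in) { // try-with-resources closes the stream, as the Javadocs ask
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("s3test", "");
        InputStream fake = new ByteArrayInputStream("hello".getBytes());
        copyToFile(fake, tmp);
        System.out.println(Files.readString(tmp)); // prints "hello"
        Files.delete(tmp);
    }
}
```

Using try-with-resources instead of an explicit finally block gives the same guarantee with less ceremony (Java 9+ allows `try (in)` on an effectively final variable).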
While IOUtils.copy() and IOUtils.copyLarge() are great, I'd prefer the old-school way of looping through the input stream until it returns -1. Why? I used IOUtils.copy() before, but I hit a specific case where, after starting the download of a large file from S3, interrupting the thread did not stop the download; it went on until the whole file was transferred.
Of course, this has nothing to do with S3, just the IOUtils library.
So, I prefer this:
InputStream in = s3Object.getObjectContent();
OutputStream out = new FileOutputStream(file);
byte[] buf = new byte[1024];
int count;
try {
    while ((count = in.read(buf)) != -1) {
        // Bail out promptly if this thread has been interrupted
        if (Thread.interrupted()) {
            throw new InterruptedException();
        }
        out.write(buf, 0, count);
    }
} finally {
    out.close();
    in.close();
}
Note: this also means you don't need any additional libraries.
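The interrupt-aware loop described above can be packaged into a small helper. The in-memory streams below just stand in for the S3 stream and destination file so the logic is runnable on its own:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class InterruptibleCopy {
    // Copies in -> out in 1 KB chunks, checking the thread's
    // interrupt flag between chunks so a cancelled download stops early.
    static long copy(InputStream in, OutputStream out)
            throws IOException, InterruptedException {
        byte[] buf = new byte[1024];
        long total = 0;
        int count;
        while ((count = in.read(buf)) != -1) {
            if (Thread.interrupted()) {
                throw new InterruptedException("download cancelled");
            }
            out.write(buf, 0, count);
            total += count;
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long n = copy(new ByteArrayInputStream("hello world".getBytes()), out);
        System.out.println(n); // prints 11
    }
}
```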
The AmazonS3Client class has the following method:
S3Object getObject(String bucketName, String key)
The returned S3Object has the method
java.io.InputStream getObjectContent()
which gets the object content as a stream. I'd use IOUtils from Apache Commons like this:
IOUtils.copy(s3Object.getObjectContent(), new FileOutputStream(new File(filepath)));
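If you'd rather not pull in Commons IO, Java 9's InputStream.transferTo() covers the same ground with only the standard library. A sketch, where the ByteArrayInputStream stands in for the stream from getObjectContent():

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class TransferToDemo {
    // Streams everything from in into the target file, closing both ends.
    static void saveTo(InputStream in, Path target) throws Exception {
        try (in; OutputStream out = Files.newOutputStream(target)) {
            in.transferTo(out); // Java 9+: copies all remaining bytes
        }
    }

    public static void main(String[] args) throws Exception {
        Path target = Files.createTempFile("s3obj", "");
        // In real code the stream would come from s3Object.getObjectContent()
        saveTo(new ByteArrayInputStream("payload".getBytes()), target);
        System.out.println(Files.readString(target)); // prints "payload"
        Files.delete(target);
    }
}
```

Note that, unlike the IOUtils one-liner, this version also closes both streams via try-with-resources.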
What about this one-liner using a TransferManager:
TransferManagerBuilder.defaultTransferManager()
    .download("bucket-name", "key", new File("localFile"))
    .waitForCompletion();
(download() returns immediately with a Download handle, so call waitForCompletion() if you need to block until the file is on disk. The destination must be a file path, not a directory.)
From the AWS SDK for Java v2, released in 2017, you can just specify a Path to write the file to:
s3.getObject(GetObjectRequest.builder().bucket(bucket).key(key).build(),
ResponseTransformer.toFile(Paths.get("multiPartKey")));
https://docs.aws.amazon.com/sdk-for-java/v2/developer-guide/examples-s3-objects.html#download-object
If you need a File, you can use the toFile() method:
Path path = Paths.get("file.txt");
s3.getObject(GetObjectRequest.builder().bucket(bucket).key(key).build(),
path);
File file = path.toFile();