I'm uploading multiple files to Amazon S3 using Java.
The code I'm using is as follows:
MultipartHttpServletRequest multipartRequest = (MultipartHttpServletRequest) request;
MultiValueMap<String, MultipartFile> map = multipartRequest.getMultiFileMap();
try {
    if (map != null) {
        for (String filename : map.keySet()) {
            List<MultipartFile> fileList = map.get(filename);
            incrPercentge = 100 / fileList.size();
            request.getSession().setAttribute("incrPercentge", incrPercentge);
            for (MultipartFile mpf : fileList) {
                /*
                 * Custom input stream wrapping the original input stream
                 * to get the progress.
                 */
                ProgressInputStream inputStream = new ProgressInputStream("test", mpf.getInputStream(), mpf.getBytes().length);
                ObjectMetadata metadata = new ObjectMetadata();
                metadata.setContentType(mpf.getContentType());
                String key = Util.getLoginUserName() + "/" + mpf.getOriginalFilename();
                PutObjectRequest putObjectRequest = new PutObjectRequest(
                        Constants.S3_BUCKET_NAME, key, inputStream, metadata)
                        .withStorageClass(StorageClass.ReducedRedundancy);
                PutObjectResult response = s3Client.putObject(putObjectRequest);
            }
        }
    }
} catch (Exception e) {
    e.printStackTrace();
}
I had to create the custom input stream to get the number of bytes consumed by Amazon S3. I got the idea from the question "Upload file or InputStream to S3 with a progress callback".
My ProgressInputStream class is as follows:
package com.spectralnetworks.net.util;

import java.io.IOException;
import java.io.InputStream;
import org.apache.commons.vfs.FileContent;
import org.apache.commons.vfs.FileSystemException;

public class ProgressInputStream extends InputStream {
    private final long size;
    private long progress = 0;
    private long lastUpdate = 0;
    private final InputStream inputStream;
    private final String name;
    private boolean closed = false;

    public ProgressInputStream(String name, InputStream inputStream, long size) {
        this.size = size;
        this.inputStream = inputStream;
        this.name = name;
    }

    public ProgressInputStream(String name, FileContent content) throws FileSystemException {
        this.size = content.getSize();
        this.name = name;
        this.inputStream = content.getInputStream();
    }

    @Override
    public void close() throws IOException {
        if (closed) throw new IOException("already closed");
        closed = true;
        // Close the wrapped stream; InputStream.close() itself does nothing.
        inputStream.close();
    }

    @Override
    public int read() throws IOException {
        // The single-byte read returns the byte value (0-255) or -1, not a count,
        // so progress advances by exactly one byte.
        int b = inputStream.read();
        if (b != -1) progress++;
        lastUpdate = maybeUpdateDisplay(name, progress, lastUpdate, size);
        return b;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int count = inputStream.read(b, off, len);
        if (count > 0) progress += count;
        lastUpdate = maybeUpdateDisplay(name, progress, lastUpdate, size);
        return count;
    }

    /**
     * Experimental: intended to show a progress bar.
     * @param name
     * @param progress
     * @param lastUpdate
     * @param size
     * @return
     */
    static long maybeUpdateDisplay(String name, long progress, long lastUpdate, long size) {
        /* if (Config.isInUnitTests()) return lastUpdate;
        if (size < B_IN_MB/10) return lastUpdate;
        if (progress - lastUpdate > 1024 * 10) {
            lastUpdate = progress;
            int hashes = (int) (((double)progress / (double)size) * 40);
            if (hashes > 40) hashes = 40;
            String bar = StringUtils.repeat("#", hashes);
            bar = StringUtils.rightPad(bar, 40);
            System.out.format("%s [%s] %.2fMB/%.2fMB\r",
                    name, bar, progress / B_IN_MB, size / B_IN_MB);
            System.out.flush();
        }*/
        System.out.println("name " + name + " progress " + progress + " lastUpdate " + lastUpdate + " " + "sie " + size);
        return lastUpdate;
    }
}
But this does not work properly. It immediately prints progress all the way up to the file size:
name test progress 4096 lastUpdate 0 sie 30489
name test progress 8192 lastUpdate 0 sie 30489
name test progress 12288 lastUpdate 0 sie 30489
name test progress 16384 lastUpdate 0 sie 30489
name test progress 20480 lastUpdate 0 sie 30489
name test progress 24576 lastUpdate 0 sie 30489
name test progress 28672 lastUpdate 0 sie 30489
name test progress 30489 lastUpdate 0 sie 30489
name test progress 30489 lastUpdate 0 sie 30489
The actual upload takes much longer: more than 10 times as long as it takes to print these lines.
What should I do to get the true upload status?
I found the answer to my question. The best way to get the true progress status is to use the code below:
ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentType(mpf.getContentType());
// Set the content length so the SDK does not have to buffer the whole stream in memory.
metadata.setContentLength(mpf.getSize());
String key = Util.getLoginUserName() + "/" + mpf.getOriginalFilename();
PutObjectRequest putObjectRequest = new PutObjectRequest(
        Constants.S3_BUCKET_NAME, key, mpf.getInputStream(), metadata)
        .withStorageClass(StorageClass.ReducedRedundancy);
putObjectRequest.setProgressListener(new ProgressListener() {
    @Override
    public void progressChanged(ProgressEvent progressEvent) {
        System.out.println(progressEvent.getBytesTransfered()
                + ">> Number of bytes transferred " + new Date());
        // Running total of bytes transferred so far, kept in the session.
        Double stored = (Double) request.getSession().getAttribute(Constants.TOTAL_BYTE_READ);
        double totalByteRead = (stored != null) ? stored : 0;
        totalByteRead += progressEvent.getBytesTransfered();
        request.getSession().setAttribute(Constants.TOTAL_BYTE_READ, totalByteRead);
        System.out.println("total Byte read " + totalByteRead);
        // size is the total number of bytes to upload; its declaration is not shown in this snippet.
        request.getSession().setAttribute(Constants.TOTAL_PROGRESS, (totalByteRead / size) * 100);
        System.out.println("percentage completed >>>" + (totalByteRead / size) * 100);
        if (progressEvent.getEventCode() == ProgressEvent.COMPLETED_EVENT_CODE) {
            System.out.println("completed ******");
        }
    }
});
s3Client.putObject(putObjectRequest);
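The bookkeeping inside the listener above can be separated from the servlet session and the SDK, which makes it easier to reason about. The following is only a sketch: UploadProgressTracker and its methods are hypothetical names, not part of the AWS SDK, and it assumes each progress event carries the bytes transferred since the previous event, as the accumulation in the listener implies.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: turns per-event byte counts into a running percentage. */
class UploadProgressTracker {
    private final long totalBytes;
    // AtomicLong because progress callbacks may arrive on a different thread.
    private final AtomicLong transferred = new AtomicLong();

    UploadProgressTracker(long totalBytes) {
        if (totalBytes <= 0) {
            throw new IllegalArgumentException("totalBytes must be positive");
        }
        this.totalBytes = totalBytes;
    }

    /** Adds the bytes reported by one event and returns the new percentage (capped at 100). */
    double add(long bytesInThisEvent) {
        long soFar = transferred.addAndGet(bytesInThisEvent);
        return Math.min(100.0, soFar * 100.0 / totalBytes);
    }

    /** Current percentage without recording anything. */
    double percent() {
        return Math.min(100.0, transferred.get() * 100.0 / totalBytes);
    }
}

class UploadProgressTrackerDemo {
    public static void main(String[] args) {
        // 30489 is the file size from the question; the chunk sizes are made up.
        UploadProgressTracker tracker = new UploadProgressTracker(30489);
        for (long chunk : new long[] {4096, 4096, 4096}) {
            System.out.printf("%.1f%%%n", tracker.add(chunk));
        }
    }
}
```

Inside progressChanged, one would call tracker.add(progressEvent.getBytesTransfered()) and store the returned value in the session.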
The problem with my previous code was that I was not setting the content length in the metadata, so I was not getting the true progress status. The following is copied from the PutObjectRequest class API documentation:
Constructs a new PutObjectRequest object to upload a stream of data to the specified bucket and key. After constructing the request, users may optionally specify object metadata or a canned ACL as well.
Content length for the data stream must be specified in the object metadata parameter; Amazon S3 requires it be passed in before the data is uploaded. Failure to specify a content length will cause the entire contents of the input stream to be buffered locally in memory so that the content length can be calculated, which can result in negative performance problems.
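The effect described in that documentation can be reproduced without S3 at all. The sketch below uses only the standard library: CountingInputStream and bufferFully are hypothetical stand-ins, where bufferFully imitates a client that must drain the whole stream into memory to learn its length. It is not the SDK's actual code, only an illustration of why read-side progress can hit 100% before anything is sent.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Counts bytes as they are read from the wrapped stream. */
class CountingInputStream extends FilterInputStream {
    private long count;

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) count++;
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) count += n;
        return n;
    }

    long getCount() {
        return count;
    }
}

class BufferingDemo {
    /** Stand-in for a client that has no content length: drains the stream into memory. */
    static byte[] bufferFully(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try {
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] payload = new byte[30489]; // same size as the file in the question
        CountingInputStream counted = new CountingInputStream(new ByteArrayInputStream(payload));
        byte[] buffered = bufferFully(counted);
        // Every byte has been "read" here, yet nothing has gone over the network.
        System.out.println("read " + counted.getCount() + " bytes, buffered "
                + buffered.length + " bytes, sent 0 bytes");
    }
}
```

A counter on the input stream therefore measures how fast the client consumes the data, not how fast the data reaches the server.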
I'm going to assume you are using the AWS SDK for Java.
Your code is working as it should: it shows read being called, with 4K read each time. Your idea (updated in the message) is also correct: the AWS SDK provides ProgressListener as a way to inform the application of upload progress.
The "problem" is in the implementation of the AWS SDK: it buffers more than the ~30K size of your file (I'm going to assume the buffer is 64K), so you're not getting any progress reports.
Try uploading a bigger file (say 1M) and you'll see that both methods give you better results. After all, with today's network speeds, reporting progress on a 30K file is not even worth it.
If you want better control, you could implement the upload yourself using the S3 REST interface (which is what the AWS Java SDK ultimately uses). It is not very difficult, but it is a bit of work. If you go this route, I recommend finding an example for computing the session authorization token instead of doing it yourself (sorry, my search-fu is not strong enough for a link to actual sample code right now). However, once you go to all that trouble, you'll find that you actually want a 64K buffer on the socket stream to ensure maximum throughput on a fast network (which is probably why the AWS Java SDK behaves as it does).
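If you do write the upload yourself, the progress-reporting part could look roughly like the sketch below. It covers only the copy loop with a 64K buffer. ProgressCopier is a hypothetical name, request signing and the HTTP connection are left out entirely, and the ByteArrayOutputStream in main merely stands in for the socket stream. Note that progress is reported once write returns, which on a real socket means the bytes were handed to the OS, not that S3 has acknowledged them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.function.LongConsumer;

class ProgressCopier {
    static final int BUFFER_SIZE = 64 * 1024;

    /**
     * Copies everything from in to out in chunks of up to 64K and calls
     * onProgress with the running total after each chunk has been written.
     */
    static long copy(InputStream in, OutputStream out, LongConsumer onProgress) {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
                onProgress.accept(total);
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] payload = new byte[200_000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for the socket stream
        copy(new ByteArrayInputStream(payload), sink,
                sent -> System.out.println("sent " + sent + " of " + payload.length));
    }
}
```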