Reading stream from AWS S3 involves loading entire file

The short version of my scenario is that I am reading a file stored in Amazon's S3 with the .NET SDK...

GetObjectRequest request = new GetObjectRequest
{
    BucketName = this.m_bucketName,
    Key = GetFileKey(fileIdentifier),
};

IAmazonS3 source = ...
GetObjectResponse response = await source.GetObjectAsync(request);
return response.ResponseStream;

I then pass this stream through to MVC as a File result:

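public async Task<FileResult> Download(...)
{
   // GetAwsStream is async (it awaits GetObjectAsync), so await it before handing the stream to File
   return File(await GetAwsStream(...), ...);
}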

The problem is, apparently S3 is eagerly calculating a checksum of the entire file before returning anything. For large files, this is a significant issue because

  • the web server must download the entire file from AWS S3 before a single byte can start streaming to the client; for large files, it can take minutes before the web server responds to the client
  • it uses a ton of memory on the web server to read the entire stream and calculate the checksum

This entirely defeats the point of a stream. Is there any way to get an actual "stream" from S3?

Mark Sowul asked Sep 11 '25 13:09


1 Answer

You can use the HTTP Range header to download the S3 object in a loop, one byte range at a time, and pass each range on to the client as soon as it arrives. That way the web server doesn't have to wait for the full file to be retrieved before it can start sending something to the client.
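
As a rough illustration of that loop, here is a minimal sketch. It is not tied to the SDK: the names RangedCopy, CopyInRangesAsync and fetchRange are made up for this example, and fetchRange stands in for whatever call returns the bytes from start to end (inclusive) of the object. The chunk size is also an assumption you would tune yourself.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class RangedCopy
{
    // Hypothetical helper: copies an object of totalLength bytes to destination
    // by requesting one inclusive byte range [start, end] at a time.
    // fetchRange is an assumed callback that returns a stream over just that range.
    public static async Task CopyInRangesAsync(
        Func<long, long, Task<Stream>> fetchRange,
        long totalLength,
        Stream destination,
        long chunkSize)
    {
        if (chunkSize <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(chunkSize));
        }

        for (long start = 0; start < totalLength; start += chunkSize)
        {
            // HTTP byte ranges are inclusive, so the last index is start + size - 1.
            long end = Math.Min(start + chunkSize, totalLength) - 1;

            using (Stream chunk = await fetchRange(start, end))
            {
                // Forward this range before asking for the next one.
                await chunk.CopyToAsync(destination);
            }

            // Push the bytes onward so the client sees data after each range.
            await destination.FlushAsync();
        }
    }
}
```

With the AWS SDK for .NET, fetchRange could build a GetObjectRequest with ByteRange = new ByteRange(start, end) and return the response's ResponseStream, and totalLength could come from ContentLength on the result of GetObjectMetadataAsync. Only one chunk is in flight at a time, so memory use on the web server should stay around the chunk size rather than the full file size.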

jzonthemtn answered Sep 13 '25 04:09