
Can I directly stream from HttpResponseMessage to file without going through memory?

My program uses HttpClient to send a GET request to a Web API, and this returns a file.

I currently use this (simplified) code to save the file to disk:

public async Task<bool> DownloadFile()
{
    var client = new HttpClient();
    var uri = new Uri("http://somedomain.com/path");
    var response = await client.GetAsync(uri);

    if (response.IsSuccessStatusCode)
    {
        var fileName = response.Content.Headers.ContentDisposition.FileName;
        using (var fs = new FileStream(@"C:\test\" + fileName, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            await response.Content.CopyToAsync(fs);
            return true;
        }
    }

    return false;
}

When this code runs, the process loads the entire file into memory. I would have expected the content to be streamed from HttpResponseMessage.Content to the FileStream, so that only a small portion of it is held in memory at any time.

We plan to use this with large files (> 1 GB), so is there a way to achieve this without holding the whole file in memory?

Ideally, this would work without manually looping, that is, without reading a portion into a byte[] and writing that portion to the file stream until all of the content is written.

Sebastian P.R. Gingter asked Aug 19 '16 07:08





1 Answer

It looks like this is by design. If you check the documentation for HttpClient.GetAsync(), you'll see it says:

The returned task object will complete after the whole response (including content) is read

You can instead use HttpClient.GetStreamAsync(), whose documentation specifically states:

This method does not buffer the stream.
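
A minimal sketch of that call, assuming the caller already knows where the file should go (the class name, method names, and targetPath parameter are made up for illustration):

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class StreamingDownload
{
    // One shared client for the whole process (an assumption of this sketch).
    private static readonly HttpClient Client = new HttpClient();

    // Copies any source stream to a file in fixed-size chunks (81920 bytes here),
    // so memory use does not grow with the size of the download.
    public static async Task SaveStreamAsync(Stream source, string targetPath)
    {
        using (var fs = new FileStream(targetPath, FileMode.Create, FileAccess.Write,
                                       FileShare.None, bufferSize: 81920, useAsync: true))
        {
            await source.CopyToAsync(fs, 81920);
        }
    }

    // targetPath is supplied by the caller because GetStreamAsync
    // returns only the body stream, not the response headers.
    public static async Task DownloadAsync(Uri uri, string targetPath)
    {
        using (var body = await Client.GetStreamAsync(uri))
        {
            await SaveStreamAsync(body, targetPath);
        }
    }
}
```

Calling it would look something like await StreamingDownload.DownloadAsync(new Uri("http://somedomain.com/path"), @"C:\test\download.dat"), where the file name is picked by hand.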

However, as far as I can see, you then don't get access to the headers in the response. Since that's presumably a requirement (you're getting the file name from the headers), you may want to use HttpWebRequest instead, which lets you get the response details (headers etc.) without reading the whole response into memory. Something like:

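using System;
using System.IO;
using System.Net;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public async Task<bool> DownloadFile()
{
    var uri = new Uri("http://somedomain.com/path");
    var request = WebRequest.CreateHttp(uri);

    // Dispose the response (and its connection) once the copy has finished.
    using (var response = await request.GetResponseAsync())
    {
        // Fall back to a default name if the header is missing or has no file name.
        // FileName can still carry its surrounding quotes, so trim them.
        ContentDispositionHeaderValue contentDisposition;
        var fileName = ContentDispositionHeaderValue.TryParse(response.Headers["Content-Disposition"], out contentDisposition)
                && !string.IsNullOrEmpty(contentDisposition.FileName)
            ? contentDisposition.FileName.Trim('"')
            : "noname.dat";

        using (var responseStream = response.GetResponseStream())
        using (var fs = new FileStream(@"C:\test\" + fileName, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            // Copies chunk by chunk, so only a small buffer is held in memory.
            await responseStream.CopyToAsync(fs);
        }
    }

    return true;
}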

Note that if the request returns an unsuccessful response code, an exception (a WebException) will be thrown, so you may wish to wrap the call in a try/catch and return false in that case, as in your original example.
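
For comparison, here is a sketch that stays with HttpClient and returns false on failure like the original example. It relies on the GetAsync overload that takes HttpCompletionOption.ResponseHeadersRead, which completes as soon as the headers have been read, so the headers stay available while the body is copied as a stream. The class name, the ResolveFileName helper, the targetDirectory parameter, and the "noname.dat" fallback are assumptions made for illustration, not part of any library.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class HeaderAwareDownload
{
    private static readonly HttpClient Client = new HttpClient();

    // Hypothetical helper: derives a plain file name from Content-Disposition.
    public static string ResolveFileName(ContentDispositionHeaderValue disposition, string fallback)
    {
        // FileName may be null and may still carry its surrounding quotes.
        var raw = disposition?.FileName?.Trim('"');
        // Path.GetFileName drops any directory part the server may have sent.
        var name = string.IsNullOrWhiteSpace(raw) ? null : Path.GetFileName(raw);
        return string.IsNullOrEmpty(name) ? fallback : name;
    }

    public static async Task<bool> DownloadFileAsync(Uri uri, string targetDirectory)
    {
        try
        {
            // ResponseHeadersRead: the task completes once the headers have arrived;
            // the body has not been buffered at that point.
            using (var response = await Client.GetAsync(uri, HttpCompletionOption.ResponseHeadersRead))
            {
                if (!response.IsSuccessStatusCode) return false;

                var fileName = ResolveFileName(response.Content.Headers.ContentDisposition, "noname.dat");
                var path = Path.Combine(targetDirectory, fileName);

                using (var body = await response.Content.ReadAsStreamAsync())
                using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None))
                {
                    // Copies chunk by chunk from the network stream to the file.
                    await body.CopyToAsync(fs);
                }
                return true;
            }
        }
        catch (HttpRequestException)
        {
            // The request itself failed (DNS, refused connection, and so on).
            return false;
        }
    }
}
```

Unlike HttpWebRequest, HttpClient does not throw for an unsuccessful status code here, which is why the sketch checks IsSuccessStatusCode and catches only HttpRequestException for failures of the request itself.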

Iridium answered Sep 19 '22 09:09
