
C# - Downloading from Google Drive in byte chunks

I'm currently developing for an environment that has poor network connectivity. My application helps to automatically download required Google Drive files for users. It works reasonably well for small files (ranging from 40KB to 2MB), but fails far too often for larger files (9MB). I know these file sizes might seem small, but in terms of my client's network environment, Google Drive API constantly fails with the 9MB file.

I've concluded that I need to download files in smaller byte chunks, but I don't see how I can do that with Google Drive API. I've read this over and over again, and I've tried the following code:

// with the Drive File ID, and the appropriate export MIME type, I create the export request
var request = DriveService.Files.Export(fileId, exportMimeType);

// take the message so I can modify it by hand
var message = request.CreateRequest();
var client = request.Service.HttpClient;

// I change the Range headers of both the client, and message
client.DefaultRequestHeaders.Range =
    message.Headers.Range =
    new System.Net.Http.Headers.RangeHeaderValue(100, 200);
var response = await request.Service.HttpClient.SendAsync(message);

// if status code = 200, copy to local file
if (response.IsSuccessStatusCode)
{
    using (var fileStream = new FileStream(downloadFileName, FileMode.CreateNew, FileAccess.ReadWrite))
    {
        await response.Content.CopyToAsync(fileStream);
    }
}

The resulting local file (from fileStream), however, is still full-length (i.e. a 40KB file for the 40KB Drive file, and a 500 Internal Server Error for the 9MB file). As a side note, I've also experimented with ExportRequest.MediaDownloader.ChunkSize, but from what I observe it only changes how often the ExportRequest.MediaDownloader.ProgressChanged callback is called (i.e. the callback triggers every 256KB if ChunkSize is set to 256 * 1024).

How can I proceed?

matt asked Jul 19 '16 04:07


1 Answer

You seem to be heading in the right direction. Regarding your side note on ChunkSize: the request does report progress in ChunkSize steps, so your observation is accurate.

Looking into the source code for MediaDownloader in the SDK, I found the following (emphasis mine):

The core download logic. We download the media and write it to an output stream ChunkSize bytes at a time, raising the ProgressChanged event after each chunk. The chunking behavior is largely a historical artifact: a previous implementation issued multiple web requests, each for ChunkSize bytes. Now we do everything in one request, but the API and client-visible behavior are retained for compatibility.

Your example code will only download one chunk, from byte 100 to byte 200. Using that approach, you would have to keep track of an index and download each chunk manually, copying each partial download to the file stream:

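// Requires: using System; using System.IO; using System.Net;
//           using System.Net.Http.Headers; using System.Threading.Tasks;
const int KB = 0x400;
int ChunkSize = 256 * KB; // 256KB
public async Task ExportFileAsync(string downloadFileName, string fileId, string exportMimeType) {

    var exportRequest = driveService.Files.Export(fileId, exportMimeType);
    var client = exportRequest.Service.HttpClient;

    // you would need to know the file size
    var size = await GetFileSize(fileId);

    using (var file = new FileStream(downloadFileName, FileMode.CreateNew, FileAccess.ReadWrite)) {

        file.SetLength(size);

        // round up so a final partial chunk is included, without issuing an
        // extra empty request when size is an exact multiple of ChunkSize
        var chunks = (size + ChunkSize - 1) / ChunkSize;
        for (long index = 0; index < chunks; index++) {

            var request = exportRequest.CreateRequest();

            var from = index * ChunkSize;
            // clamp the last chunk to the final byte of the file
            var to = Math.Min(from + ChunkSize - 1, size - 1);

            request.Headers.Range = new RangeHeaderValue(from, to);

            var response = await client.SendAsync(request);

            // only 206 Partial Content means the Range header was honored;
            // a 200 carries the whole body and must not be written at this offset
            if (response.StatusCode != HttpStatusCode.PartialContent) {
                throw new IOException($"Chunk {index} failed with status {(int)response.StatusCode}.");
            }

            using (var stream = await response.Content.ReadAsStreamAsync()) {
                file.Seek(from, SeekOrigin.Begin);
                await stream.CopyToAsync(file);
            }
        }
    }
}

private async Task<long> GetFileSize(string fileId) {
    // assumption: Drive API v3, where size is only returned when requested
    // through Fields, and File.Size is a nullable long
    var getRequest = driveService.Files.Get(fileId);
    getRequest.Fields = "size";
    var file = await getRequest.ExecuteAsync();
    if (file.Size == null) {
        throw new InvalidOperationException("The server did not report a size for this file.");
    }
    return file.Size.Value;
}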
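
The range arithmetic is the part that is easiest to get wrong (an off-by-one on the last chunk, or an extra request when the size is an exact multiple of the chunk size), so it can help to isolate it from the network code. Below is a minimal sketch of that idea; the class and method names (ChunkRanges, Enumerate) are hypothetical and not part of the Google SDK.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkRanges
{
    // Yields inclusive (From, To) byte offsets that together cover
    // bytes 0 .. size-1 in steps of chunkSize.
    public static IEnumerable<(long From, long To)> Enumerate(long size, int chunkSize)
    {
        if (size < 0) throw new ArgumentOutOfRangeException(nameof(size));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        for (long from = 0; from < size; from += chunkSize)
        {
            // Clamp the final range to the last byte of the file.
            long to = Math.Min(from + chunkSize - 1, size - 1);
            yield return (from, to);
        }
    }
}
```

Each yielded pair can be passed directly to new RangeHeaderValue(from, to), since HTTP byte ranges are inclusive on both ends.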

This code makes some assumptions about the Drive API/server:

  • That the server will allow the multiple requests needed to download the file in chunks. I don't know whether requests are throttled.
  • That the server still accepts the Range header as stated in the developer documentation.
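
Both assumptions can be checked at runtime rather than taken on trust. The following is a rough sketch, not something taken from the Google SDK: IsRequestedRange checks that a response really is a 206 for the bytes that were asked for, and RetryAsync re-runs a failed chunk with a growing delay, which may help on a flaky connection or if the server throttles. The names ChunkDownloadHelpers, IsRequestedRange and RetryAsync are hypothetical, and retrying on every exception type is a simplification.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class ChunkDownloadHelpers
{
    // True only when the server answered 206 Partial Content and its
    // Content-Range header matches the requested inclusive byte range.
    public static bool IsRequestedRange(HttpResponseMessage response, long from, long to)
    {
        if (response.StatusCode != HttpStatusCode.PartialContent) return false;
        var contentRange = response.Content?.Headers.ContentRange;
        return contentRange != null && contentRange.From == from && contentRange.To == to;
    }

    // Runs the action up to maxAttempts times, doubling the delay after
    // each failure. The last failure is rethrown to the caller.
    public static async Task<T> RetryAsync<T>(Func<Task<T>> action, int maxAttempts, TimeSpan initialDelay)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        var delay = initialDelay;
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await action();
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                await Task.Delay(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }
}
```

In a chunked download, the per-chunk request would be wrapped in RetryAsync, and a response that fails IsRequestedRange would be treated as an error instead of being written to the file.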
Nkosi answered Sep 21 '22 08:09