I'm trying to download approx. 45,000 image files from an API. Each image file is less than 50 KB. With my code this takes 2-3 hours.
Is there a more efficient way to download them in C#?

private static readonly string baseUrl =
    "http://url.com/Handlers/Image.ashx?imageid={0}&type=image";

internal static void DownloadAllMissingPictures(List<ListObject> ImagesToDownload,
    string imageFolderPath)
{
    Parallel.ForEach(Partitioner.Create(0, ImagesToDownload.Count), range =>
    {
        for (var i = range.Item1; i < range.Item2; i++)
        {
            string ImageID = ImagesToDownload[i].ImageId;
            using (var webClient = new WebClient())
            {
                string url = String.Format(baseUrl, ImageID);
                string file = String.Format(@"{0}\{1}.jpg", imageFolderPath,
                    ImagesToDownload[i].ImageId);
                byte[] data = webClient.DownloadData(url);
                using (MemoryStream mem = new MemoryStream(data))
                {
                    using (var image = Image.FromStream(mem))
                    {
                        image.Save(file, ImageFormat.Jpeg);
                    }
                }
            }
        }
    });
}

I tested some variations of your suggestions. The code by Theodor Zoulias was my favourite.
It works well and is fast, at approx. 1,200 downloads per minute.
This is the final code I'm using now:

private static readonly string _baseUrlPattern = "http://url.com/Handlers/Image.ashx?imageId={0}&type=card";
private static readonly HttpClient _httpClient = new HttpClient();

internal static void DownloadAllMissingPictures(CancellationToken cancellationToken = default)
{
    ServicePointManager.DefaultConnectionLimit = 8;
    var parallelOptions = new ParallelOptions()
    {
        MaxDegreeOfParallelism = 10,
        CancellationToken = cancellationToken,
    };
    Parallel.ForEachAsync(ListWithImagesToDownload, parallelOptions, async (image, ct) =>
    {
        string imageId = image.identifiers.ImageId;
        string url = String.Format(_baseUrlPattern, imageId);
        string filePath = Path.Combine(imageFolderPath, $"{imageId}.jpg");
        using HttpResponseMessage response = await _httpClient.GetAsync(url, ct);
        response.EnsureSuccessStatusCode();
        using FileStream fileStream = File.OpenWrite(filePath);
        await response.Content.CopyToAsync(fileStream);
    }).Wait();
}

The code idea by TomTom is fine, but it stops after one loop, so I can't tell you what impact MaxConnectionsPerServer has on the download speed.
I'm sorry I can't share more experience with you. But as I said, I'm still a beginner with less than one year of programming experience.
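
For anyone who wants to measure the effect of MaxConnectionsPerServer, here is a minimal sketch of one way to set it. It is not TomTom's code, just a variation of the final code above. It assumes .NET 6 or later (required for Parallel.ForEachAsync), where the per-server connection limit of HttpClient is normally set on its handler (SocketsHttpHandler.MaxConnectionsPerServer). The class name ImageDownloader, the helpers BuildUrl and BuildFilePath, and the plain list of image IDs are hypothetical names for illustration. The values 8 and 10 are simply the ones from the post, meant as starting points for measuring.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class ImageDownloader
{
    // URL pattern taken from the final code in the post.
    private const string BaseUrlPattern =
        "http://url.com/Handlers/Image.ashx?imageId={0}&type=card";

    // Assumption: 8 connections per server mirrors the value in the post.
    // Change this number between runs to compare download speed.
    private static readonly HttpClient Client = new HttpClient(
        new SocketsHttpHandler { MaxConnectionsPerServer = 8 });

    // Hypothetical helper: builds the request URL for one image ID.
    public static string BuildUrl(string imageId) =>
        string.Format(BaseUrlPattern, imageId);

    // Hypothetical helper: builds the target path "<folder>/<imageId>.jpg".
    public static string BuildFilePath(string folder, string imageId) =>
        Path.Combine(folder, imageId + ".jpg");

    public static async Task DownloadAllAsync(
        IEnumerable<string> imageIds,
        string imageFolderPath,
        CancellationToken cancellationToken = default)
    {
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = 10,
            CancellationToken = cancellationToken,
        };

        // Awaiting the loop (instead of calling .Wait()) keeps the calling thread free.
        await Parallel.ForEachAsync(imageIds, options, async (imageId, ct) =>
        {
            using HttpResponseMessage response =
                await Client.GetAsync(BuildUrl(imageId), ct);
            response.EnsureSuccessStatusCode();

            // File.Create truncates an existing file, so a shorter re-download
            // cannot leave stale bytes at the end of the file.
            await using FileStream fileStream =
                File.Create(BuildFilePath(imageFolderPath, imageId));
            await response.Content.CopyToAsync(fileStream, ct);
        });
    }
}
```

It could be called from an async method as `await ImageDownloader.DownloadAllAsync(imageIds, imageFolderPath);`. Timing a fixed batch of IDs with different MaxConnectionsPerServer values should show whether the setting matters for this server.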