Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C# WebClient - Large increase of LOH after downloading files

I have a class responsible for downloading files in a download manager. This class is responsible for downloading the file and writing it to the given path.

The size of the files to download varies normally from 1 to 5 MB but could also be much larger. I'm using an instance of the WebClient class to get the file from the internet.

public class DownloadItem
{
    #region Events
    public delegate void DownloadItemDownloadCompletedEventHandler(object sender, DownloadCompletedEventArgs args);

    public event DownloadItemDownloadCompletedEventHandler DownloadItemDownloadCompleted;

    protected virtual void OnDownloadItemDownloadCompleted(DownloadCompletedEventArgs e)
    {
        DownloadItemDownloadCompleted?.Invoke(this, e);
    }

    public delegate void DownloadItemDownloadProgressChangedEventHandler(object sender, DownloadProgressChangedEventArgs args);

    public event DownloadItemDownloadProgressChangedEventHandler DownloadItemDownloadProgressChanged;

    protected virtual void OnDownloadItemDownloadProgressChanged(DownloadProgressChangedEventArgs e)
    {
        DownloadItemDownloadProgressChanged?.Invoke(this, e);
    }
    #endregion

    #region Fields
    private static readonly Logger Logger = LogManager.GetCurrentClassLogger();
    private WebClient _client;
    #endregion

    #region Properties
    public PlaylistItem Item { get; }
    public string SavePath { get; }
    public bool Overwrite { get; }
    #endregion

    public DownloadItem(PlaylistItem item, string savePath, bool overwrite = false)
    {
        Item = item;
        SavePath = savePath;
        Overwrite = overwrite;
    }

    public void StartDownload()
    {
        if (File.Exists(SavePath) && !Overwrite)
        {
            OnDownloadItemDownloadCompleted(new DownloadCompletedEventArgs(true));
            return;
        }

        OnDownloadItemDownloadProgressChanged(new DownloadProgressChangedEventArgs(1));
        Item.RetreiveDownloadUrl();

        if (string.IsNullOrEmpty(Item.DownloadUrl))
        {
            OnDownloadItemDownloadCompleted(new DownloadCompletedEventArgs(true, new InvalidOperationException("Could not retreive download url")));
            return;
        }

        // GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        using (_client = new WebClient())
        {
            _client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");

            try
            {
                _client.DownloadDataCompleted +=
                    (sender, args) =>
                    {
                        Task.Run(() =>
                        {
                            DownloadCompleted(args);
                        });
                    };
                _client.DownloadProgressChanged += (sender, args) => OnDownloadItemDownloadProgressChanged(new DownloadProgressChangedEventArgs(args.ProgressPercentage));
                _client.DownloadDataAsync(new Uri(Item.DownloadUrl));
            }
            catch (Exception ex)
            {
                Logger.Warn(ex, "Error downloading track {0}", Item.VideoId);
                OnDownloadItemDownloadCompleted(new DownloadCompletedEventArgs(true, ex));
            }
        }
    }

    private void DownloadCompleted(DownloadDataCompletedEventArgs args)
    {
        // _client = null;

        // GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        // GC.Collect(2, GCCollectionMode.Forced);

        if (args.Cancelled)
        {
            OnDownloadItemDownloadCompleted(new DownloadCompletedEventArgs(true, args.Error));
            return;
        }

        try
        {
            File.WriteAllBytes(SavePath, args.Result);

            using (var file = TagLib.File.Create(SavePath))
            {
                file.Save();
            }

            try
            {
                MusicFormatConverter.M4AToMp3(SavePath);
            }
            catch (Exception)
            {
                // ignored
            }

            OnDownloadItemDownloadCompleted(new DownloadCompletedEventArgs(false));
        }
        catch (Exception ex)
        {
            OnDownloadItemDownloadCompleted(new DownloadCompletedEventArgs(true, ex));
            Logger.Error(ex, "Error writing track file for track {0}", Item.VideoId);
        }
    }

    public void StopDownload()
    {
        _client?.CancelAsync();
    }

    public override int GetHashCode()
    {
        return Item.GetHashCode();
    }

    public override bool Equals(object obj)
    {
        var item = obj as DownloadItem;

        return Item.Equals(item?.Item);
    }
}

Every download causes a very large memory increase compared with the file size of the downloaded item. If I download a file with a size of ~3 MB the memory usage is increasing about 8 MB.

As you can see the download is producing much LOH which is not cleared after the download. Even forcing the GC or the setting GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce; is not helping to prevent this memory leak.

Comparing Snapshot 1 and 2 you can see that the amount of memory is produced by byte arrays which might be the download result.

Doing several downloads shows how terrible this memory leak is.

In my opinion this is caused by the WebClient instance in any way. However I can't really determine what exactly is causing this issue. It doesn't even matters if I force the GC. This screen here shows it without forced gc:

What is causing this overheat and how can I fix it? This is a major bug and imagining 100 or more downloads the process would run out of memory.

Edit


As suggested I commented out the section responsible for setting the tags and converting the M4A to an MP3. However the converter is just a call of FFMPEG so it shouldn't be a memory leak:

class MusicFormatConverter
{
    public static void M4AToMp3(string filePath, bool deleteOriginal = true)
    {
        if(string.IsNullOrEmpty(filePath) || !filePath.EndsWith(".m4a"))
            throw new ArgumentException(nameof(filePath));

        var toolPath = Path.Combine("tools", "ffmpeg.exe");

        var convertedFilePath = filePath.Replace(".m4a", ".mp3");
        File.Delete(convertedFilePath);

        var process = new Process
        {
            StartInfo =
            {
                FileName = toolPath,
#if !DEBUG
                WindowStyle = ProcessWindowStyle.Hidden,
#endif
                Arguments = $"-i \"{filePath}\" -acodec libmp3lame -ab 128k \"{convertedFilePath}\""
            }
        };

        process.Start();
        process.WaitForExit();

        if(!File.Exists(convertedFilePath))
            throw new InvalidOperationException("File was not converted successfully!");

        if(deleteOriginal)
            File.Delete(filePath);
    }
}

The DownloadCompleted() method looks now like this:

private void DownloadCompleted(DownloadDataCompletedEventArgs args)
{
    // _client = null;

    // GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
    // GC.Collect(2, GCCollectionMode.Forced);

    if (args.Cancelled)
    {
        OnDownloadItemDownloadCompleted(new DownloadCompletedEventArgs(true, args.Error));
        return;
    }

    try
    {
        File.WriteAllBytes(SavePath, args.Result);

        /*
        using (var file = TagLib.File.Create(SavePath))
        {
            file.Save();
        }

        try
        {
            MusicFormatConverter.M4AToMp3(SavePath);
        }
        catch (Exception)
        {
            // ignore
        }
        */

        OnDownloadItemDownloadCompleted(new DownloadCompletedEventArgs(false));
    }
    catch (Exception ex)
    {
        OnDownloadItemDownloadCompleted(new DownloadCompletedEventArgs(true, ex));
        Logger.Error(ex, "Error writing track file for track {0}", Item.VideoId);
    }
}

The result after downloading 7 items: It seems like this was not the memory leak.

As an addition I'm submitting the DownloadManager class too as it is handling the whole download operation. Maybe this could be the source of the problem.

public class DownloadManager
{
    #region Fields
    private static readonly Logger Logger = LogManager.GetCurrentClassLogger();
    private readonly Queue<DownloadItem> _queue;
    private readonly List<DownloadItem> _activeDownloads;
    private bool _active;
    private Thread _thread;
    #endregion

    #region Construction
    public DownloadManager()
    {
        _queue = new Queue<DownloadItem>();
        _activeDownloads = new List<DownloadItem>();
    }
    #endregion

    #region Methods
    public void AddToQueue(DownloadItem item)
    {
        _queue.Enqueue(item);

        StartManager();
    }

    public void Abort()
    {
        _thread?.Abort();

        _queue.Clear();
        _activeDownloads.Clear();
    }

    private void StartManager()
    {
        if(_active) return;

        _active = true;

        _thread = new Thread(() =>
        {
            try
            {
                while (_queue.Count > 0 && _queue.Peek() != null)
                {
                    DownloadItem();

                    while (_activeDownloads.Count >= Properties.Settings.Default.ParallelDownloads)
                    {
                        Thread.Sleep(10);
                    }
                }

                _active = false;
            }
            catch (ThreadInterruptedException)
            {
                // ignored
            }
        });
        _thread.Start();
    }

    private void DownloadItem()
    {
        if (_activeDownloads.Count >= Properties.Settings.Default.ParallelDownloads) return;

        DownloadItem item;
        try
        {
            item = _queue.Dequeue();
        }
        catch
        {
            return;
        }

        if (item != null)
        {
            item.DownloadItemDownloadCompleted += (sender, args) =>
            {
                if(args.Error != null)
                    Logger.Error(args.Error, "Error downloading track {0}", ((DownloadItem)sender).Item.VideoId);

                _activeDownloads.Remove((DownloadItem) sender);
            };

            _activeDownloads.Add(item);
            Task.Run(() => item.StartDownload());
        }
    }
    #endregion
like image 274
Christian Klemm Avatar asked Oct 12 '16 11:10

Christian Klemm


1 Answers

Finally, after dozens of profilings and memory checking the issue is resolved now.

As @SimonMourier already stated this issue is related to the design of the UploadFile, DownloadData, DownloadString and DownloadFile methods. Looking into the backend of them you can see that all of them are using the private DownloadBits method in the WebClient class with this signature:

private byte[] DownloadBits(WebRequest request, Stream writeStream, CompletionDelegate completionDelegate, AsyncOperation asyncOp)

Regarding the return type it is clear why the behaviour is like I discovered: When using the above mentioned methods the content is saved in a byte array. Therefore it is not recommended to use these methods if the file size is > 85,000 bytes as this would result in filling the LOH until the memory limit is reached. This might not be important if the files are small but with growing size the LOH is also growing by a multiple.

As an addition here my final solution:

public class DownloadItem : DownloadManagerItem
{
    #region Fields

    private static readonly Logger Logger = LogManager.GetCurrentClassLogger();

    private WebClient _webClient;

    #endregion

    #region Properties

    public string SavePath { get; }
    public bool Overwrite { get; }
    public DownloadFormat DownloadFormat { get; }

    #endregion

    public DownloadItem(PlaylistItem item, string savePath, DownloadFormat downloadFormat, bool overwrite = false)
        : base(item)
    {
        SavePath = savePath;
        Overwrite = overwrite;
        DownloadFormat = downloadFormat;
    }

    public override void StartDownload()
    {
        if (File.Exists(SavePath) && !Overwrite)
        {
            OnDownloadItemDownloadCompleted(new DownloadCompletedEventArgs(true));
            return;
        }

        OnDownloadItemDownloadProgressChanged(new DownloadProgressChangedEventArgs(1));
        Item.RetreiveDownloadUrl();

        if (string.IsNullOrEmpty(Item.DownloadUrl))
        {
            OnDownloadItemDownloadCompleted(new DownloadCompletedEventArgs(true,
                new InvalidOperationException("Could not retreive download url")));
            return;
        }

        using (_webClient = new WebClient())
        {
            _webClient.Headers.Add("user-agent",
                "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");

            try
            {
                _webClient.OpenReadCompleted += WebClientOnOpenReadCompleted;

                _webClient.OpenReadAsync(new Uri(Item.DownloadUrl));
            }
            catch (Exception ex)
            {
                Logger.Warn(ex, "Error downloading track {0}", Item.VideoId);
                OnDownloadItemDownloadCompleted(new DownloadCompletedEventArgs(true, ex));
            }
        }
    }

    private void WebClientOnOpenReadCompleted(object sender, OpenReadCompletedEventArgs openReadCompletedEventArgs)
    {
        _webClient.Dispose();

        if (openReadCompletedEventArgs.Cancelled)
        {
            OnDownloadItemDownloadCompleted(new DownloadCompletedEventArgs(true, openReadCompletedEventArgs.Error));
            return;
        }

        if (!Overwrite && File.Exists(SavePath))
            return;

        var totalLength = 0;
        try
        {
            totalLength = int.Parse(((WebClient)sender).ResponseHeaders["Content-Length"]);
        }
        catch (Exception)
        {
            // ignored
        }

        try
        {
            long processed = 0;
            var tmpPath = Path.GetTempFileName();

            using (var stream = openReadCompletedEventArgs.Result)
            using (var fs = File.Create(tmpPath))
            {
                var buffer = new byte[16 * 1024];
                int read;

                while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    fs.Write(buffer, 0, read);

                    processed += read;
                    OnDownloadItemDownloadProgressChanged(new DownloadProgressChangedEventArgs(processed, totalLength));
                }
            }

            File.Move(tmpPath, SavePath);

            OnDownloadItemDownloadCompleted(new DownloadCompletedEventArgs(false));
        }
        catch (Exception ex)
        {
            OnDownloadItemDownloadCompleted(new DownloadCompletedEventArgs(true, ex));
        }
    }

    public override void StopDownload()
    {
        _webClient?.CancelAsync();
    }

    public override void Dispose()
    {
        _webClient?.Dispose();
    }

    public override int GetHashCode()
    {
        return Item.GetHashCode();
    }

    public override bool Equals(object obj)
    {
        var item = obj as DownloadItem;

        return Item.Equals(item?.Item);
    }
}

However thanks for the help!

like image 142
Christian Klemm Avatar answered Oct 27 '22 15:10

Christian Klemm