
High memory usage in .NET Framework 4 but not in .NET Framework 4.5

I have the following piece of code (.NET 4) that consumes a lot of memory:

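using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

struct Data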
{
    private readonly List<Dictionary<string,string>> _list;

    public Data(List<Dictionary<string,string>> List)
    {
        _list = List;
    }

    public void DoWork()
    {
        int num = 0;
        foreach (Dictionary<string, string> d in _list)
        {
            foreach (KeyValuePair<string, string> kvp in d)
                num += Convert.ToInt32(kvp.Value);
        }

        Console.Write(num);

        //_list = null;   // note: enabling this line also requires removing 'readonly' from _list
    }
}

class Test1
{
    BlockingCollection<Data> collection = new BlockingCollection<Data>(10);
    Thread th;

    public Test1()
    {
        th = new Thread(Work);
        th.Start();
    }

    public void Read()
    {
        List<Dictionary<string, string>> l = new List<Dictionary<string, string>>();
        Random r = new Random();

        for (int i=0; i<100000; i++)
        {
            Dictionary<string, string> d = new Dictionary<string,string>();
            d["1"]  = r.Next().ToString();
            d["2"]  = r.Next().ToString();
            d["3"]  = r.Next().ToString();
            d["4"]  = r.Next().ToString();

            l.Add(d);
        }

        collection.Add(new Data(l));
    }

    private void Work()
    {
        while (true)
        {
            collection.Take().DoWork();
        }
    }
}

class Program
{
    Test1 t = new Test1();
    static void Main(string[] args)
    {
        Program p = new Program();
        for (int i = 0; i < 1000; i++)
        {
            p.t.Read();
        }
    }
}

The bounded capacity of the blocking collection is 10. As far as I know, the GC should collect the objects referenced by a 'Data' struct as soon as its DoWork method completes. However, memory keeps increasing rapidly until the program crashes or memory comes back down on its own. This happens more often on low-end machines (on some machines memory does not increase at all). Furthermore, when I add the line "_list = null;" at the end of the DoWork method and convert 'Data' from a struct into a class, memory does not increase.

What could be happening here? I would appreciate some suggestions.

Update: the issue occurs on machines with .NET Framework 4 installed (4.5 not installed).

Umer Azaz asked Jan 16 '13 06:01



2 Answers

I tried this on my computer; here are the results:

  1. With Data as class and without _list = null at the end of DoWork -> memory increases
  2. With Data as struct and without _list = null at the end of DoWork -> memory increases
  3. With Data as class and with _list = null at the end of DoWork -> memory stabilizes at 150MB
  4. With Data as struct and with _list = null at the end of DoWork -> memory increases

In the cases where _list = null is commented out, this result is not a surprise, because there is still a reference to the list. Even if DoWork is never called again, the GC cannot know that.

In the third case, the garbage collector behaves as we expect.

For the fourth case, the BlockingCollection stores the Data when you pass it as an argument in collection.Add(new Data(l));, but what happens then?

  1. A new Data struct is created with data._list equal to l (i.e. since List is a class (reference type), data._list holds the address of l).
  2. You then pass it as an argument in collection.Add(new Data(l));, which creates a copy of the struct created in step 1, so the address of l is copied as well.
  3. The blocking collection stores your Data elements in an array.
  4. When DoWork executes _list = null, it removes the reference to the problematic List only in the current struct, not in the copies that are stored in the BlockingCollection.
  5. So the problem remains unless you clear the BlockingCollection.
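
The copy behaviour in steps 2 to 4 can be reproduced in isolation. The following is a minimal sketch, not the question's code: the Data struct here is a simplified, mutable variant (List<int> instead of the dictionaries, no readonly), and a plain one-element array stands in for the array the collection uses internally. HoldsList, Clear and StructCopyDemo are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Simplified, mutable variant of the question's struct (illustration only).
struct Data
{
    private List<int> _list;   // not readonly, so Clear() may assign to it

    public Data(List<int> list) { _list = list; }

    public bool HoldsList { get { return _list != null; } }

    public void Clear() { _list = null; }
}

static class StructCopyDemo
{
    // Returns whether the array slot still references the list
    // after a copy of the struct has been cleared.
    public static bool SlotStillHoldsListAfterClearingCopy()
    {
        Data[] slots = new Data[1];   // stands in for the collection's internal array
        slots[0] = new Data(new List<int> { 1, 2, 3 });

        Data copy = slots[0];         // a consumer receives a copy of the struct
        copy.Clear();                 // nulls the field in the copy only

        return slots[0].HoldsList;    // the slot's own field is untouched
    }

    static void Main()
    {
        Console.WriteLine(SlotStillHoldsListAfterClearingCopy()); // prints "True"
    }
}
```

The array slot keeps its own reference to the list, so the list stays reachable for as long as the array itself is reachable.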

How to find the problem?

To find memory leak problems, I suggest using SOS ( http://msdn.microsoft.com/en-us/library/bb190764.aspx ).

Here is how I found the issue. Since the issue involves not only the heap but also the stack, heap analysis alone (as here) is not the best way to find the source of the problem.

1 Put a breakpoint on _list = null (because this line should work !!!)

2 Execute the program

3 When the breakpoint is reached, load the SOS Debugging Tool (Write ".load sos" in the Immediate Window)

4 The problem seems to come from the private List<Dictionary<string,string>> _list field, which is not released correctly. So we'll try to find the instances of that type. Type !DumpHeap -stat -type List in the Immediate Window. Result:

total 0 objects
Statistics:
      MT    Count    TotalSize Class Name
0570ffdc        1           24 System.Collections.Generic.List`1[[System.Threading.CancellationTokenRegistration, mscorlib]]
04f63e50        1           24 System.Collections.Generic.List`1[[System.Security.Policy.StrongName, mscorlib]]
00202800        2           48 System.Collections.Generic.List`1[[System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[System.String, mscorlib]], mscorlib]]
Total 4 objects

The problematic type is the last one, List<Dictionary<...>>. There are 2 instances, and its MethodTable (a kind of identifier for the type) is 00202800.

5 To get the instances, type !DumpHeap -mt 00202800. Result:

 Address       MT     Size
02618a9c 00202800       24     
0733880c 00202800       24     
total 0 objects
Statistics:
      MT    Count    TotalSize Class Name
00202800        2           48 System.Collections.Generic.List`1[[System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[System.String, mscorlib]], mscorlib]]
Total 2 objects

The two instances are shown, with their addresses: 02618a9c and 0733880c

6 To find how they are referenced, type !GCRoot 02618a9c (for the first instance) or !GCRoot 0733880c (for the second). Result (I have not copied the whole output, only the important part):

ESP:3bef9c:Root:  0261874c(ConsoleApplication1.Test1)->
  0261875c(System.Collections.Concurrent.BlockingCollection`1[[ConsoleApplication1.Data, ConsoleApplication1]])->
  02618784(System.Collections.Concurrent.ConcurrentQueue`1[[ConsoleApplication1.Data, ConsoleApplication1]])->
  02618798(System.Collections.Concurrent.ConcurrentQueue`1+Segment[[ConsoleApplication1.Data, ConsoleApplication1]])->
  026187bc(ConsoleApplication1.Data[])->
  02618a9c(System.Collections.Generic.List`1[[System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[System.String, mscorlib]], mscorlib]])

for the first instance, and:

Scan Thread 5216 OSTHread 1460
ESP:3bf0b0:Root:  0733880c(System.Collections.Generic.List`1[[System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[System.String, mscorlib]], mscorlib]])
Scan Thread 4960 OSTHread 1360
Scan Thread 6044 OSTHread 179c

for the second one (when the analyzed object has no deeper root, I think it means it is referenced directly from the stack).

Looking at 026187bc(ConsoleApplication1.Data[]) should be a good way to understand what happens, because there we finally see our Data type.

7 To display the content of an object, use !DumpObj 026187bc, or in this case, as it is an array, use !DumpArray -details 026187bc. Result (partial):

Name:        ConsoleApplication1.Data[]
MethodTable: 00214f30
EEClass:     00214ea8
Size:        140(0x8c) bytes
Array:       Rank 1, Number of elements 32, Type VALUETYPE
Element Methodtable: 00214670
[0] 026187c4
    Name:        ConsoleApplication1.Data
    MethodTable: 00214670
    EEClass:     00211ac4
    Size:        12(0xc) bytes
    File:        D:\Development Projects\Centive Solutions\SVN\trunk\CentiveSolutions.Renderers\ConsoleApplication1\bin\Debug\ConsoleApplication1.exe
    Fields:
              MT    Field   Offset                 Type VT     Attr    Value Name
        00202800  4000001        0     ...lib]], mscorlib]]      0     instance     02618a9c     _list
[1] 026187c8
    Name:        ConsoleApplication1.Data
    MethodTable: 00214670
    EEClass:     00211ac4
    Size:        12(0xc) bytes
    File:        D:\Development Projects\Centive Solutions\SVN\trunk\CentiveSolutions.Renderers\ConsoleApplication1\bin\Debug\ConsoleApplication1.exe
    Fields:
              MT    Field   Offset                 Type VT     Attr    Value Name
        00202800  4000001        0     ...lib]], mscorlib]]      0     instance     6d50950800000000     _list
[2] 026187cc
    Name:        ConsoleApplication1.Data
    MethodTable: 00214670
    EEClass:     00211ac4
    Size:        12(0xc) bytes
    File:        D:\Development Projects\Centive Solutions\SVN\trunk\CentiveSolutions.Renderers\ConsoleApplication1\bin\Debug\ConsoleApplication1.exe
    Fields:
              MT    Field   Offset                 Type VT     Attr    Value Name
        00202800  4000001        0     ...lib]], mscorlib]]      0     instance     6d50950800000000     _list

Here we have the value of the _list field for the first 3 elements of the array: 02618a9c, 6d50950800000000, 6d50950800000000. I suspect 6d50950800000000 to be the "null pointer".

Here we have the answer to your question: there is an array (referenced by the blocking collection, see step 6) that directly contains the address of the _list we want the garbage collector to collect.

8 To be sure it does not change when the line _list = null is executed, execute that line.

Note

As I've mentioned, DumpHeap is not well suited to the current task, which involves value types. Why? Because value types are not allocated as separate heap objects: they are stored inline in whatever contains them (the stack for locals, or here the Data[] array). Seeing this is very simple: try !DumpHeap -stat -type ConsoleApplication1.Data on the breakpoint. Result:

total 0 objects
Statistics:
      MT    Count    TotalSize Class Name
00214c00        1           20 System.Collections.Concurrent.ConcurrentQueue`1[[ConsoleApplication1.Data, ConsoleApplication1]]
00214e24        1           36 System.Collections.Concurrent.ConcurrentQueue`1+Segment[[ConsoleApplication1.Data, ConsoleApplication1]]
00214920        1           40 System.Collections.Concurrent.BlockingCollection`1[[ConsoleApplication1.Data, ConsoleApplication1]]
00214f30        1          140 ConsoleApplication1.Data[]
Total 4 objects

There is an array of Data but no Data, because DumpHeap only lists heap objects. Then run !DumpArray -details 026187bc again: the pointer is still there with the same value. And if you compare the roots of the two instances we found before (with !GCRoot) before and after executing the line, only one line will have been removed. Indeed, the reference to the list has only been removed from one copy of the value type Data.

Cédric Bignon answered Oct 20 '22 21:10



If you read Stephen Toub's explanation of how ConcurrentQueue works, the behavior makes sense. BlockingCollection uses ConcurrentQueue by default, which stores its elements in a linked list of 32-element segments.

For the purposes of concurrent access, elements in the linked list are never overwritten, so they don't get unreferenced until the last of a whole segment of 32 is consumed. Since you have a bounded capacity of 10 elements, let's say that you have produced 41 elements and consumed 31. That means you will have one segment of 31 consumed elements plus one queued element, and another segment with the remaining 9 elements. At this point all 41 elements are referenced, so if each element is 25MB, your collection will be taking up 1GB! Once the next item is consumed, all 32 of the elements in the head segment will be unreferenced and can be collected.
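
The arithmetic above can be checked with a small model. This is only a sketch of the retention rule as described in this answer (a segment is released once all 32 of its slots have been consumed); it is not the actual ConcurrentQueue implementation, and SegmentRetention and ReferencedElements are hypothetical names.

```csharp
using System;

static class SegmentRetention
{
    const int SegmentSize = 32; // segment length described in the text

    // Number of elements still reachable under the simplified model:
    // a segment is released only after all of its slots have been consumed.
    public static int ReferencedElements(int produced, int consumed)
    {
        int fullyConsumedSegments = consumed / SegmentSize;
        return produced - fullyConsumedSegments * SegmentSize;
    }

    static void Main()
    {
        Console.WriteLine(ReferencedElements(41, 31));              // prints "41"
        Console.WriteLine(ReferencedElements(41, 32));              // prints "9"
        Console.WriteLine(ReferencedElements(41, 31) * 25 + " MB"); // prints "1025 MB"
    }
}
```

With a bounded capacity of 10, the model gives a worst case of 41 retained elements rather than 10, which matches the roughly 1GB figure for 25MB elements.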

You may think there should only ever need to be 10 elements in the queue, and that would be the case for a non-concurrent queue, but that would not allow one thread to enumerate the elements in the queue while another thread was producing or consuming elements.

The reason that the .NET 4.5 framework doesn't leak is that they changed the behavior to null out elements as soon as they're consumed, as long as there is nobody enumerating the queue. If you start enumerating the collection, you should see the memory leak even with the .NET 4.5 framework.

The reason that setting _list = null works when you have a class is that you are creating a "box" wrapper that allows you to unreference the list in every place that it's used. Setting the value in your local variable changes the same copy that the queue has a reference to.

The reason that setting _list = null doesn't work when you have a struct is that you can only ever change copies of a struct. The "original" version of it sitting in that queue segment is effectively immutable because ConcurrentQueue doesn't provide a way to change it. In other words, you're changing only the copy of the value in your local variable rather than changing the copy in the queue.
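
The class case can be observed directly. Below is a minimal sketch under a few assumptions: Data is simplified to hold a List<int>, HoldsList and ClassHolderDemo are hypothetical names, and TryPeek is used instead of a dequeue only because it lets us look at the element the queue still stores.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Class version of Data: the queue and the consumer share one object.
class Data
{
    private List<int> _list;

    public Data(List<int> list) { _list = list; }

    public bool HoldsList { get { return _list != null; } }

    public void DoWork()
    {
        // ... work with _list would go here ...
        _list = null; // drops the list for every holder of this Data object
    }
}

static class ClassHolderDemo
{
    // Returns whether the element stored in the queue still references
    // the list after DoWork ran on the reference obtained from the queue.
    public static bool QueuedItemHoldsListAfterDoWork()
    {
        var queue = new ConcurrentQueue<Data>();
        queue.Enqueue(new Data(new List<int> { 1, 2, 3 }));

        Data peeked;
        queue.TryPeek(out peeked);   // the same object the queue stores, not a copy
        peeked.DoWork();

        Data again;
        queue.TryPeek(out again);
        return again.HoldsList;
    }

    static void Main()
    {
        Console.WriteLine(QueuedItemHoldsListAfterDoWork()); // prints "False"
    }
}
```

Because the queue slot holds a reference to the same Data object, the slot may keep the small Data object alive, but the large list behind it is no longer reachable through it.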

Gabe answered Oct 20 '22 22:10
