 

Put GC on hold during a section of code

Is there a way to put the GC on hold completely for a section of code? The only thing I've found in other similar questions is GC.TryStartNoGCRegion, but it is limited to the amount of memory you specify, which itself is limited to the size of an ephemeral segment.

Is there a way to bypass that completely and tell .NET "allocate whatever you need, don't do GC, period", or to increase the size of the segments? From what I found, the segment size is at most 1GB on a many-core server, which is far less than what I need to allocate while keeping the GC out of the way (I have up to terabytes of free RAM, and there are thousands of GC spikes during that section; I'd be more than happy to trade those for 10 or even 100 times the RAM usage).

Edit:

Now that there's a bounty, I think it's easier if I specify the use case. I'm loading and parsing a very large XML file (1GB for now, 12GB soon) into objects in memory using LINQ to XML, and I'm not looking for an alternative to that. I'm creating millions of small objects from millions of XElements, and the GC tries to collect non-stop while I'd be very happy keeping all that RAM used up. I have hundreds of GBs of RAM, yet as soon as usage hits 4GB the GC starts collecting non-stop, which is very memory friendly but performance unfriendly. I don't care about memory, but I do care about performance, and I want to take the opposite trade-off.
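As an aside (my addition, not the asker's code): the kind of non-stop collecting described here can be quantified by snapshotting GC.CollectionCount around the parsing section, roughly like this:

// Requires: using System;
// Snapshot collection counts per generation before and after the parsing section.
int gen0Before = GC.CollectionCount(0);
int gen1Before = GC.CollectionCount(1);
int gen2Before = GC.CollectionCount(2);

// ... run the LINQ to XML parsing shown below ...

Console.WriteLine($"Gen0: {GC.CollectionCount(0) - gen0Before}, " +
                  $"Gen1: {GC.CollectionCount(1) - gen1Before}, " +
                  $"Gen2: {GC.CollectionCount(2) - gen2Before}");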

While I can't post the actual code, here is some sample code that is very close to the real code and may help those who asked for more information:

var items = XElement.Load("myfile.xml")
.Element("a")
.Elements("b") // There are about 2 to 5 million instances of "b"
.Select(pt => new
{
    aa = pt.Element("aa"),
    ab = pt.Element("ab"),
    ac = pt.Element("ac"),
    ad = pt.Element("ad"),
    ae = pt.Element("ae")
})
.Select(pt => new 
{
    aa = new
    {
        aaa = double.Parse(pt.aa.Attribute("aaa").Value),
        aab = double.Parse(pt.aa.Attribute("aab").Value),
        aac = double.Parse(pt.aa.Attribute("aac").Value),
        aad = double.Parse(pt.aa.Attribute("aad").Value),
        aae = double.Parse(pt.aa.Attribute("aae").Value)
    },
    ab = new
    {
        aba = double.Parse(pt.ab.Attribute("aba").Value),
        abb = double.Parse(pt.ab.Attribute("abb").Value),
        abc = double.Parse(pt.ab.Attribute("abc").Value),
        abd = double.Parse(pt.ab.Attribute("abd").Value),
        abe = double.Parse(pt.ab.Attribute("abe").Value)
    },
    ac = new
    {
        aca = double.Parse(pt.ac.Attribute("aca").Value),
        acb = double.Parse(pt.ac.Attribute("acb").Value),
        acc = double.Parse(pt.ac.Attribute("acc").Value),
        acd = double.Parse(pt.ac.Attribute("acd").Value),
        ace = double.Parse(pt.ac.Attribute("ace").Value),
        acf = double.Parse(pt.ac.Attribute("acf").Value),
        acg = double.Parse(pt.ac.Attribute("acg").Value),
        ach = double.Parse(pt.ac.Attribute("ach").Value)
    },
    ad1 = int.Parse(pt.ad.Attribute("ad1").Value),
    ad2 = int.Parse(pt.ad.Attribute("ad2").Value),
    ae = new double[]
    {
        double.Parse(pt.ae.Attribute("ae1").Value),
        double.Parse(pt.ae.Attribute("ae2").Value),
        double.Parse(pt.ae.Attribute("ae3").Value),
        double.Parse(pt.ae.Attribute("ae4").Value),
        double.Parse(pt.ae.Attribute("ae5").Value),
        double.Parse(pt.ae.Attribute("ae6").Value),
        double.Parse(pt.ae.Attribute("ae7").Value),
        double.Parse(pt.ae.Attribute("ae8").Value),
        double.Parse(pt.ae.Attribute("ae9").Value),
        double.Parse(pt.ae.Attribute("ae10").Value),
        double.Parse(pt.ae.Attribute("ae11").Value),
        double.Parse(pt.ae.Attribute("ae12").Value),
        double.Parse(pt.ae.Attribute("ae13").Value),
        double.Parse(pt.ae.Attribute("ae14").Value),
        double.Parse(pt.ae.Attribute("ae15").Value),
        double.Parse(pt.ae.Attribute("ae16").Value),
        double.Parse(pt.ae.Attribute("ae17").Value),
        double.Parse(pt.ae.Attribute("ae18").Value),
        double.Parse(pt.ae.Attribute("ae19").Value)
    }
})
.ToArray();
asked May 16 '16 by Ronan Thibaudau


3 Answers

I think the best solution in your case would be this piece of code, which I used in one of my projects some time ago:

// Requires: using System.Runtime;
var currentLatencySettings = GCSettings.LatencyMode;
GCSettings.LatencyMode = GCLatencyMode.LowLatency;
try
{
    // your operations
}
finally
{
    // Restore the previous latency mode even if the operations throw.
    GCSettings.LatencyMode = currentLatencySettings;
}

You are suppressing the GC as much as you can (to my knowledge), and you can still call GC.Collect() manually.

See the MSDN article on GC latency modes for details.

Also, I would strongly suggest paging the parsed collection using the LINQ Skip() and Take() methods, and finally joining the output arrays, as sketched below.
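The answer doesn't spell that suggestion out in code, so here is a minimal sketch of one way to read it; the batch size and the reduced set of parsed fields are my assumptions, not something from the original:

// Requires: using System.Linq; using System.Xml.Linq;
// Parse the "b" elements in fixed-size batches with Skip/Take, materializing
// one batch at a time, then join the per-batch arrays into a single result.
const int batchSize = 100000; // arbitrary assumption
var source = XElement.Load("myfile.xml").Element("a").Elements("b").ToList();
var batchCount = (source.Count + batchSize - 1) / batchSize;

var items = Enumerable
    .Range(0, batchCount)
    .SelectMany(i => source
        .Skip(i * batchSize)
        .Take(batchSize)
        .Select(pt => new
        {
            ad1 = int.Parse(pt.Element("ad").Attribute("ad1").Value),
            ad2 = int.Parse(pt.Element("ad").Attribute("ad2").Value)
            // ... remaining fields as in the question ...
        })
        .ToArray()) // each batch is materialized before the next one starts
    .ToArray();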

answered by Peuczynski


Currently the best I could find was switching to server GC (which changed nothing by itself), which has a larger segment size and let me use a much larger number for the no-GC section:

GC.TryStartNoGCRegion(10000000000); // On workstation GC this crashed with a much lower number; on server GC it works

This goes against my expectations (that call requests 10GB, yet from what I could find in the documentation my segment size in the current setup should be 1 to 4GB, so I expected an invalid argument exception).
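As a side note from me rather than the answer: server GC is enabled through configuration, not code, and the no-GC region should be closed once the batch finishes. A rough sketch of how the pieces could fit together, with the configuration shown as comments:

// Requires: using System; using System.Runtime;
// Server GC has to be enabled in configuration, e.g.
//   <gcServer enabled="true"/> under <runtime> in app.config (.NET Framework), or
//   <ServerGarbageCollection>true</ServerGarbageCollection> in the csproj (.NET Core).
if (GC.TryStartNoGCRegion(10000000000)) // the 10 GB budget from the answer
{
    try
    {
        // ... the whole LINQ to XML batch from the question runs here ...
    }
    finally
    {
        // Leave the region afterwards; skip the call if the runtime already
        // exited it (for example because the budget was exceeded).
        if (GCSettings.LatencyMode == GCLatencyMode.NoGCRegion)
            GC.EndNoGCRegion();
    }
}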

With this setup I have what I wanted: the GC is on hold, I have 22GB allocated instead of 7, none of the temporary objects are collected, and the GC runs once (a single time!) over the whole batch process instead of many, many times per second. Before the change, the GC view in Visual Studio looked like a straight line formed by all the individual dots of GC triggering.

This isn't great, as it won't scale (adding a zero leads to a crash), but it's better than anything else I've found so far.

Unless someone finds out how to increase the segment size so I can push this further, or has a better alternative for completely halting the GC (not just a certain generation but all of it), I will accept my own answer in a few days.

answered by Ronan Thibaudau


I am not sure whether it's possible in your case, but have you tried processing your XML file in parallel? If you can break the file into smaller parts, you can spawn multiple processes from within your code, each handling a separate part, and then combine the results. This should improve performance, and since each process gets its own memory space, it also raises the total amount of memory available to you at any one time while processing all the XML files. A rough sketch of an in-process variant follows below.
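The answer proposes separate OS processes; the following is a hedged sketch of a lighter in-process variant using Parallel.ForEach over pre-split chunk files. The file names are hypothetical, and running inside one process does not isolate GC heaps the way separate processes would:

// Requires: using System.Collections.Concurrent; using System.Linq;
//           using System.Threading.Tasks; using System.Xml.Linq;
// Parse pre-split chunk files in parallel and collect the results thread-safely.
var chunkFiles = new[] { "chunk1.xml", "chunk2.xml", "chunk3.xml" }; // hypothetical names
var results = new ConcurrentBag<double[]>();

Parallel.ForEach(chunkFiles, file =>
{
    var parsed = XElement.Load(file)
        .Element("a")
        .Elements("b")
        .Select(pt => new[]
        {
            double.Parse(pt.Element("aa").Attribute("aaa").Value),
            double.Parse(pt.Element("aa").Attribute("aab").Value)
            // ... remaining fields as in the question ...
        })
        .ToArray();

    foreach (var item in parsed)
        results.Add(item);
});

// Combine all per-file results into one array.
var combined = results.ToArray();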

answered by DotNetDev