How do you concatenate huge lists without doubling memory?
Consider the following snippet:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

Console.WriteLine($"Initial memory size: {Process.GetCurrentProcess().WorkingSet64 / 1024 / 1024} MB");
// Two arrays of about 1000 MB each (4 bytes per int).
int[] a = Enumerable.Range(0, 1000 * 1024 * 1024 / 4).ToArray();
int[] b = Enumerable.Range(0, 1000 * 1024 * 1024 / 4).ToArray();
Console.WriteLine($"Memory size after lists initialization: {Process.GetCurrentProcess().WorkingSet64 / 1024 / 1024} MB");
// Copy the second half of each array into a new list.
List<int> concat = new List<int>();
concat.AddRange(a.Skip(500 * 1024 * 1024 / 4));
concat.AddRange(b.Skip(500 * 1024 * 1024 / 4));
Console.WriteLine($"Memory size after lists concatenation: {Process.GetCurrentProcess().WorkingSet64 / 1024 / 1024} MB");
The output is:
Initial memory size: 12 MB
Memory size after lists initialization: 2014 MB
Memory size after lists concatenation: 4039 MB
I would like memory usage to stay at about 2014 MB after concatenation, without modifying a and b.
If you need a List<int>, you can't do this. A List<int> always contains its data directly, so by the time you've got two arrays with (say) 100 elements each, and a list which was created by concatenating those two, you've got 400 independent elements. You can't change that.
What you're looking for is a way of not creating an independent copy of the data. If you're just searching through it (as it sounds like in the comments), you can just use an IEnumerable<int> created with LINQ:
IEnumerable<int> concat = a.Concat(b);
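The code in the question only takes the second half of each array, and the same lazy approach covers that case too. The following is a minimal sketch, not a definitive implementation: the small array sizes and the variable name `half` are assumptions for illustration. Both Skip and Concat are deferred, so no elements are copied into a new collection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Small arrays stand in for the question's large ones (illustrative sizes).
int[] a = Enumerable.Range(0, 10).ToArray();
int[] b = Enumerable.Range(100, 10).ToArray();
int half = 5; // hypothetical: number of leading elements to skip in each array

// Skip and Concat are deferred: this builds a pipeline, not a copy.
IEnumerable<int> concat = a.Skip(half).Concat(b.Skip(half));

// Elements are read from a and b only while enumerating.
Console.WriteLine(concat.Contains(107)); // True
Console.WriteLine(concat.Count());       // 10
```

Each call such as Contains or Count walks the sequence again, so this suits searching and single passes rather than repeated random access.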
If you needed something like an IReadOnlyList<T> or even an IList<T>, you could implement those interfaces to create an adapter over multiple arrays, but you'd probably need to write that yourself. If you can stick with IEnumerable<T>, using LINQ will be a lot simpler.
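If you do need indexed access, an adapter along those lines might look like the sketch below. It is only one possible shape: the class name `ConcatenatedArrayView` is hypothetical, it handles exactly two arrays rather than an arbitrary number, and it assumes the combined length fits in an int.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

int[] a = { 1, 2, 3 };
int[] b = { 4, 5 };
var view = new ConcatenatedArrayView<int>(a, b);
Console.WriteLine(view.Count); // 5
Console.WriteLine(view[3]);    // 4 (first element of b)

// Hypothetical adapter: a read-only view over two arrays, with no element copying.
public sealed class ConcatenatedArrayView<T> : IReadOnlyList<T>
{
    private readonly T[] _first;
    private readonly T[] _second;

    public ConcatenatedArrayView(T[] first, T[] second)
    {
        _first = first ?? throw new ArgumentNullException(nameof(first));
        _second = second ?? throw new ArgumentNullException(nameof(second));
    }

    // Assumes the combined length does not exceed int.MaxValue.
    public int Count => _first.Length + _second.Length;

    public T this[int index]
    {
        get
        {
            if (index < 0 || index >= Count)
                throw new ArgumentOutOfRangeException(nameof(index));
            // Indices past the end of the first array map into the second.
            return index < _first.Length ? _first[index] : _second[index - _first.Length];
        }
    }

    public IEnumerator<T> GetEnumerator()
    {
        foreach (T item in _first) yield return item;
        foreach (T item in _second) yield return item;
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

The view holds only references to the two arrays, so later changes to a or b are visible through it, which is the opposite of the independent copy a List<int> makes.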