Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

LINQ Performance with SelectMany, GroupBy and First

Tags:

c#

linq

Today I tried to solve https://projecteuler.net/problem=5 with LINQ.

this works fine and executes in under 2sec on my machine, but is a little verbose:

Enumerable.Range(1, 1000000000)
    .Where(i => i % 2 == 0
        && i % 3 == 0
        && i % 4 == 0
        && i % 5 == 0
        && i % 6 == 0
        && i % 7 == 0
        && i % 8 == 0
        && i % 9 == 0
        && i % 10 == 0
        && i % 11 == 0
        && i % 12 == 0
        && i % 13 == 0
        && i % 14 == 0
        && i % 15 == 0
        && i % 16 == 0
        && i % 17 == 0
        && i % 18 == 0
        && i % 19 == 0
        && i % 20 == 0
    )
    .First()

so I tried to put the 2-19 range into a Enumerable too and do a crossjoin like so

Enumerable.Range(1, 1000000000)
    .SelectMany(n =>
        Enumerable.Range(2, 19)
            .Select(d => (n, d))
    )
    .GroupBy(x => x.n)
    .Where(g => g.All(y => y.n % y.d == 0))
    .First()
    .Key

the issue with the second solution is that it allocates heavily, crashes in x86 LINQPad with an OutOfMemoryException, and eats up a whole lot of mem in the x64 LINQPad version before I kill it manually.

My question is why? And is there a LINQ query that can avoid that issue?

The CLR Heap Allocation Analyzer plugin tells me that there is a heap allocation going on inside

.Select(d => (n, d))

due to the capturing of 'n'. So I assume that is the reason for the OutOfMemoryException, but ... since I use First() without materializing the query in between, I assumed that this should not be a problem, because linq would materialize the group and discard it again because it does not satisfy the condition while releasing the memory. is there something funky going on with selectmany or groupby that forces all data to be materialized first, or is my mental model just wrong here?

like image 609
Egi Avatar asked Oct 26 '25 10:10

Egi


1 Answers

If you tried the following code, the same problem will occur:

Enumerable.Range(1, 1000000000)
    .GroupBy(x => x)
    .First();

This means that all groups will be materialized during execution of the query, and that's the reason why OutOfMemoryException is thrown.

To solve the problem, you can use the following LINQ code as mentioned by @haim770 in the comments section:

Enumerable
    .Range(1, 1000000000)
    .First(i => Enumerable
        .Range(2, 19)
        .All(j => i % j == 0))

For more optimization, I found a better solution. Sorry for not using LINQ but it's much more optimized, maybe there's a way to achieve it using LINQ.

Instead of looping through a large amount of numbers and check each one, why not building the desired answer directly.

The desired output is a number x that is divisible by all numbers 1..n without any remainder. So, it is the multiple of all prime factors of numbers in range 1..n. by taking minimal amount of these prime factors, we can get the smallest number x.

For example, if n = 10 then:

i = 2: prime factors of 2 are [2] -> neededPrimes = [2]
i = 3: prime factors of 3 are [3] -> neededPrimes = [2, 3]
i = 4: prime factors of 4 are [2, 2] -> neededPrimes = [2, 2, 3] // we add just one 2 because the other already exists in neededPrimes
i = 5: prime factors of 5 are [5] -> neededPrimes = [2, 2, 3, 5]
i = 6: prime factors of 6 are [2, 3] -> neededPrimes = [2, 2, 3, 5] // we add nothing because [2, 3] are already in neededPrimes
i = 7: prime factors of 7 are [7] -> neededPrimes = [2, 2, 3, 5, 7]
i = 8: prime factors of 8 are [2, 2, 2] -> neededPrimes = [2, 2, 2, 3, 5, 7] // we add one 2 because the other 2's already exist in neededPrimes
i = 9: prime factors of 9 are [3, 3] -> neededPrimes = [2, 2, 2, 3, 3, 5, 7]
i = 10: prime factors of 10 are [2, 5] -> neededPrimes = [2, 2, 2, 3, 3, 5, 7]

x = 2 * 2 * 2 * 3 * 3 * 5 * 7 = 2520

Here's a code, and I hope it's clear:

public static void Main(string[] args)
{
    // The number to find its smallest multiple
    var n = 20;
    // A list that contains all primes that are founded across the calculation
    var calculatedPrimes = new List<int>();
    // Start through the numbers that x (the output number) should be divisible by
    for (var i = 2; i <= n; i++)
    {
        // Get primes of i
        var primes = GetPrimeFactors(i);
        // Loop through primes of i and add to "calculatedPrimes" the ones that are not 
        // in "calculatedPrimes"
        primes.ForEach(prime =>
        {
            if (!calculatedPrimes.Contains(prime) ||
                calculatedPrimes.Count(p => p == prime) < primes.Count(p => p == prime))
            {
                calculatedPrimes.Add(prime);
            }
        });
    }

    // The output number should be the multiple of all primes in "calculatedPrimes" list
    var x = calculatedPrimes.Aggregate(1, (res, p) => res * p);
    Console.WriteLine(x);

    Console.ReadLine();
}

// A function to get prime factors of a given number n
// (example: if n = 12 then this will return [2, 2, 3])
private static List<int> GetPrimeFactors(int n)
{
    var res = new List<int>();
    while (n % 2 == 0)
    {
        res.Add(2);
        n /= 2;
    }

    for (var i = 3; i <= Math.Sqrt(n); i += 2)
    { 
        while (n % i == 0)
        {
            res.Add(i);
            n /= i;
        }
    }

    if (n > 2)
        res.Add(n);
    return res;
}
like image 133
Ammar Avatar answered Oct 29 '25 02:10

Ammar



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!