Following is an interview question:
The following one-liner generates and displays the list of the first 500 prime numbers. How would you optimize it using Parallel LINQ while still keeping it a SINGLE C# STATEMENT:
MessageBox.Show(string.Join(",",
Enumerable.Range(2, (int)(500 * (Math.Log(500) + Math.Log(System.Math.Log(500)) - 0.5)))
.Where(x => Enumerable.Range(2, x - 2)
.All(y => x % y != 0))
.TakeWhile((n, index) => index < 500)));
I tried introducing AsParallel() as well as ParallelEnumerable into the query, but did not see any tangible benefits on multi-core machines. The query still uses one CPU core heavily while the other cores enjoy leisure time. Can someone suggest an improvement that would distribute the load equally on all cores and thus reduce execution time?
For the enthusiast: the following formula returns an upper bound that is guaranteed to exceed the Nth prime, i.e. if you check every number up to this bound, you are sure to find at least N primes below it:
UpperBound = N * (Log(N) + Log(Log(N)) - 0.5)   // Log is the natural log
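As a quick sanity check, here is a minimal sketch of that bound as a helper method (the name PrimeUpperBound is mine, not part of the question); natural logarithms throughout:
// Sketch of the bound used in the question (hypothetical helper name).
static int PrimeUpperBound(int n) =>
    (int)(n * (Math.Log(n) + Math.Log(Math.Log(n)) - 0.5));
// PrimeUpperBound(500) == 3770, so scanning 2..3770 is enough to cover the first 500 primes per the bound above.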
This does it nicely on my machine. I've never actually seen all my cores go to 100% until now. Thanks for giving me an excuse to play :)
I increased the count until the run took long enough to measure (20,000 primes).
The key option that made the difference for me was setting the ExecutionMode to ForceParallelism.
Because I use the NotBuffered merge option, I re-sort the results when I'm done. This would not be necessary if you don't care about the order of the results (perhaps you're putting them in a HashSet).
The DegreeOfParallelism and MergeOptions settings provided only minor gains (if any) on my machine. This example shows how to use all the options in a single LINQ statement, which was the original question.
var numbers = Enumerable.Range(2, (int)(20000 * (Math.Log(20000) + Math.Log(Math.Log(20000)) - 0.5)))
    .AsParallel()
    .WithDegreeOfParallelism(Environment.ProcessorCount)
    .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
    .WithMergeOptions(ParallelMergeOptions.NotBuffered) // removes the order dependency
    .Where(x => Enumerable.Range(2, x - 2)
        .All(y => x % y != 0))
    .TakeWhile((n, index) => index < 20000);

string result = string.Join(",", numbers.OrderBy(n => n));
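If order really doesn't matter (the HashSet case mentioned above), a minimal sketch of that variant is to skip the re-sort entirely:
// Unordered variant: no OrderBy, just collect the results into a HashSet.
var primeSet = new HashSet<int>(numbers);
Console.WriteLine(primeSet.Count); // up to 20,000 primes, order irrelevant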
You only need to test divisors up to the square root of each value (upgraded code from above):
int i = 20000; // number of primes to generate, as in the example above

var numbers = new[] { 2, 3 }.Union(Enumerable.Range(2, (int)(i * (Math.Log(i) + Math.Log(Math.Log(i)) - 0.5)))
    .AsParallel()
    .WithDegreeOfParallelism(Environment.ProcessorCount) // 8 cores on my machine
    .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
    .WithMergeOptions(ParallelMergeOptions.NotBuffered) // removes the order dependency
    .Where(x => Enumerable.Range(2, (int)Math.Ceiling(Math.Sqrt(x)))
        .All(y => x % y != 0)) // 2 and 3 fail this test (they divide themselves), hence the Union above
    .TakeWhile((n, index) => index < i))
    .ToList();
But it's crazy when you have a simple and extremely fast algorithm:
private static IEnumerable<int> GetPrimes(int k)
{
    // Upper bound guaranteed to contain at least k primes (natural logarithms).
    int n = (int)Math.Ceiling(k * (Math.Log(k) + Math.Log(Math.Log(k)) - 0.5));

    bool[] prime = new bool[n + 1];
    prime[0] = prime[1] = false;
    for (int i = 2; i < prime.Length; i++)
    {
        prime[i] = true;
    }

    // Sieve of Eratosthenes: cross out the multiples of each prime.
    for (int i = 2; i * i <= n; ++i) // valid for n < 46340^2 = 2147395600
    {
        if (prime[i])
        {
            for (int j = i * i; j <= n; j += i)
                prime[j] = false;
        }
    }

    // Yield everything that survived the sieve: all primes up to n (at least k of them).
    for (int i = 2; i <= n; ++i)
    {
        if (prime[i])
            yield return i;
    }
}
Of course it isn't as "good" as LINQ because it isn't the fashionable way to solve the problem, but you should know that it exists.
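For completeness, a usage sketch (assuming the GetPrimes method above) that reproduces the original one-liner's output; Take(500) is needed because the sieve yields every prime up to the bound, not just the first 500:
// Usage sketch: same output as the original MessageBox one-liner.
MessageBox.Show(string.Join(",", GetPrimes(500).Take(500)));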