I have code like this:
var list = new List<int> {1, 2, 3, 4, 5};
var result = from x in list.AsParallel()
let a = LongRunningCalc1(x)
let b = LongRunningCalc2(x)
select new {a, b};
Let's say the LongRunningCalc
methods each take 1 second. The code above takes about 2 seconds to run, because while the list of 5 elements is operated on in parallel, the two methods called from the let
statements are called sequentially.
However, these methods can safely be called in parallel also. They obviously need to merge back for the select
but until then should run in parallel - the select
should wait for them.
Is there a way to achieve this?
When you write a query, opt in to PLINQ by invoking the ParallelEnumerable. AsParallel extension method on the data source, as shown in the following example. The AsParallel extension method binds the subsequent query operators, in this case, where and select , to the System. Linq.
By default, only one thread is used to execute a LINQ query. Parallel LINQ (PLINQ) is an easy way to enable multiple threads to execute a query. To see it in action, we will start with some code that only uses a single thread to double 200 million integers.
AsParallel(IEnumerable)Enables parallelization of a query. public: [System::Runtime::CompilerServices::Extension] static System::Linq::ParallelQuery ^ AsParallel(System::Collections::IEnumerable ^ source); C# Copy. public static System.Linq.
You won't be able to use query syntax or the let
operation, but you can write a method to perform multiple operations for each item in parallel:
public static ParallelQuery<TFinal> SelectAll<T, TResult1, TResult2, TFinal>(
this ParallelQuery<T> query,
Func<T, TResult1> selector1,
Func<T, TResult2> selector2,
Func<TResult1, TResult2, TFinal> resultAggregator)
{
return query.Select(item =>
{
var result1 = Task.Run(() => selector1(item));
var result2 = Task.Run(() => selector2(item));
return resultAggregator(result1.Result, result2.Result);
});
}
This would allow you to write:
var query = list.AsParallel()
.SelectAll(LongRunningCalc1,
LongRunningCalc2,
(a, b) => new {a, b})
You can add overloads for additional parallel operations as well:
public static ParallelQuery<TFinal> SelectAll<T, TResult1, TResult2, TResult3, TFinal>
(this ParallelQuery<T> query,
Func<T, TResult1> selector1,
Func<T, TResult2> selector2,
Func<T, TResult3> selector3,
Func<TResult1, TResult2, TResult3, TFinal> resultAggregator)
{
return query.Select(item =>
{
var result1 = Task.Run(() => selector1(item));
var result2 = Task.Run(() => selector2(item));
var result3 = Task.Run(() => selector3(item));
return resultAggregator(
result1.Result,
result2.Result,
result3.Result);
});
}
It's possible to write a version to handle a number of selectors not known at compile time, but to do that they all need to compute a value of the same type:
public static ParallelQuery<IEnumerable<TResult>> SelectAll<T, TResult>(
this ParallelQuery<T> query,
IEnumerable<Func<T, TResult>> selectors)
{
return query.Select(item => selectors.AsParallel()
.Select(selector => selector(item))
.AsEnumerable());
}
public static ParallelQuery<IEnumerable<TResult>> SelectAll<T, TResult>(
this ParallelQuery<T> query,
params Func<T, TResult>[] selectors)
{
return SelectAll(query, selectors);
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With