I have a LINQ statement like this:
var records = from line in myfile
              let data = line.Split(',')
              select new { a = int.Parse(data[0]), b = int.Parse(data[1]) };
var average = records.Sum(r => r.b) != 0 ? records.Sum(r => r.a) / records.Sum(r => r.b) : 0;
My question is: how many times is records.Sum(r => r.b) computed in the last line? Does LINQ loop over all the records each time it needs to compute a sum (in this case, three Sum() calls, so three loops)? Or does it smartly loop over all the records just once and compute all the sums?
Edit 1:
I wonder if there is any way to improve it by going through all the records only once (as we only need a single loop when using a plain for loop)?
And there is really no need to load everything into memory before we can do the sum and average. Surely we can sum each element while loading it from the file. Is there any way to reduce the memory consumption as well?
Edit 2:
Just to clarify a bit, I wasn't using LINQ before I ended up with the code above. A plain while/for loop meets all the performance requirements, but I then tried to improve the readability and reduce the lines of code by using LINQ. It seems that we can't get both at the same time.
It is slightly slower. LINQ syntax is typically less efficient than a foreach loop. It's good to be aware of any performance tradeoff that might occur when you use LINQ to improve the readability of your code. And if you'd like to measure the performance difference, you can use a tool like BenchmarkDotNet to do so.
In general, for identical code, LINQ will be slower, because of the overhead of delegate invocation. You use an array to store the data. You use a for loop to access each element (as opposed to foreach or LINQ).
Twice. Write it like this and it will be computed once:
var sum = records.Sum(r => r.b);
var average = sum != 0 ? records.Sum(r => r.a) / sum : 0;
There are plenty of answers, but none that wrap all of your questions up.
How many times is records.Sum(r => r.b) computed in the last line?
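The line makes three Sum calls in total: records.Sum(r => r.b) runs twice and records.Sum(r => r.a) once. (If the b total is 0, only the first call runs.)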
Does LINQ loop over all the records each time it needs to compute a sum (in this case, three Sum() calls, so three loops)?
Yes.
Or does it smartly loop over all the records just once and compute all the sums?
No.
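To see the repeated enumeration for yourself, here is a minimal sketch that counts how often the source is read from the start. The Lines() method and the Passes counter are hypothetical stand-ins for the file (they are not part of the original question), and the query projects to value tuples instead of an anonymous type so it can be shared between two methods.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerationCounter
{
    // Number of times the source sequence has been read from the beginning.
    public static int Passes;

    // Hypothetical stand-in for the file: yields the same three lines on every pass.
    public static IEnumerable<string> Lines()
    {
        Passes++; // runs each time a new enumeration of the sequence starts
        yield return "10,2";
        yield return "20,3";
        yield return "30,5";
    }

    // Deferred query: nothing is read or parsed until something enumerates it.
    static IEnumerable<(int a, int b)> Records() =>
        from line in Lines()
        let data = line.Split(',')
        select (a: int.Parse(data[0]), b: int.Parse(data[1]));

    // Same shape as the line in the question: three Sum calls.
    public static int AverageWithThreeSums()
    {
        var records = Records();
        return records.Sum(r => r.b) != 0
            ? records.Sum(r => r.a) / records.Sum(r => r.b)
            : 0;
    }

    // The b total is stored in a local, so only two Sum calls run.
    public static int AverageWithCachedSum()
    {
        var records = Records();
        var sumB = records.Sum(r => r.b);
        return sumB != 0 ? records.Sum(r => r.a) / sumB : 0;
    }

    public static void Main()
    {
        Passes = 0;
        var first = AverageWithThreeSums();
        Console.WriteLine($"three Sum calls: average = {first}, passes = {Passes}");

        Passes = 0;
        var second = AverageWithCachedSum();
        Console.WriteLine($"cached b total:  average = {second}, passes = {Passes}");
    }
}
```

With this sample data the first version reads the source three times and the second reads it twice; both return 6 (60 / 10 with integer division). Each pass also repeats the Split and Parse work, because the query itself is re-run, not just the summation.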
I wonder if there is any way to improve it by going through all the records only once (as we only need a single loop when using a plain for loop)?
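You can do that. If you keep the separate Sum calls, though, the only way to avoid re-reading the source is to eagerly load all the data (for example with ToList()), which contradicts your next question.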
And there is really no need to load everything into memory before we can do the sum and average. Surely we can sum each element while loading it from the file. Is there any way to reduce the memory consumption as well?
That's correct. In your original post you have a variable called myfile, and you're iterating over it and putting each element into a local variable called line (read: basically a foreach). Since you didn't show how you got your myfile data, I'm assuming that you're eagerly loading all the data.
Here's a quick example of lazy-loading your data:
public IEnumerable<string> GetData()
{
    using (var fileStream = File.OpenRead(@"C:\Temp\MyData.txt"))
    {
        using (var streamReader = new StreamReader(fileStream))
        {
            string line;
            while ((line = streamReader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }
}

public void CalculateSumAndAverage()
{
    var sumA = 0;
    var sumB = 0;
    var average = 0;
    foreach (var line in GetData())
    {
        var split = line.Split(',');
        var a = Convert.ToInt32(split[0]);
        var b = Convert.ToInt32(split[1]);
        sumA += a;
        sumB += b;
    }
    // I'm not a big fan of ternary operators,
    // but feel free to convert this if you so desire.
    if (sumB != 0)
    {
        average = sumA / sumB;
    }
    else
    {
        // This else clause is redundant, but I converted it from a ternary operator.
        average = 0;
    }
}
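If you don't want to write the reader loop yourself, File.ReadLines from System.IO already returns the lines lazily, one at a time. Below is a rough sketch of the same single-pass idea built on it. The class name, the SumRatio method and the temp-file path are made up for illustration, and the totals are kept in long on the assumption that the int sums could overflow for large files.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class RatioCalculator
{
    // Single pass over any line source; only the current line is held in memory.
    public static long SumRatio(IEnumerable<string> lines)
    {
        long sumA = 0;
        long sumB = 0;
        foreach (var line in lines)
        {
            var parts = line.Split(',');
            sumA += int.Parse(parts[0]);
            sumB += int.Parse(parts[1]);
        }
        // Integer division, as in the original; 0 when the b total is 0.
        return sumB != 0 ? sumA / sumB : 0;
    }

    public static void Main()
    {
        // Hypothetical sample file, written to the temp directory for the demo.
        var path = Path.Combine(Path.GetTempPath(), "MyData.txt");
        File.WriteAllLines(path, new[] { "10,2", "20,3", "30,5" });

        // File.ReadLines streams the file instead of loading it all at once.
        Console.WriteLine(SumRatio(File.ReadLines(path)));
    }
}
```

Taking an IEnumerable<string> rather than a path keeps the method easy to try with an in-memory array as well as with a real file.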
Three times, and what you should use here is Aggregate, not Sum.
// do your original selection
var records = from line in myfile
              let data = line.Split(',')
              select new { a = int.Parse(data[0]), b = int.Parse(data[1]) };

// aggregate them into one record; anonymous-type properties are read-only,
// so each step builds a new record, and the seed also covers an empty input
var sumRec = records.Aggregate(
    new { a = 0, b = 0 },
    (runningSum, next) => new { a = runningSum.a + next.a, b = runningSum.b + next.b });

// Calculate your average
var average = sumRec.b != 0 ? sumRec.a / sumRec.b : 0;
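As a variation on the Aggregate approach, here is a self-contained sketch that uses a value tuple as the accumulator (this needs C# 7 or later). Value tuples are structs, so no object is allocated per element for the running totals. The class and method names are hypothetical, and the input is passed in as a sequence of lines so it works with either an array or a lazily read file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregateExample
{
    // One enumeration of the input: both totals are carried in a single accumulator.
    public static int SumRatio(IEnumerable<string> lines)
    {
        var totals = lines
            .Select(line => line.Split(','))
            .Aggregate(
                (A: 0, B: 0), // seed, so an empty input gives (0, 0) instead of throwing
                (acc, data) => (acc.A + int.Parse(data[0]), acc.B + int.Parse(data[1])));

        return totals.B != 0 ? totals.A / totals.B : 0;
    }

    public static void Main()
    {
        var sample = new[] { "10,2", "20,3", "30,5" };
        Console.WriteLine(SumRatio(sample)); // 60 / 10 with integer division
    }
}
```

Whether this reads better than the plain foreach loop is a matter of taste; the two do the same amount of work.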