I’m currently having a debate with someone at work regarding the performance of LINQ queries against the SQL equivalent.
Has anyone done/seen any scientific testing on this?
If not, anecdotal evidence of where you’ve had to replace LINQ with a SQL query for performance reasons will help me make my case.
More importantly: when it comes to querying databases, LINQ is in most cases a significantly more productive querying language than SQL. Compared to SQL, LINQ is simpler, tidier, and higher-level. It's rather like comparing C# to C++.
LINQ syntax is typically less efficient than a foreach loop. It's good to be aware of any performance tradeoff that might occur when you use LINQ to improve the readability of your code. And if you'd like to measure the performance difference, you can use a tool like BenchmarkDotNet to do so.
We can see right away that LINQ is a lot slower than raw SQL, but compiled LINQ is a bit faster. Note that results are in microseconds; real-world queries may take tens or even hundreds of milliseconds, so LINQ overhead will be hardly noticeable.
I have a slightly different take on this; when profiling (with our shiny profiler) we noticed that LINQ (to SQL in this case) was doing a reasonable job generating TSQL for basic queries, and the operation was running very fast at the DB server (0.5ms etc) - however, the actual query operation was taking MUCH longer (like 20ms+ for the same 0.5ms query, in some cases). So where was the time? You might think "query translation", but no; we also have a lot of ExecuteQuery<T>
code (i.e. where you write the TSQL by hand) and this was doing exactly the same thing. It turned out that somewhere between the materializer and the identity map vast amounts of time was being lost.
So; we wrote our own materializer that was pretty-much a drop-in replacement for ExecuteQuery
- and thus was born dapper.
On more of the LINQ side, it generally does OK at generating TSQL for simple queries, but for anything complex I usually trust hand-coded TSQL a lot more. To take a case as a sample, I had a complex query involving groups, skips and takes. It didn't perform well. When I wrote it by hand with ROW_NUMBER() etc the same results took 4% of the "stats IO" and total time.
My current opinion on LINQ is that the ORM tools make data mutation a breeze, but for query I tend to use dapper. Which is ironic since the Q in LINQ is "query".
LINQ as in LINQ2SQL or EF needs to have a generalized ruleset on how to convert the LINQ queries to SQL and with this introduces alevel of abstraction. This abstraction will sometimes result in statements that are less-than-optimal. Writing your SQL statements by hand allows you to tweak it for your specific case.
This leads to the conclusion that the speed of SQL is more likely to be faster, especially in complex scenarios. But this doesn't mean that every LINQ query is slower than its SQL equivalent.
My experience with EF 1 in combination with Oracle supports this.
One of the more obvious examples where SQL query is better is update/delete batching. There is no facility built into LINQ-To-SQL to handle multiple updates or deletes, except to process the records one by one, which generates individual queries for each record.
var myRecords = myContext.Books.Where(b => b.Author = "Bob");
foreach (var rec in myRecords)
myContext.Books.DeleteOnSubmit(rec);
myContext.SubmitChanges() // generates delete statement for each record.
In the general case, LINQ is optimized to generate very efficient queries, and can sometimes generate even better queries than the intuitive way to write it with SQL. However, there will always be exceptions where the LINQ-to-SQL statements will not be as efficient as a SQL query that can be written by hand.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With