I have the following LINQ query:
var queryGroups = (from p in db.cl_contact_event
select new Groups { inputFileName = p.input_file_name }).Distinct();
Which translates to the following when run:
SELECT
[Distinct1].[C1] AS [C1],
[Distinct1].[input_file_name] AS [input_file_name]
FROM ( SELECT DISTINCT
[Extent1].[input_file_name] AS [input_file_name],
1 AS [C1]
FROM [mel].[cl_contact_event] AS [Extent1]
) AS [Distinct1]
Now I'm pretty sure that the reason there is a sub-select is because I have the base LINQ query surrounded by () and then perform .Distinct() but I don't know enough about LINQ to be sure of this. If that's indeed the case is there a way to restructure/code my query so that a sub-select doesn't occur?
I know that it probably seems that I'm just nit-picking here but I'm just curious.
In this I suspect that the actual root cause of the subquery is the anonymous type constructor. Because you are not selecting a known entity, but rather an arbitrary object constructed from other entity values, the EF parser needs to make sure it can produce the exact set of fields -- whether from a single table, joined tables, calculated fields, other sub-queries, etc. The expression tree parser is very good at writing SQL statements out of LINQ queries whenever possible, but it's not omniscient. It processes the queries in a systematic way, that will always produce correct results (in the sense that you get what you asked for), though not always optimal results.
As far as rewriting the query to eliminate the sub-select, first off: I don't see an obvious way to do so that eliminates the anonymous type and produces correct results. More importantly, though, I wouldn't bother. Modern SQL servers like Sybase are very smart -- often smarter than the developer -- and very good at producing an optimal query plan out of a query. Besides that, EF loves sub-queries, because they are very good ways to write complex queries in an automated fashion. You often find them even when your LINQ query did not appear use them. Trying to eliminate them all from your queries will quickly become an exercise in futility.
I wouldn't worry about this particular situation at all. SQL Server (and most likely any enterprise database) will optimize away the outer Select statement anyway. I would theorize that the reason this SQL statement is generated is because this is the most generic and reusable statement. From my experience, this always happens on Distinct()
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With