I have an Entity Framework 6.1 project that is querying a SQL Server 2012 database table and getting back incorrect results.
To illustrate what is happening, I created 2 queries that should have the exact same results. The table ProjectTable
has 23 columns and 20500ish rows:
var test1 = db.ProjectTable
.GroupBy(t => t.ProjectOwner)
.Select(g => g.Key)
.ToArray();
var test2 = db.ProjectTable
.ToArray()
.GroupBy(t => t.ProjectOwner)
.Select(g => g.Key)
.ToArray();
The queries are designed to get a list of all of the distinct project owners in the table. The first query does the heavy lifting on the SQL Server, where as the second query downloads the entire table into memory and then processes it on the client side.
The first variable test1
has a length of about 300 items. The second variable test2
has a length of 5.
Here are the raw SQL queries that EF is generating:
-- test1
SELECT [Distinct1].[ProjectOwner] AS [ProjectOwner]
FROM ( SELECT DISTINCT
[Extent1].[ProjectOwner] AS [ProjectOwner]
FROM [dbo].[ProjectTable] as [Extent1]
) AS [Distinct1]
-- test2
SELECT Col1, Col2 ... ProjectOwner, ... Col23
FROM [dbo].[ProjectTable]
When I run this query and analyze the returned entities, I notice that the full 20500ish rows are returned, but the ProjectOwner
column gets overridden with one of only 5 different users!
var test = db.ProjectTable.ToArray();
I thought that maybe it was the SQL Server, so I did a packet trace and filtered on TDS. Randomly looking through the raw streams I see many names that aren't in the list of 5, so I know that data is being sent across the wire correctly.
How do I see the raw data that EF is getting? Is there something that might be messing with the cache and pulling incorrect results?
If I run the queries in either SSMS or Visual Studio, the list returned is correctly. It is only EF that has this issue.
Ok, I added another test to make sure my sanity is in check.
I took the test2
raw sql query and did the following:
var test3 = db.Database
.SqlQuery<ProjectTable>(@"SELECT Col1..Col23")
.ToArray()
.Select(t => t.ProjectOwner)
.Distict()
.ToArray();
and I get the correct 300ish names back!
After downloading the Entity Framework source and stepping through many an Enumerator
, I found the issue.
In the Shaper.HandleEntityAppendOnly
method (found here), on line 187 the Context.ObjectStateManager.FindEntityEntry
method is called. To my surprise, a non-null value was returned! Wait a minute, there shouldn't be any cached results, since I'm returning all rows?!
That's when I discovered that my Table has no Primary Key!
In my defence, the table is actually a cache of a view that I'm working with, I just did a SELECT * INTO CACHETABLE FROM USERVIEW
I then looked at which column Entity Framework thought was my Primary Key (they call it a singleton key) and it just so happens that the column they picked had only... drum roll please... 5 unique values!
When I looked at the model that EF generated, sure enough! That column was specified as a primary key. I changed the key to the appropriate column and now everything is working as it should!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With