
Select Specific Columns from Database using EF Code First

We are working with a customer's very large table with over 500 columns (I know, someone actually does that!).

Many of these columns are in fact foreign keys to other tables.

We also have the requirement to eager load some of the related tables.

Is there any way in LINQ to SQL or Dynamic LINQ to specify which columns are retrieved from the database? I am looking for a LINQ statement that actually has this effect on the generated SQL statement:

SELECT Id, Name FROM Book

When we run the regular query generated by EF, SQL Server throws an error saying that the maximum number of columns that can be selected in a query has been reached!

Any help is much appreciated!


Yes, exactly this is the case: the table has 500 columns and is self-referencing. Our tool automatically eager loads the first-level relations, and this hits the SQL Server limit on the number of columns that can be queried.

I was hoping that I could load only a limited set of columns from the related entities, such as Id and Name (which are used in the UI to show the record to the user).

I guess the other option is to control which FK columns should be eager loaded. However, this still leaves a problem for tables that have a binary or ntext column, which you may not want to load every time.

Is there a way to hook multiple models (entities) to the same table in Code First? We tried doing this, and I think the effort failed miserably.

Asked Jul 17 '12 18:07 by sam360



1 Answer

Yes, you can return only a subset of columns by using projection:

var result = from x in context.LargeTable
             select new { x.Id, x.Name };

The problem: projection and eager loading don't work together. Once you start using projections or custom joins, you change the shape of the query and can no longer use Include (EF will ignore it). The only way in such a scenario is to manually include the relations in the projected result set:

var result = from x in context.LargeTable
             select new {
                 Id = x.Id,
                 Name = x.Name,
                 // You can filter or project relations as well
                 RelatedEntities = x.SomeRelation.Where(...)
             };

You can also project to a specific type, BUT that type must not be mapped (so you cannot, for example, project to the LargeTable entity from my sample). Projection to a mapped entity can be done only on materialized data in LINQ to Objects.
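As a rough sketch of both variants, assuming the EF 4.1+/EF6 DbContext API: the class names `LargeTableSummary`, `SampleContext` and `Queries` are hypothetical and only there to make the example self-contained.

```csharp
using System.Collections.Generic;
using System.Data.Entity;   // assumption: EF 4.1+ / EF6 code first
using System.Linq;

// Mapped entity (only two of its many columns are shown here).
public class LargeTable
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical DTO. It is NOT part of the EF model, so EF can project to it.
public class LargeTableSummary
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class SampleContext : DbContext
{
    public DbSet<LargeTable> LargeTable { get; set; }
}

public static class Queries
{
    // Projection to an unmapped type: only Id and Name should appear in the SQL.
    public static List<LargeTableSummary> LoadSummaries(SampleContext context)
    {
        return context.LargeTable
            .Select(x => new LargeTableSummary { Id = x.Id, Name = x.Name })
            .ToList();
    }

    // Projection to the mapped entity: project to an anonymous type in SQL,
    // then build LargeTable instances in memory with LINQ to Objects.
    public static List<LargeTable> LoadPartialEntities(SampleContext context)
    {
        return context.LargeTable
            .Select(x => new { x.Id, x.Name })   // translated to SQL
            .AsEnumerable()                      // everything after this runs in memory
            .Select(x => new LargeTable { Id = x.Id, Name = x.Name })
            .ToList();
    }
}
```

The instances returned by `LoadPartialEntities` are not attached to the context, so EF does not track changes made to them.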

Edit:

There is probably some misunderstanding about how EF works. EF works on top of entities, and an entity is what you have mapped. If you map 500 columns to the entity, EF simply uses that entity as you defined it. That means querying loads the entity and persisting saves the entity.

Why does it work this way? An entity is considered an atomic data structure, and its data can be loaded and tracked only once. That is a key feature for the ability to correctly persist changes back to the database. It doesn't mean that you should not load only a subset of columns if you need it, but you should understand that loading a subset of columns doesn't define your original entity. It is considered an arbitrary view of the data in your entity. This view is not tracked and cannot be persisted back to the database without some additional effort (simply because EF doesn't hold any information about the origin of the projection).
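One possible form of that "additional effort", sketched under the assumption that the DbContext API (EF 4.1+) is in use: build a stub entity from the projected key, attach it, and mark only the changed property as modified. `SampleContext`, `Updates` and `RenameRow` are hypothetical names.

```csharp
using System.Data.Entity;   // assumption: EF 4.1+ / EF6 code first

public class LargeTable
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class SampleContext : DbContext
{
    public DbSet<LargeTable> LargeTable { get; set; }
}

public static class Updates
{
    // Persists a new Name for a row that was only ever loaded as a projection.
    public static void RenameRow(SampleContext context, int id, string newName)
    {
        // The stub carries just the key and the value to be saved.
        var stub = new LargeTable { Id = id, Name = newName };

        // Attach puts the stub into the context in the Unchanged state.
        context.LargeTable.Attach(stub);

        // Flag only Name as modified so the UPDATE touches only that column.
        context.Entry(stub).Property(x => x.Name).IsModified = true;

        context.SaveChanges();
    }
}
```

Two caveats apply to this sketch: Attach throws if an entity with the same key is already tracked by the context, and save-time validation may complain about required properties that the stub leaves empty, in which case `context.Configuration.ValidateOnSaveEnabled` may need to be turned off for the call.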

EF also places some additional constraints on the ability to map the entity:

  • Each table can normally be mapped only once. Why? Again, because mapping a table multiple times to different entities can break the ability to correctly persist those entities. For example, if a non-key column is mapped twice and you load instances of both entities mapped to the same record, which of the mapped values will you use when saving changes?
  • There are two exceptions that allow you to map a table multiple times:
    • Table per hierarchy inheritance - this is a mapping where the table can contain records from multiple entity types defined in an inheritance hierarchy. Columns mapped to the base entity in the hierarchy must be shared by all entities. Every derived entity type can have its own columns mapped to its specific properties (other entity types always have these columns empty). It is not possible to share a column for derived properties among multiple entities. There must also be one additional column, called the discriminator, telling EF which entity type is stored in the record. This column cannot be mapped as a property because it is already mapped as the type discriminator.
    • Table splitting - this is the direct solution to the single table mapping limitation. It allows you to split a table into multiple entities with some constraints:
      • There must be a one-to-one relation between the entities. You have one central entity used to load the core data, and all other entities are accessible through navigation properties from this entity. Eager loading, lazy loading and explicit loading work normally.
      • The relation is a real 1-1, so both parts of the relation must always exist.
      • Entities must not share any property except the key. This constraint solves the initial problem because each modifiable property is mapped only once.
      • Every entity from the split table must have a mapped key property.
      • Insertion requires the whole object graph to be populated because other entities can contain mapped required columns.
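A minimal sketch of table splitting with the code first fluent API, assuming EF 4.1+/EF6 and a split into just two entities. `LargeTableDetails`, its `Blob` column and the navigation property names are hypothetical.

```csharp
using System.Data.Entity;   // assumption: EF 4.1+ / EF6 code first

// Central entity: the small set of columns needed most of the time.
public class LargeTable
{
    public int Id { get; set; }
    public string Name { get; set; }
    public virtual LargeTableDetails Details { get; set; }
}

// Second entity over the same table: the heavy columns (hypothetical example).
public class LargeTableDetails
{
    public int Id { get; set; }          // same key as LargeTable
    public byte[] Blob { get; set; }
    public virtual LargeTable Owner { get; set; }
}

public class SampleContext : DbContext
{
    public DbSet<LargeTable> LargeTable { get; set; }
    public DbSet<LargeTableDetails> LargeTableDetails { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Both entity types point at the same physical table.
        modelBuilder.Entity<LargeTable>().ToTable("LargeTable");
        modelBuilder.Entity<LargeTableDetails>().ToTable("LargeTable");

        // Required one-to-one relation on the shared key.
        modelBuilder.Entity<LargeTable>()
            .HasRequired(e => e.Details)
            .WithRequiredPrincipal(d => d.Owner);
    }
}
```

With this mapping, querying `context.LargeTable` alone should select only Id and Name, while `Include(e => e.Details)` or lazy loading brings in the remaining columns when they are actually needed.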

LINQ to SQL also has the ability to mark a column as lazy loaded, but this feature is currently not available in EF - you can vote for that feature.

This leads to your options for optimization:

  • Use projections to get a read-only "view" of the entity
    • You can do that in a LINQ query, as I showed in the previous part of this answer
    • You can create a database view and map it as a new "entity"
    • In EDMX you can also use a defining query or a query view to encapsulate either a SQL or an ESQL projection in your mapping
  • Use table splitting
    • EDMX allows you to split a table into many entities without any problem
    • Code first allows you to split a table as well, but there are some problems when you split the table into more than two entities (I think it requires each entity type to have a navigation property to all other entity types from the split table, which makes it really hard to use).
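For the database view option, a hedged code first sketch: it assumes a view named `vw_LargeTableSummary` (a hypothetical name) already exists in the database and exposes the Id and Name columns, and that the schema is managed outside of EF.

```csharp
using System.Data.Entity;   // assumption: EF 4.1+ / EF6 code first

// Read-only "entity" backed by the view (hypothetical type name).
public class LargeTableSummary
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class SampleContext : DbContext
{
    static SampleContext()
    {
        // Disable the database initializer so code first does not try to
        // create a table for the view-backed entity.
        System.Data.Entity.Database.SetInitializer<SampleContext>(null);
    }

    public DbSet<LargeTableSummary> Summaries { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // From EF's point of view the view is just another table.
        modelBuilder.Entity<LargeTableSummary>()
            .HasKey(s => s.Id)
            .ToTable("vw_LargeTableSummary");
    }
}
```

Because EF treats the view like a table, it will not stop you from modifying `LargeTableSummary` instances, so treating them as read-only is a convention the calling code has to follow.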
Answered Oct 01 '22 13:10 by Ladislav Mrnka