
Search by correlated entity

Hello, I am using MVC 5 and Entity Framework 6 for my project. I have a model like the one in the following diagram:

diagram

I need to query the Product entity starting from a set of Tag objects. Please note that Tag is an abstract class, mapped using the Table-per-Type (TPT) inheritance strategy.

This is the signature of my method:

public IEnumerable<Product> SearchByTag( IEnumerable<Tag> tagList );

The tagList parameter will actually contain concrete instances of Tag subclasses.

How can I make this query?

For example, I can receive the following data structure as input:

[
    { tagType: 1, stringProperty: "abc" },
    { tagType: 2, intProperty: 9 }
]

and so on. What would be the best way to filter the products? For example, I could first fetch the list of products matching each single criterion and then intersect the results, as in the following sample:

var p1 = ctx.Tags
            .OfType<FirstTagType>()
            .Where( x => x.StringProperty.Equals("abc") )
            .Select( x => x.Products );
var p2 = ctx.Tags
            .OfType<SecondTagType>()
            .Where( x => x.IntProperty == 9 )
            .Select( x => x.Products );
var results = p1.Intersect( p2 );

But my question in this case is about performance. How does this query behave with many filters?

Lorenzo asked Nov 05 '15 23:11



1 Answer

If you check the SQL generated for your query, you'll find something similar to this:

SELECT 
[Intersect1].[ProductId] AS [C1], 
[Intersect1].[ProductName] AS [C2]
FROM  (SELECT 
    [Extent3].[ProductId] AS [ProductId], 
    [Extent3].[ProductName] AS [ProductName]
    FROM   [dbo].[FirstTag] AS [Extent1]
    INNER JOIN [dbo].[Tag] AS [Extent2] ON [Extent1].[TagId] = [Extent2].[TagId]
    LEFT OUTER JOIN [dbo].[Product] AS [Extent3] ON [Extent2].[Product_ProductId] = [Extent3].[ProductId]
    WHERE N'aaaa-9' = [Extent1].[StringProperty]
INTERSECT
    SELECT 
    [Extent6].[ProductId] AS [ProductId], 
    [Extent6].[ProductName] AS [ProductName]
    FROM   [dbo].[SecondTag] AS [Extent4]
    INNER JOIN [dbo].[Tag] AS [Extent5] ON [Extent4].[TagId] = [Extent5].[TagId]
    LEFT OUTER JOIN [dbo].[Product] AS [Extent6] ON [Extent5].[Product_ProductId] = [Extent6].[ProductId]
    WHERE -9 = [Extent4].[IntProperty]) AS [Intersect1]

Here you can see that the inner SELECT queries do exactly what you would expect. The joins are based on foreign keys and should be fast as long as those columns are indexed. So if you have many filters, you just need to make sure that each of them works on a properly indexed column.
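If the model is managed with EF6 Code First migrations, indexes on the filtered columns could be added roughly as follows. This is only a sketch: the migration name AddTagFilterIndexes and the 256-character limit are assumptions, and the table and column names are taken from the generated SQL above, so they may differ in your schema. SQL Server cannot use an nvarchar(max) column as an index key, which is why the string column is given a bounded length first.

```csharp
using System.Data.Entity.Migrations;

// Hypothetical migration; names and lengths are illustrative assumptions.
public class AddTagFilterIndexes : DbMigration
{
    public override void Up()
    {
        // Bound the string length so the column can be used as an index key
        // (256 is an assumed limit, pick one that fits your data).
        AlterColumn("dbo.FirstTag", "StringProperty", c => c.String(maxLength: 256));

        // One index per column used in a tag filter.
        CreateIndex("dbo.FirstTag", "StringProperty");
        CreateIndex("dbo.SecondTag", "IntProperty");
    }

    public override void Down()
    {
        // Undo in reverse order.
        DropIndex("dbo.SecondTag", new[] { "IntProperty" });
        DropIndex("dbo.FirstTag", new[] { "StringProperty" });
        AlterColumn("dbo.FirstTag", "StringProperty", c => c.String());
    }
}
```

Whether these indexes actually help depends on the data and on the execution plan, so it is worth measuring before and after.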

The LINQ Intersect is translated to a SQL INTERSECT, which compares all the selected columns of the Product table. You might want to check the actual execution plan on your side, since it can depend on many things.

On my side, SQL Server executes the first query, applies a "Distinct Sort" to its result, and then performs the actual intersect as a "Left Semi Join" on ProductId and ProductName (that is, all the columns of the Product table). This might not be optimal, because my guess is that you don't have an index covering all columns.

One way to optimize this is to do the intersect only on the primary key (which should be fast), and then fetch the full Product data based on the ids:

var p1 = ctx.Tags
    .OfType<FirstTag>()
    .Where(x => x.StringProperty.Equals("aaaa-9"))
    .Select(x => x.Product.ProductId);
var p2 = ctx.Tags
    .OfType<SecondTag>()
    .Where(x => x.IntProperty == -9)
    .Select(x => x.Product.ProductId);

var query = ctx.Products.Where(p => p1.Intersect(p2).Contains(p.ProductId));

The SQL generated for this query uses EXISTS, and its execution plan uses an inner join on the primary key.
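As a rough sketch of how the id-based intersect might generalize to an arbitrary number of filters, the helper below builds one id query per criterion and folds them together with Intersect. The names ProductSearch, ProductIdsFor and SearchByTags, and the minimal entity classes, are assumptions made for illustration and not part of the original model. The code runs against in-memory lists via AsQueryable; with EF6 you would pass ctx.Products and ctx.Tags instead, and the SQL it produces there is not verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-ins for the entities (assumed shapes).
public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
}

public abstract class Tag
{
    public int TagId { get; set; }
    public Product Product { get; set; }
}

public class FirstTag : Tag
{
    public string StringProperty { get; set; }
}

public class SecondTag : Tag
{
    public int IntProperty { get; set; }
}

public static class ProductSearch
{
    // Builds the product-id query for one criterion, based on its concrete type.
    private static IQueryable<int> ProductIdsFor(IQueryable<Tag> tags, Tag criterion)
    {
        if (criterion is FirstTag)
        {
            // Copy the value to a local so the expression captures a plain string,
            // not the tag object itself.
            var s = ((FirstTag)criterion).StringProperty;
            return tags.OfType<FirstTag>()
                       .Where(t => t.StringProperty == s)
                       .Select(t => t.Product.ProductId);
        }
        if (criterion is SecondTag)
        {
            var n = ((SecondTag)criterion).IntProperty;
            return tags.OfType<SecondTag>()
                       .Where(t => t.IntProperty == n)
                       .Select(t => t.Product.ProductId);
        }
        throw new NotSupportedException("Unknown tag type: " + criterion.GetType().Name);
    }

    public static IQueryable<Product> SearchByTags(
        IQueryable<Product> products, IQueryable<Tag> tags, IEnumerable<Tag> criteria)
    {
        IQueryable<int> ids = null;
        foreach (var criterion in criteria)
        {
            var next = ProductIdsFor(tags, criterion);
            // Intersect on the primary key only, one filter at a time.
            ids = ids == null ? next : ids.Intersect(next);
        }

        // No criteria means no filtering (a design choice of this sketch).
        if (ids == null) return products;

        return products.Where(p => ids.Contains(p.ProductId));
    }
}
```

Each additional filter adds one more INTERSECT branch over a single id column, so the cost per filter should stay tied to how well the filtered column is indexed. A new tag type needs a new branch in ProductIdsFor.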

However, I wouldn't start this optimization without first checking whether you have a performance issue at all.

Tamas answered Nov 02 '22 14:11
