Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to simulate count distinct in an indexed view?

I'm working with a database hosted within Microsoft SQL Azure v12.0

I'm using index advisors to help determine where indexes would really be beneficial, without over-indexing my database.

I'm running into situations where indexed views would really help eliminate the need to GROUP BY large amounts of data every time someone runs a query.

I ran into a puzzling error while trying to craft an indexed view for one of my tables.
This is a pretty common scenario within my database, so I'd like to know if there is an all-around general solution that I can repurpose wherever needed.

The error I receive looks like this:

Msg 10126, Level 16, State 1, Line 12
Cannot create index on view "Database.dbo.vwCountDistinctExampleIX" because it uses aggregate "COUNT_BIG" with the DISTINCT keyword.
...
Consider use of a GROUP BY or COUNT_BIG(*) view to simulate DISTINCT on grouping columns.

The issue arises when I need to COUNT_BIG ( DISTINCT {expression} ) which is obviously not allowed within indexed views.
The error message seems to imply that you can simulate COUNT DISTINCT behavior and get the same results without using DISTINCT, which would be really nice since it would allow me to index that column and again help eliminate another type of expensive GROUP BY clause in common queries (since the execution plan can just refer to the index instead of regrouping the table data over and over).

I don't really understand how COUNT_BIG( * ) works in this context, so I don't understand what the error message is trying to suggest.
Is there a way I could get this aggregate value without using DISTINCT while still being able to index the aggregate column?

CREATE VIEW vwCountDistinctExampleIX
    WITH SCHEMABINDING
AS

SELECT
    t1.ID,
    COUNT_BIG( DISTINCT t2.Field2 ) AS DistinctTotal,
    COUNT_BIG( * )                  AS CountStar

FROM
    dbo.Table1 AS t1
    INNER JOIN dbo.Table2 AS t2 ON t2.Field1 = t1.ID

GROUP BY
    t1.ID

GO

/* Error here: Cannot create index on view "Database.dbo.vwCountDistinctExampleIX" because it uses aggregate "COUNT_BIG" with the DISTINCT keyword. */
CREATE UNIQUE CLUSTERED INDEX [PK vwCountDistinctExampleIX ID]  ON vwCountDistinctExampleIX ( [ID] )

CREATE INDEX [IX vwCountDistinctExampleIX DistinctTotal ON vwCountDistinctExampleIX ( DistinctTotal )

GO
like image 616
Giffyguy Avatar asked Sep 28 '22 10:09

Giffyguy


2 Answers

I don't think you can achieve what you want directly in an indexed view. However, if the number of rows in Table2 is big, but the number of distinct values in Field2 is not so big, the following approach might help:

CREATE TABLE Table1 (ID INT PRIMARY KEY)
CREATE TABLE Table2 (Field1 INT NOT NULL REFERENCES Table1, Field2 INT NOT NULL)

INSERT INTO Table1 VALUES (1),(2),(3)
INSERT INTO Table2 VALUES (1,10),(1,20),(1,20),(2,30)

GO
CREATE VIEW dbo.IndexedView1 
WITH SCHEMABINDING AS
SELECT ID, Field2, COUNT_BIG(*) AS Cnt
FROM dbo.Table1 T1
INNER JOIN dbo.Table2 T2 ON T1.ID = T2.Field1
GROUP BY ID, Field2

GO
CREATE UNIQUE CLUSTERED INDEX PK_IndexedView1 ON IndexedView1 (ID, Field2)

GO
CREATE VIEW SimpleView2
AS
SELECT ID, COUNT_BIG(*) AS DistinctTotal, SUM(Cnt) AS CountStar
FROM dbo.IndexedView1 X
GROUP BY ID
like image 129
Razvan Socol Avatar answered Oct 03 '22 07:10

Razvan Socol


You can index this query partially by grouping on both columns:

GROUP BY [T1].[ID], [T2].[Field2]

The query processor is smart enough to now use the index to compute the distinct.

Clearly, this view is less efficient than you might want but it is better than nothing.

like image 29
usr Avatar answered Oct 03 '22 07:10

usr