Why is columnstore index not being used

Tags: sql-server, indexing, sql-server-2016, query-performance, columnstore

I have a non-clustered columnstore index on all columns of a 40m-record, non-memory-optimized table on SQL Server 2016 Enterprise Edition.

A query that forces the use of the columnstore index performs significantly faster, but the optimizer keeps choosing the clustered index and other non-clustered indexes. I have lots of available RAM and am using appropriate queries against a dimensional model.

Why won't the optimizer choose the columnstore index? And how can I encourage its use (without using a hint)?

Here is a sample query not using columnstore:

SELECT
  COUNT(*),
  SUM(TradeTurnover),
  SUM(TradeVolume)
FROM DWH.FactEquityTrade e
--with (INDEX(FactEquityTradeNonClusteredColumnStoreIndex))
JOIN DWH.DimDate d
  ON e.TradeDateId = d.DateId
JOIN DWH.DimInstrument i
  ON i.instrumentid = e.instrumentid
WHERE d.DateId >= 20160201
  AND i.instrumentid = 2

It takes 7 seconds without the hint and a fraction of a second with it. The query plan without the hint is here. The query plan with the hint is here.

The create statement for the columnstore index is:

CREATE NONCLUSTERED COLUMNSTORE INDEX [FactEquityTradeNonClusteredColumnStoreIndex] ON [DWH].[FactEquityTrade]
(
    [EquityTradeID],
    [InstrumentID],
    [TradingSysTransNo],
    [TradeDateID],
    [TradeTimeID],
    [TradeTimestamp],
    [UTCTradeTimeStamp],
    [PublishDateID],
    [PublishTimeID],
    [PublishedDateTime],
    [UTCPublishedDateTime],
    [DelayedTradeYN],
    [EquityTradeJunkID],
    [BrokerID],
    [TraderID],
    [CurrencyID],
    [TradePrice],
    [BidPrice],
    [OfferPrice],
    [TradeVolume],
    [TradeTurnover],
    [TradeModificationTypeID],
    [InColumnStore],
    [TradeFileID],
    [BatchID],
    [CancelBatchID]
)
WHERE ([InColumnStore]=(1))
WITH (DROP_EXISTING = OFF, COMPRESSION_DELAY = 0) ON [PRIMARY]
GO

Update: plan using COUNT(EquityTradeID) instead of COUNT(*), with the hint included.

asked May 11 '17 12:05

Rory

1 Answer

You're asking SQL Server to choose a complicated query plan over a simple one. Note that when using the hint, SQL Server has to concatenate the columnstore index with a rowstore non-clustered index (IX_FactEquiteTradeInColumnStore). When using just the rowstore index, it can do a seek (I assume TradeDateId is the leading column on that index). It does still have to do a key lookup, but it's simpler.
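
To see why the hinted plan has to combine two indexes, it can help to confirm that the columnstore index is filtered and to count how many rows fall outside its filter. The following is a minimal sketch using the sys.indexes catalog view and the table name from the question; it assumes InColumnStore is a BIT-like flag, as the index filter suggests.

```sql
-- List the indexes on the fact table along with any filter predicate.
SELECT i.name,
       i.type_desc,          -- e.g. NONCLUSTERED COLUMNSTORE
       i.has_filter,         -- 1 when the index has a WHERE clause
       i.filter_definition   -- expected here: ([InColumnStore]=(1))
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'DWH.FactEquityTrade');

-- Count the rows inside and outside the columnstore filter.
SELECT InColumnStore,
       COUNT_BIG(*) AS row_count
FROM DWH.FactEquityTrade
GROUP BY InColumnStore;
```

Any rows reported with InColumnStore = 0 (or NULL) are not in the columnstore index, so a query over the whole table has to read them from a rowstore index.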

I can see two options to get this behavior without a hint:

First, remove the InColumnStore filter (the WHERE clause) from the columnstore index definition so that the index covers the entire table. That's what you're asking of the columnstore: to cover everything.
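
As a rough sketch of this first option, the index from the question could be dropped and recreated without its WHERE clause. The column list is copied from the question. Keeping InColumnStore as an ordinary column is an assumption, as is the premise that it is acceptable for every row to live in the columnstore. Until the new index is built, queries cannot use it, so this is best run in a maintenance window.

```sql
-- Drop the filtered columnstore index.
DROP INDEX [FactEquityTradeNonClusteredColumnStoreIndex]
    ON [DWH].[FactEquityTrade];
GO

-- Recreate it without the WHERE filter so it covers every row.
CREATE NONCLUSTERED COLUMNSTORE INDEX [FactEquityTradeNonClusteredColumnStoreIndex]
ON [DWH].[FactEquityTrade]
(
    [EquityTradeID], [InstrumentID], [TradingSysTransNo],
    [TradeDateID], [TradeTimeID], [TradeTimestamp], [UTCTradeTimeStamp],
    [PublishDateID], [PublishTimeID], [PublishedDateTime], [UTCPublishedDateTime],
    [DelayedTradeYN], [EquityTradeJunkID], [BrokerID], [TraderID], [CurrencyID],
    [TradePrice], [BidPrice], [OfferPrice], [TradeVolume], [TradeTurnover],
    [TradeModificationTypeID], [InColumnStore], [TradeFileID],
    [BatchID], [CancelBatchID]
)
WITH (COMPRESSION_DELAY = 0)
ON [PRIMARY];
GO
```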

If that's not possible, you can use a UNION ALL to explicitly split the data:

WITH workaround
     AS (
         SELECT TradeDateId
              , instrumentid
              , TradeTurnover
              , TradeVolume
         FROM DWH.FactEquityTrade
         WHERE InColumnStore = 1
         UNION ALL
         SELECT TradeDateId
              , instrumentid
              , TradeTurnover
              , TradeVolume
         FROM DWH.FactEquityTrade
         WHERE InColumnStore = 0 -- Assuming this is a non-nullable BIT
        )
     SELECT COUNT(*)
          , SUM(TradeTurnover)
          , SUM(TradeVolume)
     FROM workaround e
          JOIN DWH.DimDate d
            ON e.TradeDateId = d.DateId
          JOIN DWH.DimInstrument i
            ON i.instrumentid = e.instrumentid
     WHERE d.DateId >= 20160201
           AND i.instrumentid = 2;
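
The second branch above relies on InColumnStore never being NULL; a NULL row would match neither branch and would silently drop out of the totals. Here is a quick check against sys.columns, plus a variant of the second branch to use if the column turns out to be nullable (a sketch, not part of the original answer):

```sql
-- is_nullable: 1 = the column allows NULLs, 0 = it does not.
SELECT c.name,
       c.is_nullable
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'DWH.FactEquityTrade')
  AND c.name = N'InColumnStore';

-- If is_nullable = 1, the second branch of the UNION ALL would need to be:
SELECT TradeDateId
     , instrumentid
     , TradeTurnover
     , TradeVolume
FROM DWH.FactEquityTrade
WHERE InColumnStore = 0
   OR InColumnStore IS NULL;
```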

answered Oct 18 '22 16:10

Steven Hibble
