Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is columnstore index not being used

I have a non-clustered columnstore index on all columns a 40m record non-memory optimized table on SQL Server 2016 Enterprise Edition.

A query forcing the use of the columnstore index will perform significantly faster but the optimizer continues to choose to use the clustered index and other non-clustered indexes. I have lots of available RAM and am using appropriate queries against a dimensional model.

Why won't the optimizer choose the columnstoreindex? And how can I encourage its use (without using a hint)?

Here is a sample query not using columnstore:

SELECT
  COUNT(*),
  SUM(TradeTurnover),
  SUM(TradeVolume)
FROM DWH.FactEquityTrade e
--with (INDEX(FactEquityTradeNonClusteredColumnStoreIndex))
JOIN DWH.DimDate d
  ON e.TradeDateId = d.DateId
 JOIN DWH.DimInstrument i
  ON i.instrumentid = e.instrumentid
WHERE d.DateId >= 20160201
AND i.instrumentid = 2

It takes 7 seconds without hint and a fraction of a second with the hint. The query plan without the hint is here. The query plan with the hint is here.

The create statement for the columnstore index is:

CREATE NONCLUSTERED COLUMNSTORE INDEX [FactEquityTradeNonClusteredColumnStoreIndex] ON [DWH].[FactEquityTrade]
(
    [EquityTradeID],
    [InstrumentID],
    [TradingSysTransNo],
    [TradeDateID],
    [TradeTimeID],
    [TradeTimestamp],
    [UTCTradeTimeStamp],
    [PublishDateID],
    [PublishTimeID],
    [PublishedDateTime],
    [UTCPublishedDateTime],
    [DelayedTradeYN],
    [EquityTradeJunkID],
    [BrokerID],
    [TraderID],
    [CurrencyID],
    [TradePrice],
    [BidPrice],
    [OfferPrice],
    [TradeVolume],
    [TradeTurnover],
    [TradeModificationTypeID],
    [InColumnStore],
    [TradeFileID],
    [BatchID],
    [CancelBatchID]
)
WHERE ([InColumnStore]=(1))
WITH (DROP_EXISTING = OFF, COMPRESSION_DELAY = 0) ON [PRIMARY]
GO

Update. Plan using Count(EquityTradeID) instead of Count(*) and with hint included

like image 555
Rory Avatar asked May 11 '17 12:05

Rory


People also ask

When should I use a Columnstore index?

Columnstore indexes give high performance gains for analytic queries that scan large amounts of data, especially on large tables. Use columnstore indexes on data warehousing and analytics workloads, especially on fact tables, because they tend to require full table scans rather than table seeks.

What is true about Columnstore index?

Columnstore index is a new type of index introduced in SQL Server 2012. It is a column-based non-clustered index geared toward increasing query performance for workloads that involve large amounts of data, typically found in data warehouse fact tables.

Do Columnstore indexes help query performance?

Columnstore indexes achieve up to 10x greater data compression than rowstore indexes. This greatly reduces the I/O required to execute analytics queries and therefore improves query performance.

Which of the following scenarios are likely well suited for a Columnstore index?

Columnstore indexes work well when the data is stable. Queries should be updating and deleting less than 10% of the rows.


1 Answers

You're asking SQL Server to choose a complicated query plan over a simple one. Note that when using the hint, SQL Server has to concatenate the columnstore index with a rowstore non-clustered index (IX_FactEquiteTradeInColumnStore). When using just the rowstore index, it can do a seek (I assume TradeDateId is the leading column on that index). It does still have to do a key lookup, but it's simpler.

I can see two options to get this behavior without a hint:

First, remove InColumnStore from the columnstore index definition and cover the entire table. That's what you're asking from the columnstore - to cover everything.

If that's not possible, you can use a UNION ALL to explicitly split the data:

WITH workaround
     AS (
         SELECT TradeDateId
              , instrumentid
              , TradeTurnover
              , TradeVolume
         FROM DWH.FactEquityTrade
         WHERE InColumnStore = 1
         UNION ALL
         SELECT TradeDateId
              , instrumentid
              , TradeTurnover
              , TradeVolume
         FROM DWH.FactEquityTrade
         WHERE InColumnStore = 0 -- Assuming this is a non-nullable BIT
        )
     SELECT COUNT(*)
          , SUM(TradeTurnover)
          , SUM(TradeVolume)
     FROM workaround e
          JOIN DWH.DimDate d
            ON e.TradeDateId = d.DateId
          JOIN DWH.DimInstrument i
            ON i.instrumentid = e.instrumentid
     WHERE d.DateId >= 20160201
           AND i.instrumentid = 2;
like image 169
Steven Hibble Avatar answered Oct 18 '22 16:10

Steven Hibble