Do the statistics (which help decide whether an index is to be used) take into account the number of rows per actual column value, or does it just use the average number of rows per value.
Suppose I have a table with an bit column called active which has a million of rows, but with 99.99% set to false. If I have an index on this column, then is Sql smart enough to know to use the index if searching for active=1 but that there is no point if searching for active=0.
Another example, if I have a table which has say 1,000,000 records with a indexed column which contains about 50,000 different values with an average number of rows per value of 10, but then one special value which has 500,000 rows. The index may not be useful if searching for this special record, but would be very useful when looking for any of the other codes.
But does this special case ruin the effectiveness of the index.
You can see for yourself:
CREATE TABLE IndexTest (
Id int not null primary key identity(1,1),
Active bit not null default(0),
IndexedValue nvarchar(10) not null
)
CREATE INDEX IndexTestActive ON IndexTest (Active)
CREATE INDEX IndexTestIndexedValue ON IndexTest (IndexedValue)
DECLARE @values table
(
Id int primary key IDENTITY(1, 1),
Value nvarchar(10)
)
INSERT INTO @values(Value) VALUES ('1')
INSERT INTO @values(Value) VALUES ('2')
INSERT INTO @values(Value) VALUES ('3')
INSERT INTO @values(Value) VALUES ('4')
INSERT INTO @values(Value) VALUES ('5')
INSERT INTO @values(Value) VALUES ('Many')
INSERT INTO @values(Value) VALUES ('Many')
INSERT INTO @values(Value) VALUES ('Many')
INSERT INTO @values(Value) VALUES ('Many')
INSERT INTO @values(Value) VALUES ('Many')
DECLARE @rowCount int
SET @rowCount = 100000
WHILE(@rowCount > 0)
BEGIN
DECLARE @valueIndex int
SET @valueIndex = CAST(RAND() * 10 + 1 as int)
DECLARE @selectedValue nvarchar(10)
SELECT @selectedValue = Value FROM @values WHERE Id = @valueIndex
DECLARE @isActive bit
SELECT @isActive = CASE
WHEN RAND() < 0.001 THEN 1
ELSE 0
END
INSERT INTO IndexTest(Active, IndexedValue) VALUES (@isActive, @selectedValue)
SET @rowCount = @rowCount - 1
END
SELECT count(*) FROM IndexTest WHERE Active = 1
SELECT count(*) FROM IndexTest WHERE Active = 0
SELECT count(*) FROM IndexTest WHERE IndexedValue = '1'
SELECT count(*) FROM IndexTest WHERE IndexedValue = 'Many'
It looks to me like it always uses the indexes on this query plan:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With