Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Matching column values as prefixes

Tags:

sql

sql-server

I'm storing string prefixes in my SQL Server table, I'm wanting to see if any of those values are a valid prefix for a given parameter value.

e.g. supposing I have a telephone do-not-call list and it includes an entry to prohibit all phone calls to numbers starting with "1425123", rather than inserting 10000 numbers (14251230000 to 14251239999) it stores the prefix instead.

Like so:

CREATE TABLE Prefixes (
    Value varchar(10)
)

CREATE INDEX IX_Value UNIQUE Prefixes ( Value )

Evaluated like so:

DECLARE @value varchar(10) = 'foobar'

SELECT
    *
FROM
    Prefixes
WHERE
    @value LIKE ( Value + '%' );

When I run this in Azure SQL in SQL Server Management Studio it says it's performing an Index Scan. With approximately 70,000 entries on an Azure SQL S1 database the query takes between 200 and 500ms to execute. The tooling does not suggest any improvements to indexes for faster performance.

For comparison, an exact equality match (Value = @value) uses an Index Seek and happens near instantly.

200-500ms is too slow for my application.

One option is to move the lookup into my application code using a Trie for an efficient prefix search (which introduces synchronization problems), but another approach is to change the query to something like this:

DECLARE @v1 varchar(1) = LEFT( @value, 1 )
DECLARE @v2 varchar(2) = LEFT( @value, 2 )
DECLARE @v3 varchar(3) = LEFT( @value, 3 )
DECLARE @v4 varchar(4) = LEFT( @value, 4 )
DECLARE @v5 varchar(5) = LEFT( @value, 5 )
DECLARE @v6 varchar(6) = LEFT( @value, 6 )
DECLARE @v7 varchar(7) = LEFT( @value, 7 )
DECLARE @v8 varchar(8) = LEFT( @value, 8 )
DECLARE @v9 varchar(9) = LEFT( @value, 9 )

SELECT
    *
FROM
    Prefixes
WHERE
    Value = @v1 OR
    Value = @v2 OR
    Value = @v3 OR
    Value = @v4 OR
    Value = @v5 OR
    Value = @v6 OR
    Value = @v7 OR
    Value = @v8 OR
    Value = @v9

When I run this, it's a lot quicker (using an Index Seek) but it feels like a hack, but because I know the length is less than 10 characters I'm okay with it... for now.

Is there a better way? Is there a way SQL Server could do my prefix matching internally (i.e. using the same logic in my final example but without using repetitive and brittle SQL)?

like image 390
Dai Avatar asked Apr 01 '26 07:04

Dai


1 Answers

This is something an auxiliary numbers table can help with.

As you only need 1-10 I've made one inline in the query rather than assuming that one exists.

You could shorten the code by replacing the derived table V with a reference to a permanent numbers table if you have one or can create one.

SELECT IIF(EXISTS (SELECT *
                   FROM   (VALUES(1),(2),(3),
                                 (4),(5),(6),
                                 (7),(8),(9),(10)
                          ) V(number)
                          JOIN Prefixes P WITH(FORCESEEK)
                            ON P.Value = LEFT(@value, number)
                   WHERE  number <= LEN(@value)), 1, 0) AS PrefixExists 

enter image description here

  |--Compute Scalar(DEFINE:([Expr1014]=CASE WHEN [Expr1015] THEN (1) ELSE (0) END))
       |--Nested Loops(Left Semi Join, DEFINE:([Expr1015] = [PROBE VALUE]))
            |--Constant Scan
            |--Nested Loops(Inner Join, OUTER REFERENCES:([Union1010]))
                 |--Filter(WHERE:([Union1010]<=len([@value])))
                 |    |--Constant Scan(VALUES:(((1)),((2)),((3)),((4)),((5)),((6)),((7)),((8)),((9)),((10))))
                 |--Index Seek(OBJECT:([tempdb].[dbo].[Prefixes].[IX_Value] AS [P]), SEEK:([P].[Value]=substring([@value],(1),[Union1010])) ORDERED FORWARD)
like image 177
Martin Smith Avatar answered Apr 04 '26 01:04

Martin Smith



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!