Optimising LIKE expressions that start with wildcards

Tags: sql, sql-server, indexing, sql-like, wildcard

I have a table in a SQL Server database with an address field (e.g. 1 Farnham Road, Guildford, Surrey, GU2XFF) which I want to search with a wildcard before and after the search string.

SELECT *
FROM Table
WHERE Address_Field LIKE '%nham%'

I have around 2 million records in this table and I'm finding that queries take anywhere from 5 to 10 seconds, which isn't ideal. I believe this is because of the leading wildcard.

I think I'm right in saying that indexes can't be used for seek operations because of the leading wildcard.

Using full text searching and CONTAINS isn't possible because I want to search for the latter parts of words (I know that you could replace the search string with Guil* in the query below and it would return results). Running the following certainly returns no results:

SELECT *
FROM Table
WHERE CONTAINS(Address_Field, '"nham"')

Is there any way to optimise queries with leading wildcards?

asked Jan 26 '17 17:01

hwilson1

2 Answers

Here is one (not really recommended) solution.

Create a table AddressSubstrings. This table would have multiple rows per address, each carrying the primary key of the corresponding row in Table.

When you insert an address into Table, also insert the substrings starting from each position. So, if you want to insert 'abcd', then you would insert:

abcd
bcd
cd
d

along with the unique id of the row in Table. (This can all be done using a trigger.)

Create an index on AddressSubstrings(AddressSubstring).

Then you can phrase your query as:

SELECT *
FROM Table t JOIN
     AddressSubstrings ads
     ON t.table_id = ads.table_id
WHERE ads.AddressSubstring LIKE 'nham%';

Now there will be a matching row starting with nham, so LIKE should be able to make use of the index (a full-text index also works).
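
The steps above could be sketched in T-SQL roughly as follows. This is only an illustration under stated assumptions, not the answerer's actual code: dbo.Addresses stands in for Table, the start_pos column, the 200-character limit and all object names are hypothetical, and only inserts are handled (updates and deletes would need their own triggers).

```sql
-- Hypothetical stand-in for the question's Table.
CREATE TABLE dbo.Addresses (
    table_id      INT IDENTITY(1,1) PRIMARY KEY,
    Address_Field VARCHAR(200) NOT NULL
);

-- One row per starting position of each address.
CREATE TABLE dbo.AddressSubstrings (
    table_id         INT          NOT NULL REFERENCES dbo.Addresses (table_id),
    start_pos        INT          NOT NULL,
    AddressSubstring VARCHAR(200) NOT NULL,
    PRIMARY KEY (table_id, start_pos)
);

CREATE INDEX IX_AddressSubstrings_AddressSubstring
    ON dbo.AddressSubstrings (AddressSubstring);
GO  -- batch separator for SSMS/sqlcmd; CREATE TRIGGER must start a batch

CREATE TRIGGER dbo.trg_Addresses_Insert
ON dbo.Addresses
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Generate the numbers 1..200 (assumed maximum address length).
    WITH Numbers AS (
        SELECT TOP (200) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
        FROM sys.all_objects
    )
    INSERT INTO dbo.AddressSubstrings (table_id, start_pos, AddressSubstring)
    SELECT i.table_id, num.n, SUBSTRING(i.Address_Field, num.n, 200)
    FROM inserted AS i
    JOIN Numbers AS num
      ON num.n <= LEN(i.Address_Field);
END;
GO

-- EXISTS avoids duplicate rows when 'nham' occurs more than once in an address.
SELECT a.*
FROM dbo.Addresses AS a
WHERE EXISTS (
    SELECT 1
    FROM dbo.AddressSubstrings AS s
    WHERE s.table_id = a.table_id
      AND s.AddressSubstring LIKE 'nham%'
);
```

The storage cost is the reason this is hard to recommend: an address of n characters produces n rows holding n(n+1)/2 characters in total, so a 40-character address becomes 40 rows and 820 characters.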

If you are interested in the right way to handle this problem, a reasonable place to start is the Postgres documentation. This uses a method similar to the above, but using n-grams. The only problem with n-grams for your particular problem is that they require rewriting the comparison as well as changing how the data is stored.
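
For reference, here is a minimal sketch of what the n-gram approach looks like in PostgreSQL itself, assuming the feature the answer has in mind is the pg_trgm extension (the answer does not name it). The addresses table and the index name are hypothetical. This is PostgreSQL syntax and will not run on SQL Server.

```sql
-- PostgreSQL only. pg_trgm ships as a contrib extension.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical table mirroring the question's data.
CREATE TABLE addresses (
    table_id      serial PRIMARY KEY,
    address_field text NOT NULL
);

-- GIN index over the trigrams of each address.
CREATE INDEX addresses_address_trgm_idx
    ON addresses USING gin (address_field gin_trgm_ops);

-- With the trigram index in place, the planner can use it for
-- LIKE patterns that have a leading wildcard.
SELECT *
FROM addresses
WHERE address_field LIKE '%nham%';
```

In PostgreSQL the LIKE predicate itself stays unchanged; a hand-built n-gram table in SQL Server would need the rewritten comparison the answer mentions.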

answered Sep 30 '22 19:09

Gordon Linoff

I can't offer a complete solution to this difficult problem.

But if you're looking to create a suffix search capability, in which, for example, you'd be able to find the row containing HWilson with ilson and the row containing ABC123000654 with 654, here's a suggestion.

  WHERE REVERSE(textcolumn) LIKE REVERSE('ilson') + '%'

Of course this isn't sargable the way I wrote it here. But many modern DBMSs, including recent versions of SQL Server, allow the definition, and indexing, of computed or virtual columns, so the reversed value can be stored in such a column and indexed.
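
A minimal T-SQL sketch of that idea, assuming a hypothetical dbo.Records table and a textcolumn short enough to be an index key (the table, column and index names are illustrative, not taken from the system described):

```sql
-- Hypothetical table for illustration.
CREATE TABLE dbo.Records (
    record_id  INT IDENTITY(1,1) PRIMARY KEY,
    textcolumn VARCHAR(100) NOT NULL
);
GO

-- Computed column holding the reversed text; REVERSE is deterministic,
-- so the column can be persisted and indexed.
ALTER TABLE dbo.Records
    ADD textcolumn_reversed AS REVERSE(textcolumn) PERSISTED;
GO

CREATE INDEX IX_Records_textcolumn_reversed
    ON dbo.Records (textcolumn_reversed);
GO

-- A suffix search becomes a prefix search on the reversed column,
-- which allows an index seek.
SELECT record_id, textcolumn
FROM dbo.Records
WHERE textcolumn_reversed LIKE REVERSE('ilson') + '%';
```

This only covers "ends with" searches. A pattern with wildcards on both sides, such as '%nham%', still cannot seek on this index.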

I've deployed this technique, to the delight of end users, in a health-care system with lots of record IDs like ABC123000654.

answered Sep 30 '22 18:09

O. Jones
