
Very slow COUNT with 7 million rows

I have more than 7 million rows in a table, and the query

SELECT COUNT(*) FROM MyTable where MyColumn like '%some string%'

returns a count of about 20,000 and takes more than 13 seconds.

The table has a nonclustered index on MyColumn.

Is there any way to improve the speed?

jong shin asked Nov 03 '11 06:11



3 Answers

Leading-wildcard searches cannot be optimised in T-SQL: the optimiser cannot seek on an index for LIKE '%...', so at best it scans the whole index.

Look at SQL Server's full-text search.
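As a rough sketch of what that could look like, assuming MyTable has a unique, single-column, non-nullable index: the names MyCatalog and PK_MyTable below are placeholders, not taken from the question. Note that full-text search matches whole words and word prefixes, not arbitrary substrings, so it is not a drop-in replacement for LIKE '%some string%' and the counts may differ.

```sql
-- Placeholder names: MyCatalog and PK_MyTable are assumptions for illustration.
CREATE FULLTEXT CATALOG MyCatalog;

-- The KEY INDEX must be a unique, single-column, non-nullable index on MyTable.
CREATE FULLTEXT INDEX ON MyTable (MyColumn)
    KEY INDEX PK_MyTable
    ON MyCatalog
    WITH CHANGE_TRACKING AUTO;

-- Count rows containing the exact phrase "some string" (word-based match).
SELECT COUNT(*)
FROM MyTable
WHERE CONTAINS(MyColumn, '"some string"');

-- Count rows containing any word that starts with "some" (prefix match).
SELECT COUNT(*)
FROM MyTable
WHERE CONTAINS(MyColumn, '"some*"');
```

The full-text index is populated in the background, so queries issued right after creating it can return incomplete counts until the initial population finishes.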

gbn answered Oct 02 '22 08:10


You could try a full-text search, or a text search engine such as Lucene.

Mark Byers answered Oct 02 '22 08:10


Try using a binary collation first. This replaces the complex Unicode comparison rules with a simple byte comparison, at the cost of making the match case- and accent-sensitive.

SELECT COUNT(*) 
FROM MyTable 
WHERE MyColumn COLLATE Latin1_General_BIN2 LIKE '%some string%'

Also, have a look at the chapter titled 'Build your own index', written by Erland Sommarskog, in SQL Server MVP Deep Dives.

The basic idea is that you impose a restriction on the user and require the search string to be at least three contiguous characters long. Next, you extract all three-letter sequences from the MyColumn field and store these fragments in a table together with the MyTable.id they belong to. When looking for a string, you split it into three-letter fragments as well and look up which record ids they belong to. This way you find the matching strings a lot quicker. This is the strategy in a nutshell.

The book describes implementation details and ways to optimise this further.
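A minimal sketch of the fragment idea, not the book's implementation. It assumes MyTable has an integer key column named id, that MyColumn is at most 4000 characters long, and that the search string has at least three characters and no LIKE wildcard characters; the table name MyTableTrigrams and the Numbers helper are made up for illustration.

```sql
-- Hypothetical fragment table: one row per distinct (trigram, row id) pair.
CREATE TABLE MyTableTrigrams (
    trigram nchar(3) NOT NULL,
    id      int      NOT NULL,
    PRIMARY KEY CLUSTERED (trigram, id)
);

-- Populate it: Numbers supplies the start positions 1..4000 for SUBSTRING.
WITH Numbers AS (
    SELECT TOP (4000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_objects AS a
    CROSS JOIN sys.all_objects AS b
)
INSERT INTO MyTableTrigrams (trigram, id)
SELECT DISTINCT SUBSTRING(t.MyColumn, n.n, 3), t.id
FROM MyTable AS t
JOIN Numbers AS n
    ON n.n <= LEN(t.MyColumn) - 2;

-- Search: split the search string into trigrams, keep only the ids that
-- contain every one of them, then confirm with LIKE on that small set.
DECLARE @search nvarchar(100) = N'some string';

WITH Numbers AS (
    SELECT TOP (100) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_objects
),
SearchTrigrams AS (
    SELECT DISTINCT SUBSTRING(@search, n, 3) AS trigram
    FROM Numbers
    WHERE n <= LEN(@search) - 2
),
Candidates AS (
    SELECT g.id
    FROM MyTableTrigrams AS g
    JOIN SearchTrigrams AS s
        ON s.trigram = g.trigram
    GROUP BY g.id
    HAVING COUNT(*) = (SELECT COUNT(*) FROM SearchTrigrams)
)
SELECT COUNT(*)
FROM Candidates AS c
JOIN MyTable AS t
    ON t.id = c.id
WHERE t.MyColumn LIKE N'%' + @search + N'%';
```

The final LIKE is still needed because a row can contain every trigram without containing them contiguously; the fragment table only narrows the candidates so that LIKE runs on far fewer rows. Keeping MyTableTrigrams in sync with inserts, updates and deletes on MyTable is left out of this sketch.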

Chris Bednarski answered Oct 05 '22 08:10