
How to search millions of records in a SQL table faster?

I have a SQL table with millions of domain names. But now when I search for, let's say,

SELECT * 
  FROM tblDomainResults 
 WHERE domainName LIKE '%lifeis%'

It takes more than 10 minutes to get the results. I tried indexing but that didn't help.

What is the best way to store these millions of records so that the information can be accessed in a short period of time?

There are about 50 million records and 5 columns so far.

user737063 asked May 03 '11 23:05



2 Answers

Most likely, you tried a traditional index, which cannot be used to optimize LIKE queries unless the pattern begins with a fixed string (e.g. 'lifeis%').

What you need for your query is a full-text index. Most DBMSs support one these days.
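
As a rough sketch of both points, assuming SQL Server (the question does not name a DBMS): the index name, the catalog name, and PK_tblDomainResults (standing in for whatever unique, single-column, non-nullable key index the table really has) are all hypothetical. Keep in mind that a full-text prefix term matches the start of an indexed word, so it is not an exact substitute for '%lifeis%', which matches anywhere in the string.

```sql
-- An ordinary index can serve a pattern anchored at the start of the string.
-- IX_tblDomainResults_domainName is an illustrative name.
CREATE INDEX IX_tblDomainResults_domainName
    ON dbo.tblDomainResults (domainName);

SELECT domainName
FROM dbo.tblDomainResults
WHERE domainName LIKE 'lifeis%';   -- fixed prefix, so an index seek is possible

-- Full-text alternative (requires the Full-Text Search feature).
-- DomainCatalog and PK_tblDomainResults are hypothetical names.
CREATE FULLTEXT CATALOG DomainCatalog;

CREATE FULLTEXT INDEX ON dbo.tblDomainResults (domainName)
    KEY INDEX PK_tblDomainResults
    ON DomainCatalog;

-- Prefix term: matches indexed words that begin with "lifeis".
SELECT domainName
FROM dbo.tblDomainResults
WHERE CONTAINS(domainName, '"lifeis*"');
```

Other systems use different syntax for the same idea, so check your DBMS's documentation before relying on this form.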

Igor Nazarenko answered Sep 21 '22 11:09


Assuming that your 50 million row table includes duplicates (perhaps that is part of the problem), and assuming SQL Server (the syntax may change but the concept is similar on most RDBMSes), another option is to store domains in a lookup table, e.g.

CREATE TABLE dbo.Domains
(
    DomainID INT IDENTITY(1,1) PRIMARY KEY,
    DomainName VARCHAR(255) NOT NULL
);
CREATE UNIQUE INDEX dn ON dbo.Domains(DomainName);

When you load new data, check whether any of the domain names are new, and insert those into the Domains table. Then, in your big table, you store only the DomainID. Not only will this keep your 50 million row table much smaller, it will also make lookups like this much more efficient.

SELECT * -- please specify column names
FROM dbo.tblDomainResults AS dr
INNER JOIN dbo.Domains AS d
ON dr.DomainID = d.DomainID
WHERE d.DomainName LIKE '%lifeis%';

Of course, except on the tiniest of tables, it always helps to avoid LIKE clauses with a leading wildcard.
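
The load step described earlier (insert only the domain names that are new, then store the DomainID in the big table) might look roughly like this. dbo.StagingDomainResults is a hypothetical staging table holding the freshly loaded rows, with a domainName column; it is not part of the original schema.

```sql
-- 1. Add domain names that are not yet in the lookup table.
--    dbo.StagingDomainResults is a hypothetical staging table.
INSERT INTO dbo.Domains (DomainName)
SELECT DISTINCT s.domainName
FROM dbo.StagingDomainResults AS s
WHERE NOT EXISTS
(
    SELECT 1
    FROM dbo.Domains AS d
    WHERE d.DomainName = s.domainName
);

-- 2. Resolve each staged row to its DomainID. The result is what would be
--    inserted into the big table, alongside its other columns.
SELECT d.DomainID
FROM dbo.StagingDomainResults AS s
INNER JOIN dbo.Domains AS d
    ON d.DomainName = s.domainName;
```

The unique index on DomainName keeps the NOT EXISTS check cheap and guards against duplicates if two loads run at the same time.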

Aaron Bertrand answered Sep 23 '22 11:09