Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL group-by very slow

I have the folowwing SQL query

SELECT CustomerID FROM sales WHERE `Date` <= '2012-01-01' GROUP BY CustomerID

The query is executed over 11400000 rows and runs very slow. It takes over 3 minutes to execute. If I remove the group-by part, this runs below 1 second. Why is that?

MySQL Server version is '5.0.21-community-nt'

Here is the table schema:
CREATE TABLE `sales` (
  `ID` int(11) NOT NULL auto_increment,
  `DocNo` int(11) default '0',
  `CustomerID` int(11) default '0',
  `OperatorID` int(11) default '0',
  PRIMARY KEY  (`ID`),
  KEY `ID` (`ID`),
  KEY `DocNo` (`DocNo`),
  KEY `CustomerID` (`CustomerID`),
  KEY `Date` (`Date`)
) ENGINE=MyISAM AUTO_INCREMENT=14946509 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
like image 972
Treach Avatar asked Apr 23 '12 10:04

Treach


People also ask

Does GROUP BY Make query slow?

Most of your queries are logically invalid, because you are selecting non aggregate columns not mentioned in the GROUP BY clause. That being said, aggregation takes time, and slowness is to be expected, assuming your tables are fairly large.

Is GROUP BY slow in SQL?

Conclusion. GROUP BY is a powerful statement, but it tends to slow down queries. Over time, my team and I have used it many times and defined SQL indexes to avoid the performance issues introduced by the GROUP BY clause, especially when dealing with large tables.

How to optimize GROUP BY MySQL?

The most general way to satisfy a GROUP BY clause is to scan the whole table and create a new temporary table where all rows from each group are consecutive, and then use this temporary table to discover groups and apply aggregate functions (if any).


2 Answers

Try putting an index on (Date,CustomerID).

Have a look at the mysql manual for optimizing group by queries:- Group by optimization

You can find out how mysql is generating the result if you use EXPLAIN as follows:-

EXPLAIN SELECT CustomerID FROM sales WHERE `Date` <= '2012-01-01' GROUP BY CustomerID

This will tell you which indexes (if any) mysql is using to optimize the query. This is very handy when learning which indexes work for which queries as you can try creating an index and see if mysql uses it. So even if you don't fully understand how mysql calculates aggregate queries you can create a useful index by trial and error.

like image 50
rgvcorley Avatar answered Sep 19 '22 06:09

rgvcorley


Without knowing what your table schema looks like, it's difficult to be certain, but it would probably help if you added a multiple-column index on Date and CustomerID. That'd save MySQL the hassle of doing a full table scan for the GROUP BY statement. So try ALTER TABLE sales ADD INDEX (Date,CustomerID).

like image 24
Daan Avatar answered Sep 20 '22 06:09

Daan