Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQLdb is extremely slow with large result sets

I executed the following query both in phpMyAdmin & MySQLdb (python).

SELECT *, (SELECT CONCAT(`id`, '|', `name`, '|', `image_code`)
FROM `model_artist` WHERE `id` = `artist_id`) as artist_data, 
FIND_IN_SET("metallica", `searchable_words`) as find_0
FROM `model_song` HAVING find_0

phpMyAdmin said the query took 2ms. My python code said that using MySQLdb the query took 848ms (without even fetching the results).

The python code:

self.db = MySQLdb.connect(host="localhost", user="root", passwd="", db="ibeat")
self.cur = self.db.cursor()

millis = lambda: time.time() * 1000

start_time = millis()
self.cur.execute_cmd("""SELECT *, (SELECT CONCAT(`id`, '|', `name`, '|', `image_code`)
FROM `model_artist` WHERE `id` = `artist_id`) as artist_data, 
FIND_IN_SET("metallica", `searchable_words`) as find_0
FROM `model_song` HAVING find_0""")
print millis() - start_time
like image 509
Zippo Avatar asked May 13 '11 15:05

Zippo


People also ask

Why MySQL could be slow with large tables?

In all likelihood, the table does not fit into memory on your machine, so this results in about 1300 accesses to the files on disk. That explains why you are seeing a lag time of several seconds. Another solution is to ensure that the data tables themselves fit into memory.

Why does MySQL query take so long to execute?

There are a number of things that may cause a query to take longer time to execute: Inefficient query – Use non-indexed columns while lookup or joining, thus MySQL takes longer time to match the condition. Table lock – The table is locked, by global lock or explicit table lock when the query is trying to access it.


2 Answers

If you expect an SQL query to have a large result set which you then plan to iterate over record-by-record, then you may want to consider using the MySQLdb SSCursor instead of the default cursor. The default cursor stores the result set in the client, whereas the SSCursor stores the result set in the server. Unlike the default cursor, the SSCursor will not incur a large initial delay if all you need to do is iterate over the records one-by-one.

You can find a bit of example code on how to use the SSCursor here.

For example, try:

import MySQLdb.cursors

self.db = MySQLdb.connect(host="localhost", user="root", passwd="", db="ibeat",
                          cursorclass = MySQLdb.cursors.SSCursor)

(The rest of the code can remain the same.)

like image 148
unutbu Avatar answered Sep 29 '22 18:09

unutbu


PHPMyAdmin places a limit on all queries so you don't return large result sets in the interface. So if your query normally returns 1,000,000 rows, and PHPMyAdmin reduces that to 1,000 (or whatever the default is), then you would have to expect a lot longer processing time when Python grabs or even queries the entire result set.

Try placing a limit in Python that matches the limit on PHPMyAdmin to compare times.

like image 33
Mikecito Avatar answered Sep 29 '22 18:09

Mikecito