Does executing a statement always use up memory for the result set?

A colleague told me that when the database server executes an SQL statement, it always puts the result data into RAM (or swap). Thus it is not practical to select large result sets.

I thought that code like this

my $sth = $dbh->prepare('SELECT million_rows FROM table');
$sth->execute();
while (my @data = $sth->fetchrow_array) {
    # process the row
}

retrieves the result set row by row, without loading it all into RAM. But I can't find any reference to this in the DBI or MySQL docs. How is the result set actually created and retrieved? Does it work the same for simple selects and for joins?

asked Nov 12 '10 by planetp

3 Answers

Your colleague is right.

By default, the Perl module DBD::mysql uses mysql_store_result, which does indeed read in all the SELECT data and cache it in the client's RAM. Unless you change that default, when you fetch row by row in DBI, you are just reading rows out of that memory buffer.

This is usually what you want unless you have very, very large result sets. Otherwise, until you fetch the last of the data, mysqld has to hold it ready, and my understanding is that this can block writes to the same rows (or is it whole tables?).

Keep in mind, modern machines have a lot of RAM. A million-row result set is usually not a big deal. Even if each row is quite large at 1 KB, that's only 1 GB RAM plus overhead.

If you're going to process millions of rows of BLOBs, maybe you do want mysql_use_result -- or you want to SELECT those rows in chunks with progressive uses of LIMIT x,y.
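
A minimal sketch of the streaming approach, reusing the placeholder table and column names from the question (the mysql_use_result prepare attribute is documented in DBD::mysql; the DSN and credentials are hypothetical):

use DBI;

# Hypothetical connection; adjust the DSN and credentials for your setup.
my $dbh = DBI->connect('DBI:mysql:database=test', 'user', 'password',
                       { RaiseError => 1 });

# Ask DBD::mysql for an unbuffered (streaming) result set.
my $sth = $dbh->prepare('SELECT million_rows FROM table',
                        { mysql_use_result => 1 });
$sth->execute();
while (my @data = $sth->fetchrow_array) {
    # rows are streamed from mysqld one at a time instead of
    # being buffered in client RAM first
}
$sth->finish;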

See mysql_use_result and mysql_store_result in perldoc DBD::mysql for details.

answered by Jamie McCarthy


This is not true (if we are talking about the database server itself, not client layers).

MySQL can buffer the whole result set, but this is not necessarily done, and if it is done, not necessarily in RAM.

The result set is buffered if you use an inline view (SELECT ... FROM (SELECT …)), if the query needs to sort (shown as "Using filesort" in the query plan), or if the plan requires a temporary table (shown as "Using temporary").

Even when "Using temporary" applies, MySQL keeps the table in memory only while its size does not exceed the limit set by tmp_table_size. When the table grows over this limit, it is converted to an on-disk MyISAM table.

You may, though, explicitly instruct MySQL to buffer the result set by adding the SQL_BUFFER_RESULT modifier to the outermost SELECT.
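
As a rough sketch (the query and column names are placeholders, and the ORDER BY is only there to provoke a sort):

# Inspect the plan: look for "Using temporary" / "Using filesort"
# in the Extra column of the EXPLAIN output.
my $plan = $dbh->selectall_arrayref(
    'EXPLAIN SELECT million_rows FROM table ORDER BY million_rows',
    { Slice => {} }
);
for my $row (@$plan) {
    print(($row->{Extra} // ''), "\n");
}

# Explicitly ask the server to buffer the result set:
my $sth = $dbh->prepare(
    'SELECT SQL_BUFFER_RESULT million_rows FROM table');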

See the docs for more detail.

answered by Quassnoi


No, that is not how it works.

The database will not hold the rows in RAM/swap.

However, it will try hard (and MySQL does try hard here) to cache as much as possible: indexes, results, etc. Your MySQL configuration sets the sizes of the memory buffers for the various caches (and for the different storage engines); you should not allow these caches to swap.

Test it
Bottom line: it should be very easy to test this using the client only. (I don't know Perl's DBI; it might be doing something that forces MySQL to load everything on prepare, but I doubt it.) Anyway... test it:

Prepare SELECT SQL_NO_CACHE million_rows FROM table and then fetch only a few rows out of the millions. Then compare the performance with SELECT SQL_NO_CACHE only_fetched_rows FROM table and see how that fares. If the performance is comparable (and fast), then I believe you can call your colleague's bluff.
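
A minimal sketch of that comparison in Perl (the timing scaffolding and the LIMIT variant are mine, not from the answer):

use Time::HiRes qw(time);

# Huge SELECT, but fetch only a handful of rows.
my $t0  = time;
my $sth = $dbh->prepare('SELECT SQL_NO_CACHE million_rows FROM table');
$sth->execute();
$sth->fetchrow_array for 1 .. 10;   # stop after a few rows
$sth->finish;                       # discard the rest
printf "big select, few fetches: %.3fs\n", time - $t0;

# Compare with a SELECT that returns only those few rows.
$t0 = time;
my $rows = $dbh->selectall_arrayref(
    'SELECT SQL_NO_CACHE million_rows FROM table LIMIT 10');
printf "small select: %.3fs\n", time - $t0;

If the client is buffering (the mysql_store_result default), the first timing should be much slower even though only a few rows were fetched.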

Also, if you enable logging of the statements actually issued to MySQL and give us a transcript of that, then we (non-Perl folks) can give a more definitive answer on what MySQL would do.

answered by Unreason