Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Unbuffered result set in MySQL golang driver

Tags:

mysql

go

I have a large query and I want to process the result row-by-row using the Go MySQL driver. The query is very simple but it returns a huge number of rows.

I have seen mysql_use_result() vs mysqli_store_result() at the C-API level. Is there an equivalent way to do an unbuffered query over a TCP connection, such as is used by the Go MySQL driver?

like image 821
chowey Avatar asked Mar 13 '15 02:03

chowey


1 Answers

This concept of buffered/unbuffered queries in database client libraries is a bit misleading, because actually, buffering may occur on multiple levels. In general (i.e. not Go specific, and not MySQL specific), you have different kinds of buffers.

  • TCP socket buffers. The kernel associates a communication buffer to each socket. By default, the size of this buffer is dynamic and controlled by kernel parameters. Some clients can change those defaults to get more control and optimize. Purpose if this buffer is to regulate the traffic in the device queues and eventually, decrease the number of packets on the network.

  • Communication buffers. Database oriented protocols are generally based on a framing protocol, meaning that frames are defined to separate the logical packets in the TCP stream. Socket buffers do not guarantee that a complete logical packet (a frame) is available for reading. Extra communication buffers are therefore required to make sure the frames are complete when they are processed. They can also help to reduce the number of system calls. These buffers are managed by the low-level communication mechanism of the database client library.

  • Rows buffers. Some database clients choose to keep all the rows read from the server in memory, and let the application code browse the corresponding data structures. For instance, the PostgreSQL C client (libpq) does it. The MySQL C client leaves the choice to the developer (by calling mysql_use_result or mysql_store_result).

Anyway, the Go driver you mention is not based on the MySQL C client (it is a pure Go driver). It uses only the two first kinds of buffers (sockets, and communication buffers). Row level buffering is not provided.

There is one communication buffer per MySQL connection. Its size is a multiple of 4 KB. It will grow dynamically if the frames are large. In the MySQL protocol, each row is sent as a separate packet (in a frame), so the size of the communication buffer is directly linked to the largest rows received/sent by the client.

The consequence is you can run a query returning a huge number of rows without saturating the memory, and still having good socket performance. With this driver, buffering is never a problem for the developer, whatever the query.

like image 175
Didier Spezia Avatar answered Oct 23 '22 19:10

Didier Spezia