
Handling large SQL select queries / Read sql data in chunks

I'm using .NET 4.0 and SQL Server 2008 R2.

I'm running a big SQL SELECT query that returns millions of rows and takes a long time to run to completion.

Does anyone know how I can read some of the results without having to wait for the whole query to complete?

In other words, I want to read the results in chunks of 10,000 records while the query is still running and producing the next results.

Omri asked Apr 20 '11 06:04



2 Answers

It depends in part on whether the query itself is streaming, or whether it does lots of work in temporary tables then (finally) starts returning data. You can't do much in the second scenario except re-write the query; however, in the first case an iterator block would usually help, i.e.

public IEnumerable<Foo> GetData() {
     // not shown; building command etc
     using(var reader = cmd.ExecuteReader()) {
         while(reader.Read()) {
             Foo foo = // not shown; materialize Foo from reader
             yield return foo;
         }
     }
}

This is now a streaming iterator - you can foreach over it and it will retrieve records live from the incoming TDS data without buffering all the data first.

If you (perhaps wisely) don't want to write your own materialization code, there are tools that will do this for you - for example, LINQ-to-SQL's ExecuteQuery<T>(tsql, args) will do the above pain-free.
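
To get the 10,000-record chunks the question asks for out of a streaming iterator like GetData(), one option is to group the items as they arrive. The following is only a minimal sketch: Batch and BatchExtensions are hypothetical names, not part of .NET 4.0 or of the answer above, and it buffers at most one chunk in memory at a time.

```csharp
using System;
using System.Collections.Generic;

public static class BatchExtensions
{
    // Hypothetical helper: groups a lazily produced sequence into lists of
    // at most batchSize items. The source is pulled one item at a time, so
    // only the batch currently being filled is held in memory.
    public static IEnumerable<List<T>> Batch<T>(this IEnumerable<T> source, int batchSize)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");
        return BatchIterator(source, batchSize);
    }

    private static IEnumerable<List<T>> BatchIterator<T>(IEnumerable<T> source, int batchSize)
    {
        var batch = new List<T>(batchSize);
        foreach (T item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;               // hand a full chunk to the caller
                batch = new List<T>(batchSize);   // start filling the next one
            }
        }
        if (batch.Count > 0)
            yield return batch;                   // final, possibly short, chunk
    }
}
```

With this in place, something like foreach (List<Foo> chunk in GetData().Batch(10000)) would receive the first chunk as soon as 10,000 rows have been read, provided the query itself streams.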

Marc Gravell answered Nov 03 '22 13:11


You'd need to use data paging.

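SQL Server has the TOP clause (SELECT TOP 10 a, b, c FROM d) and the BETWEEN operator, which can be combined. Here id is a placeholder for whatever key column you page on, and X and Y are the bounds of the current page:

SELECT TOP 10000 a, b, c FROM d WHERE id BETWEEN X AND Y ORDER BY id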

With this, you should be able to retrieve N rows, do some partial processing, then load the next N rows, and so on.
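
The load-process-repeat loop could be sketched roughly as below. Note that this uses keyset paging (id > last key seen) rather than fixed BETWEEN ranges, which is a swapped-in variation, and it assumes a unique, indexed, increasing key column (called id here purely for illustration). KeysetPager, ReadPages, fetchPage and getKey are hypothetical names; fetchPage stands in for whatever code actually runs the query.

```csharp
using System;
using System.Collections.Generic;

public static class KeysetPager
{
    // fetchPage(lastKey, pageSize) is a hypothetical callback expected to run
    // something like:
    //   SELECT TOP (@pageSize) id, a, b, c FROM d WHERE id > @lastKey ORDER BY id
    // and return the rows in key order. getKey extracts the key from a row.
    public static IEnumerable<IList<T>> ReadPages<T>(
        Func<long, int, IList<T>> fetchPage,
        Func<T, long> getKey,
        long startAfterKey,
        int pageSize)
    {
        long lastKey = startAfterKey;
        while (true)
        {
            IList<T> page = fetchPage(lastKey, pageSize);
            if (page.Count == 0)
                yield break;                          // nothing left to read

            yield return page;                        // caller processes this page

            if (page.Count < pageSize)
                yield break;                          // short page: that was the last one

            lastKey = getKey(page[page.Count - 1]);   // resume after the last row seen
        }
    }
}
```

Each page is a separate query, so this only helps if the database can locate the next page cheaply (for example via an index on the key); it does not make an inherently slow query return sooner.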

This can be achieved by implementing a multithreaded solution: one thread retrieves results while another waits asynchronously for data and processes it as it arrives.
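
One way to sketch that two-thread arrangement on .NET 4.0 is with BlockingCollection<T> and a Task. Pipeline and Run are hypothetical names, source stands in for whatever enumerates the rows or chunks, and cancellation and consumer-side error handling are deliberately left out to keep the example short.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class Pipeline
{
    // Enumerates 'source' on a background task and passes each item to
    // 'process' on the calling thread. boundedCapacity limits how far the
    // reader may run ahead of the processor.
    public static void Run<T>(IEnumerable<T> source, Action<T> process, int boundedCapacity)
    {
        using (var queue = new BlockingCollection<T>(boundedCapacity))
        {
            Task producer = Task.Factory.StartNew(() =>
            {
                try
                {
                    foreach (T item in source)
                        queue.Add(item);        // blocks while the queue is full
                }
                finally
                {
                    queue.CompleteAdding();     // always release the consumer, even on error
                }
            }, TaskCreationOptions.LongRunning);

            // Blocks until an item is available; ends once adding is complete
            // and the queue has been drained.
            foreach (T item in queue.GetConsumingEnumerable())
                process(item);

            producer.Wait();                    // surfaces producer failures as AggregateException
        }
    }
}
```

The bounded capacity is a design choice: it keeps the reader from pulling millions of rows into memory when processing is slower than retrieval.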

Matías Fidemraizer answered Nov 03 '22 15:11