I have a SQL query which returns over half a million rows to process... The processing doesn't take very long, but I would like to speed it up a bit with some multiprocessing. Considering the code below, is it possible to multithread something like that easily?
using (SqlDataReader reader = command.ExecuteReader())
{
    while (reader.Read())
    {
        // ...process row
    }
}
It would be perfect if I could simply get a cursor at the beginning and another in the middle of the list of results. That way, I could have two threads processing the records. However, SqlDataReader doesn't allow me to do that...
Any idea how I could achieve that?
SqlDataReader is the fastest way. Make sure you use the get-by-ordinal methods rather than the get-by-column-name ones, e.g. GetString(1).
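For illustration, here is a minimal sketch of ordinal access; the Orders table, its Id and Name columns, and the connection string are placeholders. Resolve each ordinal once with GetOrdinal outside the loop rather than paying the name lookup on every row:

using System;
using System.Data.SqlClient;

class OrdinalAccessSketch
{
    static void Main()
    {
        using (var connection = new SqlConnection("<your connection string>"))
        using (var command = new SqlCommand("SELECT Id, Name FROM Orders", connection))
        {
            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader())
            {
                // Look up each ordinal once per query, not once per row.
                int idOrdinal = reader.GetOrdinal("Id");
                int nameOrdinal = reader.GetOrdinal("Name");
                while (reader.Read())
                {
                    int id = reader.GetInt32(idOrdinal);
                    string name = reader.GetString(nameOrdinal);
                    // ...process row
                }
            }
        }
    }
}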
To create a SqlDataReader instance, you must call the ExecuteReader method of a SqlCommand object. The Read method advances to the next row of the result, GetValues populates an array of objects with the column values of the current row, and NextResult moves on to the next result set when a batch returns several.
The SqlDataReader reads one row of the result at a time. It is read-only, which means we can only read the records; they cannot be edited through it. It is also forward-only, which means you cannot go back to a previous row.
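As a minimal sketch of those methods, assuming a hypothetical batch of two placeholder queries: Read walks forward through the rows of the current result set, GetValues copies the current row into an object array, and NextResult advances to the next result set in the batch:

using System;
using System.Data.SqlClient;

class ReaderBasicsSketch
{
    static void Main()
    {
        using (var connection = new SqlConnection("<your connection string>"))
        using (var command = new SqlCommand(
            "SELECT Id, Name FROM TableA; SELECT Id, Name FROM TableB;", connection))
        {
            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader())
            {
                do
                {
                    while (reader.Read())
                    {
                        // Copy the current row's column values into an array.
                        var values = new object[reader.FieldCount];
                        reader.GetValues(values);
                        Console.WriteLine(string.Join(", ", values));
                    }
                } while (reader.NextResult()); // advance to the next result set
            }
        }
    }
}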
Set up a producer/consumer queue, with one producer thread that pulls from the reader and enqueues records as fast as it can, but does no "processing" itself. Then some other number of consumer threads (how many depends on your system) to dequeue and process each queued record.
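For illustration, here is a minimal sketch of that pattern using BlockingCollection<T>; the connection string, query, and ProcessRow body are placeholders, and the consumer count and queue bound should be tuned to your workload:

using System;
using System.Collections.Concurrent;
using System.Data.SqlClient;
using System.Threading.Tasks;

class ProducerConsumerSketch
{
    static void Main()
    {
        // Bounded queue so the producer cannot run arbitrarily far ahead
        // of the consumers and exhaust memory on queued rows.
        using (var queue = new BlockingCollection<object[]>(boundedCapacity: 10000))
        {
            // Producer: drain the reader as fast as possible, no processing here.
            var producer = Task.Run(() =>
            {
                using (var connection = new SqlConnection("<your connection string>"))
                using (var command = new SqlCommand("SELECT ...", connection))
                {
                    connection.Open();
                    using (SqlDataReader reader = command.ExecuteReader())
                    {
                        while (reader.Read())
                        {
                            var row = new object[reader.FieldCount];
                            reader.GetValues(row); // snapshot the current row
                            queue.Add(row);        // blocks if the queue is full
                        }
                    }
                }
                queue.CompleteAdding();            // tell consumers no more rows are coming
            });

            // Consumers: dequeue and process each record.
            var consumers = new Task[Environment.ProcessorCount];
            for (int i = 0; i < consumers.Length; i++)
            {
                consumers[i] = Task.Run(() =>
                {
                    foreach (object[] row in queue.GetConsumingEnumerable())
                    {
                        ProcessRow(row); // placeholder for the per-row work
                    }
                });
            }

            producer.Wait();
            Task.WaitAll(consumers);
        }
    }

    static void ProcessRow(object[] row)
    {
        // ...process row
    }
}

The bounded capacity matters with half a million rows: it throttles the producer when the consumers fall behind, instead of letting the whole result set pile up in memory.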
You shouldn't read that many rows on the client.
That being said, you can partition your query into multiple queries and execute them in parallel. That means launching multiple SqlCommands in separate threads and having each one churn through a partition of the result (a sketch follows the list). The key question is how to partition the result, and this depends largely on your data and your query:

- ID BETWEEN 1 AND 10000, ID BETWEEN 10001 AND 20000, etc.
- RecordTypeID IN (1,2), RecordTypeID IN (3,4), etc.
- ROW_NUMBER() BETWEEN 1 AND 1000, etc. (but this is very problematic to pull off right)
- BINARY_CHECKSUM(*) % 10 = 0, BINARY_CHECKSUM(*) % 10 = 1, etc.

You just have to be very careful that the partition queries do not overlap and block during execution (i.e. scan the same records and acquire X locks), thus serializing each other.
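For illustration, here is a minimal sketch of the ID-range variant; MyTable, its integer Id key, the column list, the connection string, and the range boundaries are all placeholders. Each task opens its own connection, since a single SqlConnection cannot serve concurrent commands:

using System;
using System.Data.SqlClient;
using System.Threading.Tasks;

class PartitionedReadSketch
{
    static void Main()
    {
        // Non-overlapping ID ranges; derive real boundaries from your data.
        (int From, int To)[] ranges = { (1, 250000), (250001, 500000) };

        Parallel.ForEach(ranges, range =>
        {
            // One connection per partition; connections cannot be shared
            // across concurrently executing commands.
            using (var connection = new SqlConnection("<your connection string>"))
            using (var command = new SqlCommand(
                "SELECT Id, ... FROM MyTable WHERE Id BETWEEN @from AND @to", connection))
            {
                command.Parameters.AddWithValue("@from", range.From);
                command.Parameters.AddWithValue("@to", range.To);
                connection.Open();
                using (SqlDataReader reader = command.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        // ...process row for this partition
                    }
                }
            }
        });
    }
}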