 

"Streaming" read of over 10 million rows from a table in SQL Server

What is the best strategy to read millions of records from a table (in SQL Server 2012, BI instance), in a streaming fashion (like SQL Server Management Studio does)?

I need to cache these records locally (C# console application) for further processing.

Update - Sample code that works with SqlDataReader

using System;
using System.Data;
using System.Data.SqlClient;
using System.Threading;
using System.Threading.Tasks;


namespace ReadMillionsOfRows
{
    class Program
    {
        static ManualResetEvent done = new ManualResetEvent(false);


        static void Main(string[] args)
        {
            // Kick off the async read; block until Process() signals completion.
            Process();
            done.WaitOne();
        }

        public static async Task Process()
        {
            string connString = @"Server=;Database=;User Id=;Password=;Asynchronous Processing=True";
            string sql = "Select * from tab_abc";

            using (SqlConnection conn = new SqlConnection(connString))
            {
                await conn.OpenAsync();
                using (SqlCommand comm = new SqlCommand(sql, conn))
                {
                    comm.CommandType = CommandType.Text;

                    // The reader streams rows forward-only; only the current
                    // row is held in memory at any time.
                    using (SqlDataReader reader = await comm.ExecuteReaderAsync())
                    {
                        while (await reader.ReadAsync())
                        {
                            // Process the current row here, e.g. reader.GetInt32(0)
                        }
                    }
                }
            }

            done.Set();
        }

    }
}
asked Jan 16 '23 by Ram Sundar

2 Answers

Use a SqlDataReader: it is forward-only and fast. It only holds a reference to a record while that record is in the scope of being read.

answered Feb 04 '23 by awright18


That depends on what your cache looks like. If you're going to store everything in memory and a DataSet is appropriate as a cache, just read everything into the DataSet.

If not, use the SqlDataReader as suggested above and read the records one by one, storing them in your big cache.
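The DataSet route can be sketched with SqlDataAdapter.Fill, which buffers the entire result set in memory in one call. The connection string below is a placeholder and the table name tab_abc is taken from the question; with 10+ million rows, check that they actually fit in RAM before choosing this path:

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

class DataSetCacheSketch
{
    static void Main()
    {
        // Placeholder connection string; substitute your own server/credentials.
        string connString = "Server=.;Database=MyDb;Integrated Security=true";
        string sql = "SELECT * FROM tab_abc";

        var table = new DataTable();
        using (var conn = new SqlConnection(connString))
        using (var adapter = new SqlDataAdapter(sql, conn))
        {
            // Fill opens the connection, reads every row into the DataTable,
            // and closes the connection again.
            adapter.Fill(table);
        }

        Console.WriteLine(table.Rows.Count);
    }
}
```

Unlike the SqlDataReader loop, this trades memory for random access: once filled, the DataTable can be indexed, filtered, and re-read without touching the database again.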

Do note, however, that there's already a very popular caching mechanism for large database tables: your database. With the proper index configuration, the database can probably outperform your cache.
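For instance, a nonclustered index covering the columns your processing step filters on lets SQL Server answer those reads directly from the index. This is purely illustrative; the column names below are hypothetical, since the question's table schema isn't shown:

```sql
-- Hypothetical columns on tab_abc; adjust to the columns you actually query.
CREATE NONCLUSTERED INDEX IX_tab_abc_status_date
    ON tab_abc (status, created_date)
    INCLUDE (payload);
```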

answered Feb 04 '23 by zmbq