
What is the Fastest Way to Select a Whole Table in SQL Server?

I am writing an app that reads a whole table, does some processing, then writes the resulting data to another table. I am using the SqlBulkCopy class (the .NET version of "bcp in"), which does the insert very fast. But I cannot find an efficient way to select the data in the first place. There is no .NET equivalent of "bcp out", which seems strange to me.

Currently I'm using select * from table_name. For perspective, it takes 2.5 seconds to select 6,000 rows and only 600 ms to bulk insert the same number of rows.

I would expect that selecting data should always be faster than inserting. What is the fastest way to select all rows & columns from a table?


Answers to questions:

  • I timed my select at 2.5 seconds in two ways. The first was running my application with a SQL trace; the second was running the same query in SSMS. Both returned about the same result.
  • I am reading data using SqlDataReader.
  • No other applications are using this database.
  • My current processing takes under 1 second, so a 2+ second read time is relatively large. But mostly I'm interested in performance when scaling this up to 100,000 rows and to millions of rows.
  • SQL Server 2008 R2 and my application are both running on my dev machine.
  • Some of the data processing is set-based, so I need to have the whole table in memory. (To support much larger data sets, I know this step will probably need to be moved into SQL so that I only need to operate per row in memory.)

Here is my code:

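// Requires: using System.Data; using System.Data.SqlClient;
DataTable staging = new DataTable();
using (SqlConnection dwConn = (SqlConnection)SqlConnectionManager.Instance.GetDefaultConnection())
{
    dwConn.Open();
    using (SqlCommand cmd = dwConn.CreateCommand())
    {
        cmd.CommandText = "select * from staging_table";

        // Load() pulls every row from the reader into the in-memory DataTable.
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            staging.Load(reader);
        }
    }
}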

asked Mar 10 '11 13:03 by TheSean


2 Answers

select * from table_name is the simplest, easiest and fastest way to read a whole table.

Let me explain why your results lead to wrong conclusions.

  1. Copying a whole table is an optimized operation that merely requires cloning the old binary data into the new table (depending on the storage mechanism, it can be as simple as a file copy).
  2. Writing is buffered. The DBMS reports the record as written once the change is safely in the transaction log; the data pages themselves are flushed to the data files later. Disk operations are generally delayed.
  3. Querying a table (unlike cloning) also requires converting data from the binary storage layout to a driver-dependent format that your client can ultimately read. This takes time.

answered Oct 20 '22 15:10 by usr-local-ΕΨΗΕΛΩΝ


It all depends on your hardware, but it is likely that your network is the bottleneck here.

Apart from limiting your query to just the columns you actually use, a plain select is as fast as it will get. Caching is involved here: when you execute the query twice in a row, the second run should be much faster because the data is cached in memory. Execute dbcc dropcleanbuffers to check the effect of caching.
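
A rough way to see the caching effect is to time the same select cold and then warm. This is only a sketch, and it assumes a test server where you have sysadmin rights, because DBCC DROPCLEANBUFFERS requires them and empties the buffer pool for the whole instance. The connection string and the id and amount columns are hypothetical stand-ins; only staging_table comes from the question.

```csharp
using System;
using System.Data.SqlClient;
using System.Diagnostics;

static class ColdCacheTiming
{
    // Times how long it takes to pull every row of the query to the client.
    static long TimeSelect(SqlConnection conn, string sql)
    {
        var sw = Stopwatch.StartNew();
        using (var cmd = new SqlCommand(sql, conn))
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read()) { } // fetch each row, then discard it
        }
        return sw.ElapsedMilliseconds;
    }

    public static void Run(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();

            // CHECKPOINT writes dirty pages to disk first, so that
            // DROPCLEANBUFFERS can evict them from the buffer pool too.
            using (var flush = new SqlCommand("CHECKPOINT; DBCC DROPCLEANBUFFERS;", conn))
            {
                flush.ExecuteNonQuery();
            }

            // Hypothetical column list: name only the columns you need.
            string sql = "select id, amount from staging_table";
            Console.WriteLine("cold: {0} ms", TimeSelect(conn, sql));
            Console.WriteLine("warm: {0} ms", TimeSelect(conn, sql));
        }
    }
}
```

If the cold and warm numbers are close, disk reads are probably not what dominates the 2.5 seconds.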

If you want it as fast as possible, try to implement the processing in T-SQL; that way it can operate directly on the data, right there on the server.
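
As an illustration of keeping the work on the server, the sketch below sends a single INSERT ... SELECT so that no rows travel to the client at all. It is not the asker's actual processing: target_table, the id and amount columns, and the SUM aggregation are all assumptions made up for the example.

```csharp
using System.Data.SqlClient;

static class ServerSideProcessing
{
    // Runs the whole read-process-write step inside SQL Server and
    // returns the number of rows inserted.
    public static int Run(string connectionString)
    {
        // Hypothetical tables and columns; replace with the real set-based logic.
        const string sql = @"
            INSERT INTO dbo.target_table (id, amount_total)
            SELECT id, SUM(amount)
            FROM dbo.staging_table
            GROUP BY id;";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            conn.Open();
            return cmd.ExecuteNonQuery();
        }
    }
}
```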

Another good tip for speed tuning is to have the table being read on one disk (look at filegroups) and the table being written to on another disk. That way one disk can do a continuous read and the other a continuous write. If both operations happen on the same disk, the heads keep going back and forth, which seriously degrades performance.

If the logic you're writing cannot be done in T-SQL, you could also have a look at SQL CLR.
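
For a feel of what SQL CLR code looks like, here is a minimal sketch of a scalar function written in C#. The class name, function name and the trim-and-uppercase logic are purely hypothetical; the assembly would still have to be registered on the server with CREATE ASSEMBLY and exposed with CREATE FUNCTION ... EXTERNAL NAME, and CLR integration must be enabled.

```csharp
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

public static class StagingFunctions
{
    // Hypothetical example logic: trims and upper-cases a code value.
    // NULL in gives NULL out, matching usual SQL semantics.
    [SqlFunction(IsDeterministic = true, IsPrecise = true)]
    public static SqlString NormalizeCode(SqlString raw)
    {
        if (raw.IsNull)
        {
            return SqlString.Null;
        }
        return new SqlString(raw.Value.Trim().ToUpperInvariant());
    }
}
```

Once deployed, such a function can be called from T-SQL like any other scalar function, so the rows never leave the server.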

Another tip: when you do select * from table, use a datareader if possible. That way you don't materialize the whole thing in memory first.
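
A minimal sketch of that streaming style follows. It is written against IDataReader so it can be tried without a database: the DataTable in Main is only an in-memory stand-in for the SqlDataReader you would get from cmd.ExecuteReader(), and the amount column and the summing logic are hypothetical.

```csharp
using System;
using System.Data;

static class StreamingRead
{
    // Processes rows one at a time; only the current row is held in memory.
    public static long SumColumn(IDataReader reader, string columnName)
    {
        int ordinal = reader.GetOrdinal(columnName); // look up the index once
        long total = 0;
        while (reader.Read())
        {
            if (!reader.IsDBNull(ordinal))
            {
                total += reader.GetInt32(ordinal);
            }
        }
        return total;
    }

    static void Main()
    {
        // In-memory stand-in for a real result set (assumption for the demo).
        var table = new DataTable();
        table.Columns.Add("amount", typeof(int));
        table.Rows.Add(10);
        table.Rows.Add(20);
        table.Rows.Add(DBNull.Value);

        using (IDataReader reader = table.CreateDataReader())
        {
            Console.WriteLine(SumColumn(reader, "amount")); // prints 30
        }
    }
}
```

This only helps when the processing can work row by row; set-based steps that need the whole table still have to buffer it or move into SQL.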

GJ

answered Oct 20 '22 17:10 by gjvdkamp