Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Retrieving billions of rows from remote server?

I am trying to retrieve around 200 billion rows from a remote SQL Server. To optimize this, I have limited my query to use only an indexed column as a filter and am selecting only a subset of columns to make the query look like this:

SELECT ColA, ColB, ColC FROM <Database> WHERE RecordDate BETWEEN '' AND ''

But it looks like unless I limit my query to a time window of a few hours, the query fails in all cases with the following error:

OLE DB provider "SQLNCLI10" for linked server "<>" returned message "Query timeout expired".
Msg 7399, Level 16, State 1, Server M<, Line 1
The OLE DB provider "SQLNCLI10" for linked server "<>" reported an error. Execution terminated by the provider because a resource limit was reached.
Msg 7421, Level 16, State 2, Server <>, Line 1
Cannot fetch the rowset from OLE DB provider "SQLNCLI10" for linked server "<>". 

The timeout is probably an issue because of the time it takes to execute the query plan. As I do not have control over the server, I was wondering if there is a good way of retrieving this data beyond the simple SELECT I am using. Are there any SQL Server specific tricks that I can use? Perhaps tell the remote server to paginate the data instead of issuing multiple queries or something else? Any suggestions on how I could improve this?

like image 670
Legend Avatar asked Jul 28 '11 21:07

Legend


1 Answers

This is more of the kind of job SSIS is suited for. Even a simple flow like ReadFromOleDbSource->WriteToOleDbSource would handle this, creating the necessary batching for you.

like image 72
Remus Rusanu Avatar answered Sep 18 '22 14:09

Remus Rusanu