Huge table (9 million records) in SQL server

Tags:

windows-7

Our underwriting company just sent us a data extract of leads. There are 9 million rows. The rows consist of LeadID (guid), RawLeadXML (xml -- probably 3-4kb max), and a LeadStatusID (int).

I first tried to add an autonumber integer and make it the primary key of this table. Well, it added the field, but couldn't make it the primary key (There is insufficient memory available in the buffer pool.)

What I need to do is to take every record, 1 by 1, and get the XML, put it into an XmlDocument object in .Net, strip out all the fields I want (firstname, lastname, etc) and store that parsed information into another table.

I can't even get this statement to run: select * from Leads where id between 1 and 1000

If I just select 1000 records at a time (select top 1000 * from Leads), that works, but how would I then select the next 1000 records without some sort of reference point?

My machine has 4 processors (2.53Ghz) and 12 GB of RAM. It's not a server, but it's a beefy machine. I don't know what to try next, honestly.

EDIT: I left out that the original file was actually an MDF (and the associated LDF) file, so I just attached to them in SQL Server.

EDIT 2: I messed up and said that the RawLeadXML column was XML -- it isn't, it's just nvarchar(max). I honestly didn't know there WAS an xml datatype.

EDIT 3: I can't even issue a delete statement on this table: "delete from leads where leadid = '100a7927-5311-4f12-8fe3-95c079d32dd4' " blows up with:

Msg 802, Level 17, State 20, Line 2
There is insufficient memory available in the buffer pool.

I have no idea what to do next. How in the hell is this even a problem? There are thousands of databases in the world with more records than I've got.

EDIT 4: In case anyone cares, none of the below solutions worked. I'm sure this is a limitation of my machine, and definitely not a condemnation of the fine answers I received below. Currently, I am transferring the zipped DB (2.6 GB) to our servers in Rackspace, and then I will attempt to add an index on that hardware, hopefully without taking down our production servers. Once the index is added, I'm hopeful that I can zip up the db and bring it back down to my local machine, and then be able to actually do something with it.

EDIT 5: My machine was literally incapable of handling a table of this size. My machine has 12 GB RAM, 64 bit Windows 7 Professional, a quad core 2.53Ghz processor, SSD drive, etc. It's pretty beefy for a development machine. And it couldn't handle this.

So, I moved the DB to our server in Rackspace in London. 48 GB or memory in that one, and it was able to add the index I needed. Even after that, my machine here was unable to do anything all that useful with it, so I've written a .Net program that is running in London to put out the records 1000 at a time, parse them into another table, and then mark the original records as processed.

Once I do this, I'm going to have to leave the DB in London, because I doubt I could write any meaningful reports against this monstrosity locally. Which will make development fun.

Summary: I believe that there is no good way to process a dataset this large without using server class hardware with at least 48 GB of RAM (in my case).

825

asked Apr 26 '11 18:04

Matt Dawdy

1 Answers

Error 802 does not mean out of memory in the classical sense of insufficient memory to do an allocation (that would trigger error 701). Error 802 actually indicates that the buffer pool cannot grow, which can occur due to a couple of reasons:

max server memory setting explicitly prevents buffer pool growth, check your server setting.
x86 virtual address space limit is hit and AWE is not enabled. Check if you have an x86 (32 bit) instance and if yes, check if all the conditions to enable AWE are met.

If you still haven't found the problem, read How to use the DBCC MEMORYSTATUS command to monitor memory usage on SQL Server 2005 (the article applies equally well to SQL Server 2008 and to 2008 R2) and follow the guidelines there to understand who/what is consuming your memory.

118

answered Sep 28 '22 02:09

Remus Rusanu

Related questions
                            
                                Practice SQL Queries [closed]
                            
                                Is it faster to UPDATE a row, or to DELETE it and INSERT a new one?
                            
                                Use nested stored procedure results in calling stored procedure Sql Server 2008
                            
                                HTML Format in sp_send_dbmail
                            
                                Using Dapper with SQL Spatial Types as a parameter
                            
                                How to restrict a user to access objects of only one schema in SQL Server 2008?
                            
                                renaming a temporary table into a physical one
                            
                                SQL Server Set variable if exists else insert into table
                            
                                SQL Server Execute Impersonation
                            
                                DateTime.Now in C# to yield Datetime2(3) in SQL Server datatype
                            
                                SQL Server Management Studio error "Saved settings file cannot be found"
                            
                                What is the best way to fetch records batch-wise from SQL Server
                            
                                SQL where Date from today minus and plus days
                            
                                Table type parameter in a stored procedure cause operand type clash error
                            
                                nonclustered index on primary key column?
                            
                                How Stuff Works Or Logic of Stuff & XML Path in SQL
                            
                                How to count groups of rows and display the top/bottom 3
                            
                                SQL CTE and ORDER BY affecting result set
                            
                                Concatenate sql values to a variable
                            
                                Retrieve >901 rows from SQL Server 2008 linked server to Active Directory

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With