Database that can handle >500 million rows

Tags: database, sql-server, postgresql

I am looking for a database that can handle more than 500 million rows, meaning it can create an index on a column in a reasonable time and return results for select queries in less than 3 seconds. Would PostgreSQL or Msql on a low-end machine (Core 2 CPU 6600, 4GB, 64-bit system, Windows Vista) handle such a large number of rows?

Update: In asking this question, I am looking for information on which database I should use on a low-end machine to return results for select queries with one or two fields specified in the where clause. No joins. I need to create indices to achieve sufficient performance for my select queries, and creating them cannot take ages like it does on mysql. This machine is a test PC used to run an experiment.

The table schema:

    create table mapper (
        key VARCHAR(1000),
        attr1 VARCHAR(100),
        attr1 INT,
        attr2 INT,
        value VARCHAR(2000),
        PRIMARY KEY (key),
        INDEX (attr1),
        INDEX (attr2)
    )

asked Sep 23 '10 13:09

Skarab

1 Answer

MSSQL can handle that many rows just fine. Query time depends on a lot more factors than simple row count.

For example, it's going to depend on:

how many joins those queries do
how well your indexes are set up
how much RAM is in the machine
speed and number of processors
type and spindle speed of hard drives
size of the row / amount of data returned in the query
network interface speed / latency

It's very easy to have a small (less than 10,000 rows) table which would take a couple of minutes to execute a query against. For example, by using lots of joins, functions in the where clause, and zero indexes on an Atom processor with 512MB of total RAM. ;)

It takes a bit more work to make sure that all of your indexes and foreign key relationships are good, and that your queries are optimized to eliminate needless function calls and return only the data you actually need. Also, you'll need fast hardware.
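To illustrate the point about functions in the where clause, here is a minimal sketch that compares query plans for a plain comparison and for the same column wrapped in a function. It uses Python's built-in sqlite3 module purely as a stand-in engine, which is an assumption for illustration; MSSQL has its own optimizer and plan output and may behave differently. The table is a cut-down version of the question's schema, and the index name idx_mapper_attr2 is made up.

```python
import sqlite3

def query_plan(conn, sql):
    """Return the plan detail strings SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The fourth column of each plan row holds the human-readable detail.
    return [row[3] for row in rows]

# In-memory database: a stand-in for illustration, not MSSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE mapper ("key" TEXT PRIMARY KEY, attr2 INTEGER, value TEXT)'
)
# Hypothetical index name, chosen for this example only.
conn.execute("CREATE INDEX idx_mapper_attr2 ON mapper (attr2)")
conn.executemany(
    "INSERT INTO mapper VALUES (?, ?, ?)",
    ((f"k{i}", i % 100, f"v{i}") for i in range(10_000)),
)

# Plain comparison: the engine can look the value up in the index.
indexed_plan = query_plan(conn, "SELECT value FROM mapper WHERE attr2 = 5")

# Function applied to the column: the index on attr2 no longer applies,
# so the engine has to read every row.
function_plan = query_plan(conn, "SELECT value FROM mapper WHERE abs(attr2) = 5")

print("plain comparison :", indexed_plan)
print("function in WHERE:", function_plan)
```

The exact plan wording varies between SQLite versions, but the first query should report a search using the index and the second a scan of the whole table.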

It all boils down to how much money you want to spend, the quality of the dev team, and the size of the data rows you are dealing with.

UPDATE: Updating due to changes in the question.

The amount of information here is still not enough to give a real-world answer. You are just going to have to test it and adjust your database design and hardware as necessary.

For example, I could very easily have 1 billion rows in a table on a machine with those specs, run a "select top(1) id from tableA (nolock)" query, and get an answer in milliseconds. By the same token, you can execute a "select * from tableA" query and have it take a while: although the query itself executes quickly, transferring all of that data across the wire takes a while.
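A rough back-of-the-envelope calculation shows why returning everything is slow no matter how fast the query runs. The sketch below is only an upper bound under loud assumptions: every VARCHAR column is filled to its declared maximum at one byte per character, each INT takes 4 bytes, per-row storage overhead is ignored, and the link is an assumed 1 Gbit/s network. The columns are counted as written in the question's schema, including the repeated attr1.

```python
# Declared column sizes from the question's schema, counted as written.
# Assumption: 1 byte per character, 4 bytes per INT, no per-row overhead.
column_bytes = {
    "key VARCHAR(1000)": 1000,
    "attr1 VARCHAR(100)": 100,
    "attr1 INT": 4,
    "attr2 INT": 4,
    "value VARCHAR(2000)": 2000,
}

row_count = 500_000_000             # row count from the question
link_bytes_per_second = 125_000_000  # assumed 1 Gbit/s link, not from the post

worst_case_row_bytes = sum(column_bytes.values())
total_bytes = worst_case_row_bytes * row_count
transfer_seconds = total_bytes / link_bytes_per_second

print(f"worst-case row size : {worst_case_row_bytes} bytes")
print(f"full table          : {total_bytes / 1e12:.3f} TB")
print(f"transfer time       : {transfer_seconds / 3600:.2f} hours")
```

Real rows will usually be much smaller than the declared maximums, so treat this as a ceiling. It is meant to show the order of magnitude, not to predict an actual transfer time.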

The point is, you have to test. That means setting up the server, creating some of your tables, and populating them. Then you have to go through performance tuning to get your queries and indexes right. As part of the performance tuning, you're going to uncover not only how the queries need to be restructured but also exactly which parts of the machine might need to be replaced (e.g. disk, more RAM, CPU) based on the lock and wait types.
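A test run along those lines could be sketched as follows: create a table, populate it, time the index build, and time a select with one field in the where clause. This is a minimal sketch that assumes Python's built-in sqlite3 module as a placeholder engine and uses synthetic data. The function name run_trial, the index name, and the data pattern are all hypothetical, and the table uses only some of the question's columns. For a real answer you would point the same steps at the actual server, on the actual hardware, with realistic data volumes.

```python
import sqlite3
import time

def run_trial(row_count):
    """Populate a test table, then time index creation and one select."""
    conn = sqlite3.connect(":memory:")  # placeholder engine, not MSSQL
    conn.execute(
        'CREATE TABLE mapper ("key" TEXT PRIMARY KEY, attr1 INTEGER, '
        "attr2 INTEGER, value TEXT)"
    )

    # Synthetic rows; real tests should use data shaped like production data.
    rows = ((f"key-{i}", i % 1000, i % 7, f"value-{i}") for i in range(row_count))
    start = time.perf_counter()
    conn.executemany("INSERT INTO mapper VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    load_seconds = time.perf_counter() - start

    # Time the index build, which was one of the question's main concerns.
    start = time.perf_counter()
    conn.execute("CREATE INDEX idx_mapper_attr1 ON mapper (attr1)")
    index_seconds = time.perf_counter() - start

    # Time a select with a single field in the where clause and no joins.
    start = time.perf_counter()
    matches = conn.execute(
        'SELECT "key", value FROM mapper WHERE attr1 = ?', (42,)
    ).fetchall()
    query_seconds = time.perf_counter() - start

    conn.close()
    return {
        "rows": row_count,
        "matched_rows": len(matches),
        "load_seconds": load_seconds,
        "index_seconds": index_seconds,
        "query_seconds": query_seconds,
    }

result = run_trial(7_000)
print(result)
```

Running the trial at several increasing row counts and watching how the three timings grow gives a first hint of where the machine will struggle, long before loading all 500 million rows.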

I'd highly recommend you hire (or contract) one or two DBAs to do this for you.

answered Oct 12 '22 23:10

NotMe
