Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Are these tables too big for SQL Server or Oracle

I'm not much of a database guru so I would like some advice.

Background

We have 4 tables that are currently stored in Sybase IQ. We don't currently have any choice over this, we're basically stuck with what someone else decided for us. Sybase IQ is a column-oriented database that is perfect for a data warehouse. Unfortunately, my project needs to do a lot of transactional updating (we're more of an operational database) so I'm looking for more mainstream alternatives.

Question

  1. Given these tables' dimensions, would anyone consider SQL Server or Oracle to be a viable alternative?

    • Table 1 : 172 columns * 32 million rows
    • Table 2 : 453 columns * 7 million rows
    • Table 3 : 112 columns * 13 million rows
    • Table 4 : 147 columns * 2.5 million rows
  2. Given the size of data what are the things I should be concerned about in terms of database choice, server configuration, memory, platform, etc.?

like image 724
Jeffrey Cameron Avatar asked Nov 26 '09 15:11

Jeffrey Cameron


2 Answers

Yes, both should be able to handle your tables (if your server is suited for it). But, I would consider redesigning your database a bit. Even in a datawarehouse where you denormalize your data, a table with 453 columns is not normal.

like image 105
Maximilian Mayerl Avatar answered Sep 29 '22 17:09

Maximilian Mayerl


It really depends on what's in the columns. If there are lots of big VARCHAR columns -- and they are frequently filled to near capacity -- then you could be in for some problems. If it's all integer data then you should be fine.

453 * 4 = 1812      # columns are 4 byte integers, row size is ~1.8k
453 * 255 = 115,515 # columns are VARCHAR(255), theoretical row size is ~112k

The rule of thumb is that row size should not exceed the disk block size, which is generally 8k. As you can see, your big table is not a problem in this regard if it consists entirely of 4-byte integers but if it consists of 255-char VARCHAR columns then you could be exceeding the limit substantially. This 8k limit used to be a hard limit in SQL Server but I think these days it's just a soft limit and performance guideline.

Note that VARCHAR columns don't necessarily consume memory commensurate with the size you specify for them. That is the max size, but they only consume as much as they need. If the actual data in the VARCHAR columns is always 3-4 chars long then size will be similar to that of integer columns regardless of whether you created them as VARCHAR(4) or VARCHAR(255).

The general rule is that you want row size to be small so that there are many rows per disk block, this reduces the number of disk reads necessary to scan the table. Once you get above 8k you have two reads per row.

Oracle has another potential problem which is that ANSI joins have a hard limit on the total number of columns in all tables in the join. You can avoid this by avoiding the Oracle ANSI join syntax. (There are equivalents that don't suffer from this bug.) I don't recall what the limit is or which versions it applies to (I don't think it's been fixed yet).

The numbers of rows you're talking about should be no problem at all, presuming you have adequate hardware.

like image 45
Nate C-K Avatar answered Sep 29 '22 16:09

Nate C-K