If I have a table column with data and create an index on this column, will the index take same amount of disc space as the column itself?
I'm interested because I'm trying to understand if b-trees actually keep copies of column data in leaf nodes or they somehow point to it?
Sorry if this a "Will Java replace XML?" kind question.
UPDATE:
created a table without index with a single GUID column, added 1M rows - 26MB
same table with a primary key (clustered index) - 25MB (even less!), index size - 176KB
same table with a unique key (nonclustered index) - 26MB, index size - 27MB
So only nonclustered indexes take as much space as the data itself.
All measurements were done in SQL Server 2005
Does index take space in the disk? Explanation: Indexes take memory slots which are located on the disk.
When you perform index operations online, additional temporary disk space is required. If a clustered index is created, rebuilt, or dropped online, a temporary nonclustered index is created to map old bookmarks to new bookmarks.
Total disk space required to support both the source and target structures for the duration of the index operation is 816 MB (363 + 453).
In short, yes. Rebuilding indexes increases database file size. There are some nuances, but in general terms it is true. Both ONLINE or OFFLINE rebuild/reindexing operations increase file size.
The B-Tree points to the row in the table, but the B-Tree itself still takes some space on disk.
Some database, have special table which embed the main index and the data. In Oracle, it's called IOT -- index-organized table.
Each row in a regular table can be identified by an internal ID (but it's database specific) which is used by the B-Tree to identify the row. In Oracle, it's called rowid
and looks like AAAAECAABAAAAgiAAA
:)
If I have a table column with data and create an index on this column, will the index take same amount of disc space as the column itself?
In a basic B-Tree, you have the same number of node as the number of item in the column.
Consider 1,2,3,4
:
1
/
2
\ 3
\ 4
The exact space can still be a bit different (the index is probably a bit bigger as it need to store links between nodes, it may not be balanced perfectly, etc.), and I guess database can use optimization to compress part of the index. But the order of magnitude between the index and the column data should be the same.
I'm almost sure it's quite a DB dependent, but generally – yeah, they take additional space. This happens because of two reasons:
This way you can utilize the fact the data in BTREE leafs is sorted;
You gain lookup speed advantage as you don't have to seek back and forth to fetch neccessary stuff.
PS just checked our mysql server: for a 20GB table indexes take 10GB of space :)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With