I am learning the Column Store index (Denali CTP3 feature) and got to know that it uses VertiPaq architecture for data compression. I got interested to know as what it is , how it works, it's architecture. I checked in google but no satisfactory result. Could any one please let me know in detail as what it is , how it works , the algo/architecture behind this etc.
And how it helps in data compression
I wrote a blog post on this that hopefully will answer your questions on column store indexes: http://www.jamesserra.com/archive/2011/08/sql-server-%e2%80%9cdenali%e2%80%9d-project-apollo/
Please let me know if you still have questions.
And how it helps in data compression
The compression part works so well because very often data in the same column doesn't vary much. Imagine for example (simplification) a column that stores values from a multiple (4) choice input. There are going to be just 4 unique values in the column store, even if there are 8 million records in the table. That makes the column values easier to compress. That in turn makes it easier to fit the index into memory and thus faster to query.
When data is stored in column-wise fashion, the data can often be compressed more effectively than when stored in row-wise fashion. Typically there is more redundancy within a column than within a row, which usually means the data can be compressed to a greater degree. When data is more compressed, less IO is required to fetch the data into memory. In addition, a larger fraction of the data can reside in a given size of memory. Reducing IO can significantly speed up query response time. Retaining more of your working set of data in memory will speed up response time for subsequent queries that access the same data.
Source: More details on columnstore technology
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With