Recently, I started working on HBase (one of the column-oriented databases). While going through the source code, one question keeps popping in my head. Thought of asking this. My question is, how exactly a row-oriented database deals with information retrieval (say a select query) and how different is that when comes to column-oriented database. And how different these databases store data in the underlying flat files ( at the end of day, every database uses files).
Please do correct me in case I went wrong in any part of this question.
Regards, Krishna
row-oriented databases. Column oriented databases, under the hood, store all values from each column together whereas row oriented databases store all the values in a row together. A good way to visualize this is to think about it as the values of each column are stored in separate files.
A row is a horizontal alignment of data, while a column is vertical. Data in a row contains information that describes a single entity, while data in a column describes a field of information all entities possess.
A row store stores a sequence of records that contains the fields of one row in the table. In a column store, the entries of a column are stored in contiguous memory locations.
Columnar Database Examples Columnar databases that use CQL include Apache Cassandra, DataStax, Microsoft Azure Cosmos DB, and ScyllaDB, which is a native C++ rewrite of Cassandra. Other databases, such as Apache HBase, use their own query language.
If I understand you correctly, you are interested more in the underlying storage and retreival issues, and less in the DDL and definition issues, the categories of column-oriented dbs, right ?
I will assume you understand that virtually all storage, regardless of vendor, is some form of:
On top of that foundation, each vendor has optimisations, and patented specialisations. Eg. Sybase (row) has:
The next issue is that all the vendors (except oracle) have reasonably sophisticated engines, with modular design, and the I/O is handled asynchronously, at a low level, to obtain speed. The Unit of I/O is a Page. Usually 2 to 8KB for OLTP systems and 8 to 64KB for DSS. (notice I an avoiding the Row vs Column issue.) So regardless of row/column the DSS engines are build for mass retrieval, due to getting more Index/Data rows or columns in large blocks, with less I/O requests.
"Large I/O" can be performed by reading Extents (8 Pages) and larger AllocationUnits (256 Pages) into memory with one I/O request. But the basic unit is the Page.
Row vs Column
All queries executed against the engine, have to navigate indices, retrieve data rows/columns from the above data storage structures.
The result is a multiplication of the above;
small/large blocksize, times
underlying physical structures, times
Row/Column orientation
Is that what you were looking for ? There is a set of technical (not warm and fuzzy) diagrams of the above for Sybase ASE, a OLTP/DSS strictly row-oriented engine, which I can get my hands on, if you are interested.
.
you mean to say that eventually we will boil down to page irrespective of database type.
Yes.
If this is the case, then how clustering of a database will be done. Lets take a database that stores data in row mannered. If I am doing clustering for this type of databases, how exactly will the table structured be carried to different nodes ( if I have more than one node). Will this table structure linked to a page or will it be through a different mechanism.
You know, before I answer the question, I have to acknowledge you. For someone with your level of knowledge, it is excellent that you have penetrated down to that crucial point, obtained that insight. Shiva ki Jai!
Yes, that is the critical design problem of a clustered DBMS, the crucial limiting issue, above all the various design issues related to clustering; that if the vendor handles this issue well, the cluster works well; and if not, the cluster is a dogs breakfast.
Everything in IT is governed by the laws of physics. Nothing is free, every function of feature has a cost, processing or storage. There is no magic, except maybe in MS marketing brochures.
Good Clustered DB Architecture
I do not know all clustered DBMS; I know Sybase CE and Oracle RAC really well. Working knowledge of Sybase IQ.
Sybase CE is only one year old. But the architecture is brilliant, it handles this critical issue really well. There is only one version of the page on a SAN. All nodes are connected to the SAN. Any node can read or write the Page. The nodes are connected by a private LAN (in addition to the normal client-server LAN used by everything else on the network). The nodes co-ordinate the locks plus a bit of inter-node communication for laod-balancing, etc.
.
At the end of the day, for maximum concurrency, even with Sybase CE, you need to logically partition the databases, so that the workload on each node is separated, is accessing diffrent filepaths, or separate physical areas of the shared db.
Sybase IQ is already 100% column oriented. It is their DW offering. It already does full load balancing. It can be used is a cluster, but not clustering in the CE sense described above. I should have included it in
Poor Clustered DB Architecture
The dogs breakfast type of clustered dbms do stupid things. To list a few:
have the pages stored on every node[massive duplication], but then have to move the updated pages around the cluster
use MVCC to overcome the problem (but MVCC is far more overhead and actually slows the concurrency down, so it is fighting itself)
Cluster Unsuitable for Dedicated DB Server
Basically clusters are great for some applications, but it is a stupid idea for dedicated db servers (one fact in one place; shared resources which are administered together; lock contention which is most efficient when managed in one place because the data is in one place). I would never recommend a cluster for a db server.
Same as the SAN problem. Sure, many people have their db storage sitting inside a SAN, but for highest speed, and isolation from the load issues of other servers connected to the SAN, nothing comes close to local disk.
Same as the VMWare problem. Sure, many people have the db server established as a VMWare host unit, but for the highest speed, remove the overhead of VMWare; for isolation from the load issues of other host units in the chassis, get it out of there, onto a dedicated hard box.
Why Do DB Vendors Bother with a Cluster
Oh there is value in it, but not now, in the future. AFAIC, the Sybase architecture will predominate over time, and all others will fall by the wayside. Every vendor will copy it as usual.
The real power of Sybase CE is:
true 100% uptime (being able to add a node to the cluster and take the old node down for maintenance) and
fully dynamic load balancing (say the existing node is 4 x quad core; add a temp 4 x quad core node; take the old node down; insert 2 x quad core; bring it up; take down the temp node down) and then within 60 seconds, with no fingers on any keyboard, the whole beast rebalances.
A shop that can stagger the nightly db maintenance schedule of their several single-node servers, can save a fair amount of money; they just have a a couple of additional machines for switching in/out.
Data Warehouses are a little different. They are mostly read-only. So it is no problem to host it on a cluster (many reader nodes, only one writer node, no contention, no one cares that the pages are being written as they are being read). Sybase IQ is such a product.
Sybase CE for Column Oriented
Sybase IQ is already column oriented and can be deployed in a cluster, but not clustering in the CE sense described above. Columns are mapped to pages. I should have included it in Good Clustered Db Architecture above, corrected now.
I am not aware of hybrids combining column- and row-oriented that are worthwhile.
But the full answer to that question is, use a pure Db (not DW) such as Sybase ASE or ASE/CE, and implement a true Sixth Normal Form database. That is the ultimate Normalisation, the irreducible NF, with several substantial advantages, including speed and ease of pivoting. It provides column-oriented storage on the pages. Due to SQL not supporting 6NF fully, you will need to provide Views to supply 5NF rows from the (stored) 6NF structures. I wrote an extension to the catalogue, so that I could generate the SQL code for developers to use.
One problem with your question is that the long-standing database term "column oriented" has been appropriated (some might say "hijacked"!) by the NOSQL community to describe something completely different from what it originally meant. Both meanings of "column oriented" are still current but they refer to very different DBMS products. So it is often helplful to clarify which you are talking about. In this case, it's the NOSQL meaning of the term.
In the original meaning of a column oriented database the answer to your question is that there is no difference in the way you retrieve information. Column store isn't a different data model, it is simply a different type of representation in internal storage.
In the NOSQL community however, the term column store refers to a different kind of data model.
Good explanations here:
http://dbmsmusings.blogspot.com/2010/03/distinguishing-two-major-types-of_29.html
Row oriented databases, a.k.a. "traditional RDBMS" (such as MySQL, Oracle, DB2) use transactional secondary index updates, use B-Tree -like structures for secondary indices in most cases
Column oriented databases, a.k.a. "NoSQL" (such as Google Big Table, HBase, Cassandra) use simplified structures for primary key indices (which are not B-Tree)
Column oriented databases do not support "traditional" transactional secondary indices. It is responsibility of a user to maintain "inverted index".
Cassandra supports B-Tree -like index for a row: each cell in a row has a title, and cells are physically sorted by title.
Another (possibly super important) difference: for zillions records in Oracle, you will need to maintain B-Tree for primary key, and it's size will be also zillions-like; performance of "find by primary key" is not good.
On the other hand, you can have "wide rows" in Cassandra or HBase, and unite similar "cells" into single wide row; size of "primary key index" becomes millions times smaller, and "find by primary key" is super fast (and it is not B-Tree; it is clustered search)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With