AWS DynamoDB VS HBase

Tags: amazon-dynamodb, hbase

I have been using HBase for the past six months and recently came across Amazon's DynamoDB. Maintenance-wise, DynamoDB looks easier to handle since Amazon takes care of it. But whether to switch from HBase to DynamoDB is still an open question for me.

Apart from not having to maintain the cluster, I could not find a convincing reason to switch from HBase to DynamoDB.

Can somebody share their thoughts on this?

asked Jun 06 '12 05:06

dharshan

1 Answer

You essentially have to look at your requirements. DynamoDB provides great scalability and performance with minimal maintenance effort and at an attractive cost. However, Apache HBase is much more flexible in terms of what you can store (both in size and in data type).

Another very important point to evaluate is which data model, wide-column or key-value, better fits your use cases.

More information in the link below: http://d0.awsstatic.com/whitepapers/AWS_Comparing_the_Use_of_DynamoDB_and_HBase_for_NoSQL.pdf

Here is a summary of the key points:

Both Amazon DynamoDB and Apache HBase define data models that allow efficient storage of data to optimize query performance. Amazon DynamoDB imposes a restriction on its item size to allow efficient processing and reduce costs.

Apache HBase uses the concept of column families to provide data locality for more efficient read operations.

Amazon DynamoDB supports both scalar and multi-valued sets to accommodate a wide range of unstructured datasets. Similarly, Apache HBase stores its key/value pairs as arbitrary arrays of bytes, giving it the flexibility to store any data type.

Amazon DynamoDB supports built-in secondary indexes and automatically updates and synchronizes all indexes with their parent tables. With Apache HBase, you can implement and manage custom secondary indexes yourself.
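To give a feel for what "manage custom secondary indexes yourself" means on the HBase side, here is a minimal in-memory sketch in Python. It is not HBase client code: the two dictionaries only stand in for a data table and an index table, and the names `IndexedStore`, `put`, and `find_by_email` are made up for illustration.

```python
class IndexedStore:
    """Toy model of a data table plus a hand-maintained index table."""

    def __init__(self):
        self.users = {}        # row key -> {"email": ..., "name": ...}
        self.email_index = {}  # email -> row key (plays the role of the index table)

    def put(self, row_key, email, name):
        # If the indexed value changed, drop the stale index entry first.
        old = self.users.get(row_key)
        if old is not None and old["email"] != email:
            self.email_index.pop(old["email"], None)
        # The application must write both the data row and the index row.
        self.users[row_key] = {"email": email, "name": name}
        self.email_index[email] = row_key

    def find_by_email(self, email):
        # Look up the row key through the index, then fetch the data row.
        row_key = self.email_index.get(email)
        return None if row_key is None else self.users[row_key]
```

On a real cluster the two writes land in different rows or tables, so the application also has to cope with one of them failing. That bookkeeping is what DynamoDB's built-in indexes take off your hands.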

From a data model perspective, you can choose Amazon DynamoDB if your item size is relatively small. Although Amazon DynamoDB provides a number of options to overcome row size restrictions, Apache HBase is better equipped to handle large complex payloads with minimal restrictions.
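A quick way to see which side of that line your data falls on is to estimate item sizes up front. The sketch below assumes the 400 KB item limit from the current DynamoDB documentation (please verify it for your account and region) and only handles string and binary attributes. DynamoDB sizes numbers and nested types differently, so treat the result as a rough estimate. The function names are hypothetical.

```python
# Assumed limit: 400 KB per item, taking 1 KB = 1024 bytes. Check the current docs.
DYNAMODB_MAX_ITEM_BYTES = 400 * 1024


def approx_item_size(item):
    """Rough size of an item whose values are str or bytes.

    Counts the UTF-8 length of each attribute name plus the length of its value.
    """
    total = 0
    for name, value in item.items():
        total += len(name.encode("utf-8"))
        if isinstance(value, str):
            total += len(value.encode("utf-8"))
        elif isinstance(value, (bytes, bytearray)):
            total += len(value)
        else:
            raise TypeError(f"unsupported value type for {name!r}")
    return total


def fits_in_dynamodb(item):
    """True if the estimated size is within the assumed item limit."""
    return approx_item_size(item) <= DYNAMODB_MAX_ITEM_BYTES
```

If a meaningful share of your rows fails this check, that is a point in favour of HBase, or of keeping the large payload elsewhere and storing only a pointer in DynamoDB.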

Throughput Model

Although read and write requirements are specified at table creation time, Amazon DynamoDB lets you increase or decrease the provisioned throughput to accommodate load with no downtime.
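For reference, changing the provisioned throughput is a single API call. This is a minimal sketch assuming the boto3 low-level DynamoDB client and a table in provisioned capacity mode. The helper name `set_throughput` and the table name in the usage comment are placeholders.

```python
def set_throughput(client, table_name, read_units, write_units):
    """Change a table's provisioned read/write capacity in place.

    `client` is expected to behave like boto3.client("dynamodb").
    """
    if read_units < 1 or write_units < 1:
        raise ValueError("capacity units must be at least 1")
    return client.update_table(
        TableName=table_name,
        ProvisionedThroughput={
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    )


# Usage (needs boto3 and AWS credentials; "events" is a placeholder table name):
#   import boto3
#   set_throughput(boto3.client("dynamodb"), "events", 200, 50)
```

The table stays available while the change is applied, which is the "no downtime" part of the claim above.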

In Apache HBase, the number of nodes in a cluster can be driven by the required throughput for reads and/or writes.

Consistency Model

Amazon DynamoDB lets you specify the desired consistency characteristics for each read request within an application. You can specify whether a read is eventually consistent or strongly consistent.

The eventual consistency option is the default in Amazon DynamoDB and maximizes the read throughput. However, an eventually consistent read might not always reflect the results of a recently completed write. Consistency across all copies of data is usually reached within a second.
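In code, the per-request choice comes down to one flag. A minimal sketch, again assuming the boto3 low-level client; `read_item` is a hypothetical helper and the key uses the client's typed attribute format.

```python
def read_item(client, table_name, key, strong=False):
    """Fetch one item; pass strong=True for a strongly consistent read.

    `client` is expected to behave like boto3.client("dynamodb").
    `key` uses the low-level format, e.g. {"user_id": {"S": "u1"}}.
    """
    response = client.get_item(
        TableName=table_name,
        Key=key,
        ConsistentRead=strong,  # False (the default) means eventually consistent
    )
    # "Item" is absent from the response when no item matches the key.
    return response.get("Item")
```

Strongly consistent reads use twice the read capacity of eventually consistent ones, so it usually pays to request them only where a stale read would actually hurt.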

Apache HBase reads and writes are strongly consistent. This means that all reads and writes to a single row in Apache HBase are atomic. Each concurrent reader and writer can make safe assumptions about the state of a row. Multi-versioning and time stamping in Apache HBase contribute to its strongly consistent model.

Transaction Model

Neither Amazon DynamoDB nor Apache HBase supports multi-item/cross-row or cross-table transactions due to performance considerations. However, both databases provide batch operations for reading and writing multiple items/rows across multiple tables with no transaction guarantees.
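"No transaction guarantees" has a practical consequence for batch writes: part of a batch can succeed while the rest is handed back to you. The sketch below shows one way to deal with that on the DynamoDB side, assuming the boto3 low-level client and its `batch_write_item` call. The helper name, retry count, and backoff values are illustrative choices, not recommendations.

```python
import time

MAX_BATCH = 25  # BatchWriteItem accepts at most 25 put/delete requests per call


def batch_put(client, table_name, items, max_retries=5):
    """Write items in chunks of 25; return the items still unwritten after retries.

    `client` is expected to behave like boto3.client("dynamodb").
    `items` use the low-level format, e.g. {"id": {"S": "1"}}.
    """
    leftover = []
    for start in range(0, len(items), MAX_BATCH):
        chunk = items[start:start + MAX_BATCH]
        requests = [{"PutRequest": {"Item": item}} for item in chunk]
        for attempt in range(max_retries):
            response = client.batch_write_item(RequestItems={table_name: requests})
            # Anything the service could not write comes back for another try.
            requests = response.get("UnprocessedItems", {}).get(table_name, [])
            if not requests:
                break
            time.sleep(min(0.05 * 2 ** attempt, 1.0))  # simple capped backoff
        leftover.extend(r["PutRequest"]["Item"] for r in requests)
    return leftover
```

The caller has to decide what to do with the leftovers. Nothing is rolled back for the items that did get written.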

Table Operations

One key difference between the two databases is the flexible provisioned throughput model of Amazon DynamoDB. The ability to dial up capacity when you need it and dial it back down when you are done is useful for processing variable workloads with unpredictable peaks.

For workloads that need high update rates to perform data aggregations or maintain counters, Apache HBase is a good choice. This is because Apache HBase supports a multi-version concurrency control mechanism, which contributes to its strongly consistent reads and writes. Amazon DynamoDB gives you the flexibility to specify whether you want your read request to be eventually consistent or strongly consistent depending on your specific workload.
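As an illustration of the counter use case on HBase, here is a small sketch that assumes the happybase Python client (which talks to the HBase Thrift server) and its `Table.counter_inc` method. The row key layout, the `stats:views` column, and the table name in the usage comment are made up for the example.

```python
def record_page_view(table, page_id, amount=1):
    """Atomically add `amount` to a per-page view counter; return the new value.

    `table` is expected to behave like a happybase Table.
    """
    row_key = f"page:{page_id}".encode("utf-8")
    # The increment happens server-side, so concurrent writers do not
    # overwrite each other's updates.
    return table.counter_inc(row_key, b"stats:views", value=amount)


# Usage (needs happybase and a running HBase Thrift server; names are placeholders):
#   import happybase
#   connection = happybase.Connection("localhost")
#   record_page_view(connection.table("metrics"), "home")
```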

Source: http://d0.awsstatic.com/whitepapers/AWS_Comparing_the_Use_of_DynamoDB_and_HBase_for_NoSQL.pdf

answered Oct 24 '22 22:10

b-s-d
