
Price aside, why ever choose Google Cloud Bigtable over Google Cloud Datastore?

If I have a use case for both huge data storage and searchability, why would I ever choose Google Cloud Bigtable over Google Cloud Datastore?

I've seen a few questions on SO and other sites "comparing" Bigtable and Datastore, but they all seem to boil down to the same non-specific answers.

Here's my current knowledge and my thoughts:

Datastore is more expensive.

In the context of this question, let's forget entirely about pricing.

Bigtable is good for huge datasets.

It seems like Datastore is, too? I'm not seeing what specifically makes Bigtable objectively superior here.

Bigtable is better than Datastore for analytics.

How? Why? It seems like I can do analytics in Datastore as well, no problem. Why is Bigtable seemingly the unanimous industry-wide choice for analytics? What value do Gmail, eBay, etc. get from Bigtable that Datastore can't provide?

Bigtable is integrated with Hadoop, Spark, etc.

Is Datastore not as well, considering it's built on Bigtable?

From this question, this statement was made in an answer:

Bigtable and Datastore are extremely different. Yes, the datastore is built on top of Bigtable, but that does not make it anything like it. That is kind of like saying a car is built on top of [car] wheels, and so a car is not much different from wheels.

However, this analogy seems nonsensical, since the car (including the wheels) intrinsically provides more value than the wheels by themselves.

It seems at first glance that Bigtable is strictly worse than Datastore, only providing a single index and limiting quick searchability. What am I missing?

Asked by zeBugMan on Nov 26 '18 at 21:11



1 Answer

Bigtable and Datastore are optimized for slightly different use-cases, and offer different tradeoffs. The main ones are:

Data model:

  • Bigtable is a wide-column database -- think HBase and Cassandra
  • Datastore is a document database -- think MongoDB
  • Note that both of these can be used for key-value use cases

Cost model:

  • Bigtable charges per provisioned node
  • Datastore is serverless and charges per operation

In general, Bigtable is a good choice if you need:

  • Fast point-reads and range scans (especially at scale). Bigtable will offer lower latency for key-value lookups, as well as fast scans of contiguous rows - a powerful tool since rows are stored in lexicographic order. If you have simple, predictable query patterns and design your schema well, reading from Bigtable can be incredibly efficient.
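  • High-throughput writes (again, especially at scale). This is possible in part because Bigtable makes weaker guarantees than Datastore: replication between clusters is eventually consistent, and atomicity is limited to a single row. In exchange you can see big wins in price/performance.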
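
To make the range-scan point concrete, here is a minimal pure-Python sketch. It only simulates a table whose row keys are kept in lexicographic order; it does not use the Bigtable client library, and the `prefix_scan` helper and the `user#...` key layout are hypothetical names chosen for illustration.

```python
from bisect import bisect_left


def prefix_scan(sorted_keys, prefix):
    """Return every key that starts with `prefix` from a sorted key list."""
    # Binary search jumps straight to the first candidate row.
    start = bisect_left(sorted_keys, prefix)
    end = start
    # Matching rows are contiguous, so stop at the first non-match.
    while end < len(sorted_keys) and sorted_keys[end].startswith(prefix):
        end += 1
    return sorted_keys[start:end]


# Hypothetical row keys; sorting stands in for the ordered storage.
row_keys = sorted([
    "user#alice#2018-11-26",
    "user#bob#2018-11-25",
    "user#alice#2018-11-27",
    "user#carol#2018-11-26",
])

alice_rows = prefix_scan(row_keys, "user#alice#")
print(alice_rows)
```

Because all of one user's rows sit next to each other, the scan touches only the rows it returns, which is why a schema built around the expected query pattern matters so much.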

Example use-cases that are great for Bigtable include time series data (for IoT, monitoring, and more - think extremely write heavy workloads and massive amounts of data generated over x units of time), analytics (think fraud detection, personalization, recommendations), and ad-serving (every microsecond counts).
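
For the time series case, one common pattern (shown here as a rough sketch, not as an official recommendation) is to build the row key from a device identifier plus a zero-padded timestamp, so that string order matches chronological order. The `make_row_key` helper, the `#` separator, and the 12-digit width are assumptions made for this example.

```python
def make_row_key(device_id: str, epoch_seconds: int) -> str:
    """Build a hypothetical row key of the form '<device>#<padded timestamp>'."""
    # Zero-padding makes lexicographic order agree with numeric order.
    return f"{device_id}#{epoch_seconds:012d}"


# Made-up readings as (device, Unix timestamp) pairs, deliberately out of order.
readings = [
    ("sensor-7", 1543266000),
    ("sensor-7", 999),
    ("sensor-12", 1543266060),
    ("sensor-7", 1543269600),
]

keys = sorted(make_row_key(device, ts) for device, ts in readings)
for key in keys:
    print(key)
```

Without the padding, the string "999" would sort after "1543266000", and a scan over one device's rows would no longer come back in time order.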

Datastore (or Firestore) is a good choice if you need:

  • Query flexibility: Datastore offers document support and secondary indexes.
  • Strong consistency and/or transactions: Bigtable has eventually consistent replication and does not support multi-row transactions.
  • Mobile SDKs: Datastore and Firestore are incredibly well-integrated with the Firebase ecosystem.
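
The query-flexibility tradeoff can be illustrated with a toy model. The `TinyStore` class below is entirely hypothetical and says nothing about how Datastore or Bigtable are implemented; it only shows the general idea that a secondary index makes lookups by a non-key field cheap, at the price of extra work on every write.

```python
from collections import defaultdict


class TinyStore:
    """Toy key-value store with one optional secondary index (illustrative only)."""

    def __init__(self, indexed_field=None):
        self.rows = {}
        self.indexed_field = indexed_field
        self.index = defaultdict(set)  # field value -> set of primary keys
        self.write_ops = 0

    def put(self, key, record):
        if self.indexed_field is not None:
            old = self.rows.get(key)
            if old is not None:
                # Drop the stale index entry before adding the new one.
                self.index[old.get(self.indexed_field)].discard(key)
            self.index[record.get(self.indexed_field)].add(key)
            self.write_ops += 1  # extra write to keep the index current
        self.rows[key] = record
        self.write_ops += 1

    def find_by(self, field, value):
        if field == self.indexed_field:
            return sorted(self.index[value])  # direct index lookup
        # No index on this field: fall back to scanning every row.
        return sorted(k for k, r in self.rows.items() if r.get(field) == value)


plain = TinyStore()
indexed = TinyStore(indexed_field="city")
for store in (plain, indexed):
    store.put("u1", {"name": "Ada", "city": "Paris"})
    store.put("u2", {"name": "Bo", "city": "Oslo"})
    store.put("u3", {"name": "Cy", "city": "Paris"})

print(plain.write_ops, indexed.write_ops)
print(indexed.find_by("city", "Paris"))
```

Both stores return the same answer for the city query, but the indexed one performed twice as many writes to get there, while the plain one has to scan every row at read time.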

Example use-cases include mobile and web applications, game state, user profiles, and product catalogs.

To answer a few of your questions explicitly:

  • Why is Bigtable used for analytics? It's mostly about performance: analytics use-cases are more likely to have large datasets and require high write throughput. It's a lot easier to run into the limits of a database if you're storing clickstream data, as opposed to something like user account information. Fast scans are also important for analytics use-cases: Bigtable allows you to retrieve all of the information you need about a user or a device extremely quickly, which you can process in a batch job or use to create recommendations and analysis on the fly.
  • Is Bigtable strictly worse than Datastore? Datastore definitely provides more built-in functionality like secondary indexes and document support, and if you need those features, Datastore is a fantastic choice. But that functionality comes with tradeoffs. Bigtable provides perhaps lower-level, but incredibly performant APIs that allow users to make those tradeoffs for themselves: If a user values, say, write performance over secondary indexes, Bigtable is an excellent option. You can think of it as an extremely versatile and powerful infrastructural building block. I actually like the wheel/car analogy: sometimes you don't want the car -- if what you really need is a dirt bike, a set of solid wheels is much more useful :)

Answered by Sandy Ghai on Oct 22 '22 at 01:10