If I have a use case for both huge data storage and searchability, why would I ever choose Google Cloud Bigtable over Google Cloud Datastore?
I've seen a few questions on SO and other sides "comparing" Bigtable and Datastore, but it seems to boil down to the same non-specific answers.
Here's my current knowledge and my thoughts:
Datastore is more expensive.
In the context of this question, let's forget entirely about pricing.
Bigtable is good for huge datasets.
It seems like Datastore is, too? I'm not seeing what specifically makes Bigtable objectively superior here.
Bigtable is better than Datastore for analytics.
How? Why? It seems like I can do analytics in Datastore as well, no problem. Why is Bigtable seemingly the unanimous decision industry-wide for analytics? What value do GMail, eBay, etc. get from Bigtable that Datastore can't provide?
Bigtable is integrated with Hadoop, Spark, etc.
Is Datastore not as well, considering it's built on Bigtable?
From this question, this statement was made in an answer:
Bigtable and Datastore are extremely different. Yes, the datastore is build on top of Bigtable, but that does not make it anything like it. That is kind of like saying a car is build on top of [car] wheels, and so a car is not much different from wheels.
However, this seems analogy seems nonsensical, since the car (including the wheels) intrinsically provides more value than just the wheels of a car by themselves.
It seems at first glance that Bigtable is strictly worse than Datastore, only providing a single index and limiting quick searchability. What am I missing?
Cloud Datastore—a document database built for automatic scaling, high performance, and ease of use. Cloud Bigtable—an alternative to HBase, a columnar database system running on HDFS. Suitable for high throughput applications.
Bigtable is ideal for applications that need high throughput and scalability for key/value data, where each value is typically no larger than 10 MB. Bigtable also excels as a storage engine for batch MapReduce operations, stream processing/analytics, and machine-learning applications.
You are charged each hour for the maximum number of nodes that exist during that hour, multiplied by the hourly rate. Bigtable bills a minimum of one hour for each node you provision. Node charges are for provisioned resources, regardless of node usage. Charges apply even if your cluster is inactive.
Bigtable is a NoSQL wide-column database optimized for heavy reads and writes. On the other hand, BigQuery is an enterprise data warehouse for large amounts of relational structured data.
Bigtable and Datastore are optimized for slightly different use-cases, and offer different tradeoffs. The main ones are:
Data model:
Cost model:
In general, Bigtable is a good choice if you need:
Example use-cases that are great for Bigtable include time series data (for IoT, monitoring, and more - think extremely write heavy workloads and massive amounts of data generated over x units of time), analytics (think fraud detection, personalization, recommendations), and ad-serving (every microsecond counts).
Datastore (or Firestore) is a good choice if you need:
Example use-cases include mobile and web applications, game state, user profiles, and product catalogs.
To answer a few of your questions explicitly:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With