
How does the storage backend influence Datomic?

How should I pick the backend storage service for Datomic?

Is it a matter of preference to select, say, DynamoDB instead of Postgres, or does each option have different tradeoffs? If so, what are they?

konr asked Jul 29 '13 03:07



1 Answer

Storage Services Requirements

Datomic's storage services should generally meet three requirements:

  1. Implement key-value store semantics: efficient read/write access to values by indexed key.
  2. Support consistent reads, e.g. read-your-own-writes. Ideally, reads are contention-free/lock-free.
  3. Support conditional puts, e.g. for optimistic locking and snapshot isolation.
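
To make the three requirements concrete, here is a minimal Python sketch of a store that satisfies them. This is not Datomic's actual storage interface; the class `InMemoryStore` and the method `put_if` are hypothetical names chosen for illustration, and the toy uses a lock for simplicity rather than the lock-free reads described as ideal.

```python
import threading
from typing import Dict, Optional


class InMemoryStore:
    """Toy key-value store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}
        self._lock = threading.Lock()

    def get(self, key: str) -> Optional[bytes]:
        # Consistent read: a get issued after a successful put sees that put.
        with self._lock:
            return self._data.get(key)

    def put_if(self, key: str, expected: Optional[bytes], new: bytes) -> bool:
        # Conditional put: write only if the current value equals `expected`.
        # expected=None means "the key must not exist yet".
        with self._lock:
            if self._data.get(key) != expected:
                return False
            self._data[key] = new
            return True


store = InMemoryStore()
first = store.put_if("root", None, b"v1")    # key absent, so this succeeds
stale = store.put_if("root", None, b"v2")    # expectation is stale, rejected
update = store.put_if("root", b"v1", b"v2")  # expectation matches, succeeds
```

The rejected second write is the point: a writer working from an outdated view cannot silently overwrite a newer value, which is what makes optimistic locking possible on top of a plain key-value store.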

Datomic uses storage services to store blocks of sorted, compressed datoms, much as traditional database systems use file systems. The requirements above are essentially the API between Datomic and the underlying storage service, so the choice of storage service depends on how well it supports those three requirements.

Write Scalability

Datomic doesn't usually put much write pressure on the underlying storage service, since only one component writes to it: the Transactor. Datomic also uses a background indexing job to integrate novelty into storage once enough of it has accumulated (~32MB by default, but configurable), which further reduces the constant write load. The only thing Datomic writes immediately is the transaction log.
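
The write pattern described above can be sketched roughly as follows. This is a simplified model, not the Transactor's real implementation: `ToyTransactor`, the byte accounting, and the tiny threshold are all assumptions made for the example, and indexing runs synchronously here whereas the real job runs in the background.

```python
from typing import List, Tuple

Datom = Tuple[str, str, str]  # (entity, attribute, value), heavily simplified


class ToyTransactor:
    """Hypothetical model of log-now, index-later writes."""

    def __init__(self, threshold_bytes: int) -> None:
        self.threshold_bytes = threshold_bytes
        self.log: List[List[Datom]] = []  # appended on every transaction
        self.novelty: List[Datom] = []    # in memory, not yet indexed
        self.novelty_bytes = 0
        self.index_writes = 0             # number of bulk index writes

    def transact(self, datoms: List[Datom]) -> None:
        self.log.append(datoms)           # the only immediate storage write
        self.novelty.extend(datoms)
        self.novelty_bytes += sum(len("".join(d)) for d in datoms)
        if self.novelty_bytes >= self.threshold_bytes:
            self._index()

    def _index(self) -> None:
        # Stand-in for the indexing job: one bulk write, then reset novelty.
        self.index_writes += 1
        self.novelty.clear()
        self.novelty_bytes = 0


tx = ToyTransactor(threshold_bytes=40)  # tiny threshold, for the demo only
for i in range(5):
    tx.transact([(f"e{i}", "name", "alice")])
```

With 11 bytes per datom in this toy accounting, five transactions produce five log writes but only a single index write, which is why the steady-state write load on storage stays low.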

Read Scalability

Datomic uses multiple layers of caching, i.e. memcached and the peers' own caches, so in ideal circumstances, i.e. when the working set fits in memory, the system won't put much read pressure on storage either.
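
A rough sketch of how layered caching shields storage from reads, under the assumption of a simple read-through lookup with back-filling. `CountingLayer` and `read_segment` are hypothetical names, and plain dicts stand in for the peer cache, memcached, and the storage service.

```python
from typing import Dict, List, Optional


class CountingLayer:
    """A dict-backed layer that counts how many reads reach it."""

    def __init__(self, data: Optional[Dict[str, bytes]] = None) -> None:
        self.data: Dict[str, bytes] = dict(data or {})
        self.reads = 0

    def get(self, key: str) -> Optional[bytes]:
        self.reads += 1
        return self.data.get(key)


def read_segment(key: str, layers: List[CountingLayer]) -> Optional[bytes]:
    # Try the fastest layer first; on a hit, back-fill the faster layers.
    for depth, layer in enumerate(layers):
        value = layer.get(key)
        if value is not None:
            for faster in layers[:depth]:
                faster.data[key] = value
            return value
    return None


peer_cache = CountingLayer()
memcached = CountingLayer()
storage = CountingLayer({"seg-1": b"datoms"})
layers = [peer_cache, memcached, storage]

for _ in range(3):
    read_segment("seg-1", layers)
```

After three reads of the same segment, storage has been hit only once; the remaining reads are served from the peer cache. Once the working set no longer fits in the caches, more of those reads fall through to storage.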

System Load

If your system doesn't require huge write scalability and your application data tends to fit in memory, then the choice of a particular storage service is irrelevant, except of course for its operational capabilities (backups, admin tools, etc.), which have nothing to do with Datomic.

If, on the other hand, your system does require huge write scalability, or you have a large number of peers, each working with more data than fits in its memory (forcing many data segments to be fetched from storage), you'll need a storage system that can scale horizontally, e.g. DynamoDB. As mentioned in one of the comments, if you need arbitrary write scalability, Datomic is not the right system for you anyway.

a2ndrade answered Nov 01 '22 23:11
