
How can Datomic users cope without composite indexes?

In Datomic, how do you efficiently perform queries such as 'find all people living in Washington older than 50' (where the city and age may vary)? In relational databases and most NoSQL databases you use composite indexes for this purpose; Datomic, as far as I'm aware, does not support anything like this.

I have built several medium-sized web apps, and not a single one would have performed quickly enough without composite indexes. How are Datomic users dealing with this? Or are they just working with datasets small enough not to suffer from it? Am I missing something?

Tomas Kulich asked Jul 03 '14 19:07



1 Answer

This problem and its solution take a different shape in Datomic because of how data (datoms) is structured. Two performance characteristics/strategies may add some nuance here:

(1) When you fetch data in Datomic, you fetch an entire leaf segment from the index tree (not an individual item) - with segments being composed of potentially many thousands of datoms. This is then cached automatically so that you don't have to reach out over the network to get more datoms.

If you're querying a single person (i.e., a single entity) for their age and where they live, it's very likely that the query's navigation of the EAVT or AEVT indexes has already cached everything you need. You've effectively cached the datom, the path used to navigate to it, and related datoms (by locality in the index).
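To make the caching point concrete, here is a minimal Python sketch of the idea. It is a toy model and not Datomic's API: the `SegmentCache` class, the tiny `SEGMENT_SIZE`, and the sample attributes are all assumptions made for illustration. It only shows that reading one datom pulls in its whole segment, so neighbouring datoms for the same entity are served from the cache.

```python
from bisect import bisect_right

SEGMENT_SIZE = 4  # toy value; real segments may hold thousands of datoms

# Toy EAVT-style index: (entity, attribute, value) tuples in sorted order.
datoms = sorted([
    (1, "person/age", 52), (1, "person/city", "Washington"), (1, "person/name", "Ann"),
    (2, "person/age", 31), (2, "person/city", "Boston"), (2, "person/name", "Bob"),
    (3, "person/age", 67), (3, "person/city", "Washington"), (3, "person/name", "Cy"),
])

# Split the sorted index into fixed-size leaf segments.
segments = [datoms[i:i + SEGMENT_SIZE] for i in range(0, len(datoms), SEGMENT_SIZE)]
# First (entity, attribute) key of each segment, used to locate a segment.
segment_starts = [seg[0][:2] for seg in segments]


class SegmentCache:
    """Hypothetical cache that fetches whole segments, never single datoms."""

    def __init__(self):
        self.cache = {}
        self.fetches = 0  # counts simulated trips to storage

    def segment_for(self, key):
        idx = max(bisect_right(segment_starts, key) - 1, 0)
        if idx not in self.cache:
            self.fetches += 1  # simulated network fetch of one segment
            self.cache[idx] = segments[idx]
        return self.cache[idx]

    def lookup(self, entity, attribute):
        for e, a, v in self.segment_for((entity, attribute)):
            if (e, a) == (entity, attribute):
                return v
        return None


cache = SegmentCache()
age = cache.lookup(1, "person/age")    # first read fetches a segment
city = cache.lookup(1, "person/city")  # same segment, served from the cache
fetches_after_entity_1 = cache.fetches
name_3 = cache.lookup(3, "person/name")  # lives in a different segment
```

In this model the second lookup for entity 1 costs no additional fetch, which is the locality effect described above; a real peer's cache behaviour is more involved.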

(2) Partitions can provide a manual means to specify locality of reference. A partition affects the entity ID's value (the partition is encoded in the high bits), which ensures that related entities are sorted near each other. So, as an alternative approach to the above problem, if you needed information from both the city and person entities, you could place them in the same partition.
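As a rough illustration of why high-bit encoding gives locality, the Python sketch below packs a partition number into the upper bits of an integer ID and then sorts the IDs. The shift width, the partition numbers, and the helper names `make_entity_id` and `partition_of` are assumptions for this example and are not taken from Datomic's API.

```python
PARTITION_SHIFT = 42  # assumed bit offset, chosen for illustration only


def make_entity_id(partition, local_id):
    """Pack a partition number into the high bits of an entity ID."""
    return (partition << PARTITION_SHIFT) | local_id


def partition_of(entity_id):
    """Recover the partition number from the high bits."""
    return entity_id >> PARTITION_SHIFT


# Hypothetical partitions, each holding a city entity and its residents.
WASHINGTON, BOSTON = 5, 6

# IDs created in interleaved order, as they might be over time.
ids = [
    make_entity_id(BOSTON, 1),      # city entity for Boston
    make_entity_id(WASHINGTON, 2),  # city entity for Washington
    make_entity_id(BOSTON, 3),      # a person in Boston
    make_entity_id(WASHINGTON, 4),  # a person in Washington
]

# Sorting by ID groups entities by partition, regardless of creation order.
sorted_partitions = [partition_of(e) for e in sorted(ids)]
```

Because the partition occupies the most significant bits, any index sorted by entity ID keeps entities from the same partition adjacent, so they tend to fall into the same segments.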

Ben Kamphaus answered Oct 11 '22 17:10