What NoSQL DB to use for sparse Time Series like data?

I'm planning a side project where I will be dealing with Time Series like data and would like to give one of those shiny new NoSQL DBs a try and am looking for a recommendation.

For a (growing) set of symbols I will have a list of (time,value) tuples (increasing over time). Not all symbols will be updated; some symbols may be updated while others may not, and completely new symbols may be added.

The database should therefore allow:

  • Add Symbols with initial one-element (tuple) list. E.g. A: [(2012-04-14 10:23, 50)]
  • Update Symbols with a new tuple. (Append that tuple to the list of that symbol).
  • Read the data for a given symbol. (Ideally even let me specify the time frame for which the data should be returned)

The create and update operations should ideally be atomic. If reading multiple symbols at once is possible, that would be interesting too.

Performance is not critical. Updates/Creates will happen roughly once every few hours.
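To make the requirements concrete, here is a minimal in-memory sketch of the three operations, assuming (time, value) tuples arrive in increasing time order per symbol. The names (`SymbolStore`, `add`, `append`, `read`) are illustrative only and do not correspond to any particular database's API.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime


class SymbolStore:
    def __init__(self):
        # symbol -> list of (datetime, value) tuples, sorted by time
        self._series = {}

    def add(self, symbol, entry):
        """Create a symbol with an initial one-element list."""
        self._series[symbol] = [entry]

    def append(self, symbol, entry):
        """Append a (time, value) tuple to an existing symbol's list."""
        self._series[symbol].append(entry)

    def read(self, symbol, start=None, end=None):
        """Read a symbol's data, optionally restricted to the
        time frame [start, end] via binary search on the sorted times."""
        data = self._series[symbol]
        times = [t for t, _ in data]
        lo = 0 if start is None else bisect_left(times, start)
        hi = len(data) if end is None else bisect_right(times, end)
        return data[lo:hi]


store = SymbolStore()
store.add("A", (datetime(2012, 4, 14, 10, 23), 50))
store.append("A", (datetime(2012, 4, 14, 12, 0), 52))
```

Any candidate database would need to offer equivalents of these three calls, with the create and append steps being atomic.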

asked Apr 14 '12 by angerman


1 Answer

I believe virtually all the major NoSQL databases will support those requirements, especially since you don't actually have a large volume of data (which raises the question: why NoSQL at all?).

That said, I've recently had to design and work with a NoSQL database for time series data, so I can give some input on that design, which can then be extrapolated to other databases.

Our chosen database was Cassandra, and our design was as follows:

  • A single keyspace for all 'symbols'
  • Each symbol was a new row
  • Each time entry was a new column for that relevant row
  • Each value (can be more than a single value) was the value part of the time entry

This lets you achieve everything you asked for, most notably reading the data for a single symbol, with a range if necessary (column range calls). Although you said performance wasn't critical, it was for us, and this design was quite performant too: all data for any single symbol is by definition sorted (column name sort) and always stored on the same node (no cross-node communication for simple queries). Finally, this design translates well to other NoSQL databases that have dynamic columns.
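The wide-row layout above can be modelled in a few lines, with one dict playing the role of the keyspace. This is only an illustrative sketch of how sorted column names make range reads cheap; it is not Cassandra driver code, and the timestamps are ISO strings precisely because their lexicographic order matches chronological order.

```python
from bisect import bisect_left, bisect_right

# keyspace: row key (symbol) -> columns, each (column name = timestamp, value),
# kept sorted by column name, as Cassandra does within a row
keyspace = {
    "A": [
        ("2012-04-14 10:23", 50),
        ("2012-04-14 12:00", 52),
        ("2012-04-15 09:10", 49),
    ],
    "B": [("2012-04-14 11:00", 7)],
}


def column_range(symbol, start, end):
    """Mimic a column range call: return the columns of one row whose
    names (timestamps) fall within [start, end]. Because columns are
    sorted by name, this is a binary-search slice, not a scan."""
    columns = keyspace[symbol]
    names = [name for name, _ in columns]
    return columns[bisect_left(names, start):bisect_right(names, end)]
```

Because each row lives on one node and its columns are stored sorted, a single-symbol range query touches one node and one contiguous slice.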

Further to this, here's some information on using MongoDB (and capped collections if necessary) for a time series store: MongoDB as a Time Series Database
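For the MongoDB route, one commonly described layout is to bucket each symbol's samples per day so appends become in-place array pushes and reads touch few documents. The sketch below shows only the document shape as plain Python dicts (the bucket key format and field names `t`/`v` are my own choices, not from the linked article, and pymongo calls are deliberately omitted).

```python
# One day-bucket document per symbol per day; appends push into "samples"
day_bucket = {
    "_id": "A:2012-04-14",   # illustrative bucket key: symbol + day
    "symbol": "A",
    "day": "2012-04-14",
    "samples": [
        {"t": "2012-04-14 10:23", "v": 50},
        {"t": "2012-04-14 12:00", "v": 52},
    ],
}


def append_sample(bucket, t, v):
    """Model of appending one (time, value) tuple to a bucket."""
    bucket["samples"].append({"t": t, "v": v})
```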

Finally, here's a discussion of SQL vs NoSQL for time series: https://dba.stackexchange.com/questions/7634/timeseries-sql-or-nosql

I can add to that discussion the following:

  • The learning curve for NoSQL will be steeper; the added flexibility and functionality are not free in terms of 'soft costs'. Who will be supporting this database operationally?
  • If you expect this functionality to grow in future (either as more fields added to each time entry, or much larger capacity in terms of number of symbols or size of each symbol's time series), then definitely go with NoSQL. The flexibility benefit is huge, and the scalability you get (with the above design) on both the 'per symbol' and 'number of symbols' axes is almost unbounded (I say almost: the maximum number of columns per row is in the billions, and the maximum number of rows per keyspace is unbounded, I believe).
answered Nov 16 '22 by yamen