key-value store for time series data?

I've been using SQL Server to store historical time series data for a couple hundred thousand objects, observed about 100 times per day. I'm finding that queries (give me all values for object XYZ between time t1 and time t2) are too slow (for my needs, slow is more than a second). I'm indexing by timestamp and object ID.

I've entertained the thought of using something like a key-value store (MongoDB, for example) instead, but I'm not sure if this is an "appropriate" use of this sort of thing, and I couldn't find any mentions of using such a database for time series data. Ideally, I'd be able to do the following queries:

retrieve all the data for object XYZ between time t1 and time t2
do the above, but return one data point per day (first, last, closest to time t...)
retrieve all data for all objects for a particular timestamp

The data should be ordered, and ideally it should be fast to write new data as well as update existing data.

It seems like my desire to query by object ID as well as by timestamp might necessitate having two copies of the database indexed in different ways to get optimal performance. Does anyone have any experience building a system like this, with a key-value store, or HDF5, or something else? Or is this totally doable in SQL Server and I'm just not doing it right?

asked Nov 05 '09 21:11

toasteroven

1 Answer

It sounds like MongoDB would be a very good fit. Updates and inserts are super fast, so you might want to create a document for every event, such as:

{
   object: XYZ,
   ts : new Date()
}

Then you can index the ts field and queries will also be fast. (By the way, you can create multiple indexes on a single collection, so you shouldn't need two copies of the data indexed in different ways.)
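
As a rough sketch of what those indexes could look like, assuming the collection is named `data` as in the queries in this answer and a shell that provides `createIndex` (older shells called it `ensureIndex`). The compound key pattern on `object` and `ts` is an assumption about what would serve the per-object range query; the variable names are made up for illustration:

```javascript
// Key patterns in the form accepted by createIndex (1 = ascending).
const byObjectThenTime = { object: 1, ts: 1 }; // "object XYZ between t1 and t2"
const byTime = { ts: 1 };                      // "all objects at a timestamp"

// `db` is predefined in the mongo shell; the guard lets the file load elsewhere.
if (typeof db !== "undefined") {
  db.data.createIndex(byObjectThenTime);
  db.data.createIndex(byTime);
}
```

Both indexes live on the same collection, so the data is stored once and only the index entries are duplicated.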

How to do your three queries:

retrieve all the data for object XYZ between time t1 and time t2

db.data.find({object : XYZ, ts : {$gt : t1, $lt : t2}})
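
Since the question asks for ordered results, here is a small sketch of the same query with an explicit sort. `rangeFilter` is a hypothetical helper, the dates are made-up sample values, and the half-open interval (`$gte`/`$lt` rather than `$gt`/`$lt`) is a variation I'm assuming is acceptable, not a requirement:

```javascript
// Hypothetical helper: builds the filter for "object X between t1 and t2".
// $gte/$lt gives a half-open interval [t1, t2), so back-to-back ranges
// never return the same event twice.
function rangeFilter(objectId, t1, t2) {
  return { object: objectId, ts: { $gte: t1, $lt: t2 } };
}

// Sample bounds, for illustration only.
const t1 = new Date("2009-11-01T00:00:00Z");
const t2 = new Date("2009-11-02T00:00:00Z");
const filter = rangeFilter("XYZ", t1, t2);

// In the mongo shell, sort by ts so results come back in time order.
if (typeof db !== "undefined") {
  db.data.find(filter).sort({ ts: 1 }).forEach(printjson);
}
```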

do the above, but return one data point per day (first, last, closest to time t...)

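// first event of the day (bounded on both sides so an empty day returns nothing)
db.data.find({object : XYZ, ts : {$gte : new Date(/* start of day */), $lt : new Date(/* end of day */)}}).sort({ts : 1}).limit(1)
// last event of the day
db.data.find({object : XYZ, ts : {$gte : new Date(/* start of day */), $lt : new Date(/* end of day */)}}).sort({ts : -1}).limit(1)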
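
If you need one point per day over a longer range, one option is to fetch the sorted range once and pick the points on the client. This is only a sketch under assumptions: `firstAndLastPerDay` is a hypothetical helper, days are taken as UTC calendar days, and the input is an array of `{ object, ts }` documents already sorted by `ts`:

```javascript
// Picks the first and last event of each UTC day.
// `events` must be sorted by ts in ascending order.
function firstAndLastPerDay(events) {
  const days = new Map(); // "YYYY-MM-DD" -> { first, last }
  for (const ev of events) {
    const day = ev.ts.toISOString().slice(0, 10); // UTC calendar day
    const entry = days.get(day);
    if (entry === undefined) {
      days.set(day, { first: ev, last: ev });
    } else {
      entry.last = ev; // input is sorted, so the latest seen is the last
    }
  }
  return days;
}
```

This trades one larger read for many small per-day queries; whether that is faster depends on how many points per day you have.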

For closest to some time, you'd probably need a custom JavaScript function, but it's doable.
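
One possible way to do it, sketched under assumptions: run two limit(1) queries, one on each side of the target time, and keep whichever result is nearer. `pickCloser` is a hypothetical helper and the target date is a made-up sample value:

```javascript
// Returns whichever of `before` and `after` has a ts closer to `target`.
// Either argument may be missing when no event exists on that side.
// Ties go to the earlier event.
function pickCloser(before, after, target) {
  if (!before) return after || null;
  if (!after) return before;
  const gapBefore = target - before.ts; // Date subtraction yields milliseconds
  const gapAfter = after.ts - target;
  return gapAfter < gapBefore ? after : before;
}

// In the mongo shell, fetch one candidate from each side of the target.
if (typeof db !== "undefined") {
  const target = new Date("2009-11-05T12:00:00Z");
  const before = db.data.find({ object: "XYZ", ts: { $lte: target } })
    .sort({ ts: -1 }).limit(1).toArray()[0];
  const after = db.data.find({ object: "XYZ", ts: { $gt: target } })
    .sort({ ts: 1 }).limit(1).toArray()[0];
  printjson(pickCloser(before, after, target));
}
```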

retrieve all data for all objects for a particular timestamp

db.data.find({ts : timestamp})
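
The equality match only finds events stored with exactly that timestamp. If your observation times drift by a few seconds between objects (an assumption about your data, not something stated in the question), a small window around the timestamp may work better. `aroundTimestamp` is a hypothetical helper:

```javascript
// Builds a filter matching events within +/- toleranceMs of time t.
function aroundTimestamp(t, toleranceMs) {
  return {
    ts: {
      $gte: new Date(t.getTime() - toleranceMs),
      $lte: new Date(t.getTime() + toleranceMs),
    },
  };
}

// In the mongo shell: all objects observed within 5 seconds of a sample time.
if (typeof db !== "undefined") {
  const when = new Date("2009-11-05T12:00:00Z");
  db.data.find(aroundTimestamp(when, 5000)).forEach(printjson);
}
```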

Feel free to ask on the user list if you have any questions; someone else might be able to think of an easier way of getting closest-to-a-time events.

answered Nov 13 '22 08:11

kristina

Tags: database, time-series
