Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best (NoSQL?) DB for small docs/records, unchanging data, lots of writes, quick reads?

Tags:

sql

nosql

I found a few questions in the same vein as this, but they did not include much detail on the nature of the data being stored, how it is queried, etc... so I thought this would be worthwhile to post.

My data is very simple, three fields: - a "datetimestamp" value (date/time) - two strings, "A" and "B", both < 20 chars

My application is very write-heavy (hundreds per second). All writes are new records; once inserted, the data is never modified.

Regular reads happen every few seconds, and are used to populate some near-real-time dashboards. I query against the date/time value and one of the string values. e.g. get all records where the datetimestamp is within a certain range and field "B" equals a specific search value. These queries typically return a few thousand records each.

Lastly, my database does not need to grow without limit; I would be looking at purging records that are 10+ days old either by manually deleting them or using a cache-expiry technique if the DB supported one.

I initially implemented this in MongoDB, without being aware of the way it handles locking (writes block reads). As I scale, my queries are taking longer and longer (30+ seconds now, even with proper indexing). Now with what I've learned, I believe that the large number of writes are starving out my reads.

I've read the kkovacs.eu post comparing various NoSQL options, and while I learned a lot I don't know if there is a clear winner for my use case. I would greatly appreciate a recommendation from someone familiar with the options.

Thanks in advance!

like image 307
Brad Gagne Avatar asked May 16 '12 00:05

Brad Gagne


People also ask

Is NoSQL good for writes?

Furthermore, NoSQL supports achieving high write throughput (by employing memory caches and append-only storage semantics), low read latencies (through caching and smart storage data models), and flexibility (with schema-less design and denormalization).

Is MongoDB the best NoSQL database?

Both MongoDB and DynamoDB are capable NoSQL products that offer the performance needed for mission-critical applications. MongoDB runs on multiple cloud platforms, scales horizontally across geographies, and has extensive performance monitoring capabilities.

Why is MongoDB the best NoSQL database?

MongoDB is better than other SQL databases because it allows a highly flexible and scalable document structure. For example: One data document in MongoDB can have five columns and the other one in the same collection can have ten columns.


1 Answers

I have faced a problem like this before in a system recording process control measurements. This was done with 5 MHz IBM PCs, so it is definitely possible. The use cases were more varied—summarization by minute, hour, eight-hour-shift, day, week, month, or year—so the system recorded all the raw data, but is also aggregated on the fly for the most common queries (which were five minute averages). In the case of your dashboard, it seems like five minute aggregation is also a major goal.

Maybe this could be solved by writing a pair of text files for each input stream: One with all the raw data; another with the multi-minute aggregation. The dashboard would ignore the raw data. A database could be used, of course, to do the same thing. But simplifying the application could mean no RDB is needed. Simpler to engineer and maintain, easier to fit on a microcontroller, embedded system, etc., or a more friendly neighbor on a shared host.

like image 127
wallyk Avatar answered Oct 08 '22 12:10

wallyk