NoSQL databases - good candidates for log processing/aggregation and rollup? [closed]

I have an MS SQL database that's used to capture bandwidth stats. We have a raw data table and, to improve reporting speed at different drill-down levels, we aggregate and roll up data on an hourly, daily and weekly basis into separate tables.
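To make the pattern concrete, here is a minimal, database-agnostic Python sketch of that kind of hourly rollup, assuming raw samples are simply (timestamp, bytes) rows; all names are illustrative:

```python
from collections import defaultdict
from datetime import datetime

# Illustrative raw samples: (timestamp, bytes transferred) rows.
raw_samples = [
    (datetime(2011, 1, 20, 10, 15), 512),
    (datetime(2011, 1, 20, 10, 45), 2048),
    (datetime(2011, 1, 20, 11, 5), 1024),
]

def rollup_hourly(samples):
    """Aggregate raw rows into per-hour totals (the 'hourly table')."""
    buckets = defaultdict(int)
    for ts, nbytes in samples:
        # Truncate each timestamp to its hour and sum bytes per bucket.
        buckets[ts.replace(minute=0, second=0, microsecond=0)] += nbytes
    return dict(buckets)

print(rollup_hourly(raw_samples))  # two hourly buckets: 2560 and 1024 bytes
```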

Would a NoSQL database such as Mongo or Raven be a good candidate for this type of application?

asked Jan 20 '11 by Kev


1 Answer

Different NoSQL solutions solve different problems for different uses, so first off the best thing to do is look at your problem and break it down:

  • You are writing heavily to storage, therefore write speed is important to you
  • You want to perform aggregation operations on that data and have the results of that easily queryable
  • Read speed isn't that important from the sound of things, at least not in a "web application has to be really responsive for millions of people" kind of way
  • I don't know if you need dynamic queries or not

Let's look at Raven, Mongo and Couch in a very high-level, generalised way:

Raven

  • Fast writes
  • Fast queries (eventually consistent, pre-computed aggregation via map/reduce)
  • Dynamic queries are possible, but not really appropriate to your use case, as you're most likely going to be querying by date, etc.
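Raven's indexes are actually defined in C# on the server, so the following is not RavenDB's API; it is just a hedged Python toy model of what the bullets above describe: writes only append, a background indexer folds them into a pre-computed hourly rollup, and queries read that rollup, which makes them fast but eventually consistent:

```python
import threading
import time
from collections import defaultdict
from datetime import datetime

class PrecomputedHourlyIndex:
    """Toy model of an eventually consistent, pre-computed rollup index."""

    def __init__(self):
        self._log = []                   # raw writes, append-only
        self._indexed_upto = 0           # how far the indexer has read
        self._hourly = defaultdict(int)  # the pre-computed rollup
        self._lock = threading.Lock()
        threading.Thread(target=self._index_loop, daemon=True).start()

    def write(self, ts: datetime, nbytes: int) -> None:
        # Fast write: just append, no aggregation work on this path.
        with self._lock:
            self._log.append((ts, nbytes))

    def query_hour(self, hour: datetime) -> int:
        # Fast query: the rollup is already computed, but it may lag
        # the latest writes (eventual consistency).
        with self._lock:
            return self._hourly[hour]

    def _index_loop(self) -> None:
        # The "indexer": periodically reduces new log entries into
        # per-hour totals, a stand-in for a map/reduce index rebuild.
        while True:
            time.sleep(0.1)
            with self._lock:
                for ts, nbytes in self._log[self._indexed_upto:]:
                    hour = ts.replace(minute=0, second=0, microsecond=0)
                    self._hourly[hour] += nbytes
                self._indexed_upto = len(self._log)
```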

Mongo

  • Blindingly fast writes (dangerously so, in my opinion, because a power failure can mean lost data ;-))
  • Relatively slow reads; aggregation via map/reduce, not pre-computed
  • Dynamic queries are simply how Mongo is used, but you'll probably have to define indexes on your fields if you want any sort of performance on this sort of data (sketched below)
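As a concrete sketch with pymongo, with collection and field names as assumptions: at the time of this answer the aggregation would have been done with Mongo's map/reduce, but the same query-time hourly rollup is expressed today with the aggregation pipeline:

```python
from datetime import datetime

import pymongo

client = pymongo.MongoClient("mongodb://localhost:27017")
samples = client.bandwidth.raw_samples  # assumed database/collection names

# Per the bullet above: an index on the timestamp field is what makes
# date-bounded queries like the $match below perform acceptably.
samples.create_index([("ts", pymongo.ASCENDING)])

# Roll raw samples up into per-hour totals. Note this happens at query
# time; unlike Raven/Couch, nothing is pre-computed for you.
start, end = datetime(2011, 1, 20), datetime(2011, 1, 21)
hourly = samples.aggregate([
    {"$match": {"ts": {"$gte": start, "$lt": end}}},
    {"$group": {
        "_id": {
            "year": {"$year": "$ts"},
            "month": {"$month": "$ts"},
            "day": {"$dayOfMonth": "$ts"},
            "hour": {"$hour": "$ts"},
        },
        "total_bytes": {"$sum": "$bytes"},
    }},
])
for bucket in hourly:
    print(bucket["_id"], bucket["total_bytes"])
```

The data-loss caveat above refers to Mongo's old fire-and-forget write default; modern drivers default to acknowledged writes, which softens that concern.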

Couch

  • Fast writes
  • Fast-ish reads (pre-computed, but updated only when you read, IIRC)
  • Dynamic queries not possible; all queries are pre-defined via map or map/reduce functions
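In CouchDB those pre-defined views live in a design document as JavaScript map/reduce. The sketch below talks to Couch's HTTP API via Python's requests library to install an hourly-rollup view and query it; the database name and the doc.ts/doc.bytes fields are assumptions, with ts taken to be an ISO-8601 string:

```python
import requests

COUCH = "http://localhost:5984"
DB = "bandwidth"  # assumed database name

# The view must be pre-defined: the map function emits an hour bucket
# per sample document, and the built-in _sum reduce totals the bytes.
# There are no ad-hoc dynamic queries.
design = {
    "views": {
        "hourly": {
            "map": (
                "function(doc) {"
                "  emit(doc.ts.substring(0, 13), doc.bytes);"  # e.g. '2011-01-20T10'
                "}"
            ),
            "reduce": "_sum",
        }
    }
}
requests.put(f"{COUCH}/{DB}/_design/rollups", json=design)

# Reading the view is what triggers indexing of new documents (by
# default), and group=true returns one total row per hour bucket.
rows = requests.get(
    f"{COUCH}/{DB}/_design/rollups/_view/hourly",
    params={"group": "true"},
).json()["rows"]
for row in rows:
    print(row["key"], row["value"])
```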

So, basically: do you need dynamic queries over this sort of data? Is read speed incredibly important to you? If you need dynamic queries then you'll want Raven or Mongo (for this sort of thing, Couch is probably not what you're looking for anyway).

FWIW, in my opinion logging is the one use case Mongo really IS suited for, so you might have an answer there.

answered Oct 12 '22 by Rob Ashton