I have a Java web application that receives real-time events and pushes them to the user interface layer. I want to log all of these events, and since the volume of information will be huge, I would prefer to use a NoSQL database.
I have set up MongoDB for this purpose, inserting one document per event. The problem is that this approach (a disk access per event) slows the whole process down dramatically.
So, what approaches can I take in this situation? What options are available in MongoDB for this (e.g. bulk inserting, async inserting, caching, ...)? Would switching to some other NoSQL db implementation make a difference? What are the best practices here?
I waited for some time to see other answers, but lost my patience. I have used MongoDB as log storage on three projects (two in Java and one in C#). Based on that experience, I can offer the following rules for organizing logging:
Don't use indexes. If you mostly write, indexes cause performance degradation. If you need to post-process the logs for analysis, copy the information to another database or collection. Unfortunately, you cannot get rid of the primary key _id - just leave it as is (GUID) or replace it with an auto-increment NumberLong.
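A minimal sketch of the auto-increment idea, assuming the MongoDB Java driver's org.bson.Document; the class name and fields are made up for illustration, and counter recovery across restarts is deliberately left out:

```java
import java.util.concurrent.atomic.AtomicLong;

import org.bson.Document;

public class LogIdGenerator {
    private final AtomicLong nextId = new AtomicLong();

    public Document newEntry(String level, String message) {
        // incrementAndGet() returns a primitive long, which the driver
        // serializes as NumberLong instead of a generated ObjectId.
        return new Document("_id", nextId.incrementAndGet())
                .append("level", level)
                .append("message", message);
    }
}
```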
Lower the write concern. MongoDB has rich options for controlling the acknowledgement of write operations, and you can map log levels onto write rules. For example, DEBUG, INFO, and WARN can go with WriteConcern.UNACKNOWLEDGED, while ERROR and FATAL are stored with WriteConcern.ACKNOWLEDGED. This way you improve application performance by avoiding pauses while low-priority messages are written, and at the same time you can be sure that the important (and rare) messages reach storage.
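A minimal sketch of that mapping, assuming the modern MongoDB Java driver (MongoCollection.withWriteConcern); the LogWriter class and the level strings are illustrative:

```java
import com.mongodb.WriteConcern;
import com.mongodb.client.MongoCollection;

import org.bson.Document;

public class LogWriter {
    private final MongoCollection<Document> fireAndForget;
    private final MongoCollection<Document> acknowledged;

    public LogWriter(MongoCollection<Document> logs) {
        // withWriteConcern returns a view of the same collection with a
        // different write concern, so both views can be prepared up front.
        this.fireAndForget = logs.withWriteConcern(WriteConcern.UNACKNOWLEDGED);
        this.acknowledged = logs.withWriteConcern(WriteConcern.ACKNOWLEDGED);
    }

    public void log(String level, Document entry) {
        if ("ERROR".equals(level) || "FATAL".equals(level)) {
            // Rare, important messages: wait for the server's confirmation.
            acknowledged.insertOne(entry);
        } else {
            // DEBUG/INFO/WARN: fire and forget, no pause in the hot path.
            fireAndForget.insertOne(entry);
        }
    }
}
```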
Cache your collection instance. That is, avoid resolving Mongo's objects via getDB or getCollection every time a message arrives.
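For example, a sketch assuming the modern Java driver; the database and collection names are placeholders:

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;

import org.bson.Document;

public final class LogCollectionHolder {
    // Resolved once at class load; every caller reuses the same instance.
    private static final MongoClient CLIENT = MongoClients.create();
    private static final MongoCollection<Document> LOGS =
            CLIENT.getDatabase("logging").getCollection("events");

    private LogCollectionHolder() {
    }

    public static MongoCollection<Document> logs() {
        return LOGS;
    }
}
```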
Minimize the amount of data passed over the network. Restrict each message to a minimal set of fields and truncate overly long stack traces. Look at how Spring 3.x shortens a fully qualified class name: s.w.s.m.m.a.RequestMappingHandlerMapping instead of some.whatever.sub.main.minimal.agent.RequestMappingHandlerMapping.
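A rough sketch of both ideas; the helper names and the 2000-character cap are assumptions, not anything taken from Spring:

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public final class LogCompaction {
    private static final int MAX_STACK_CHARS = 2000; // assumed cap

    private LogCompaction() {
    }

    // Render the stack trace, then cut it off past the cap.
    public static String truncatedStackTrace(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw, true));
        String full = sw.toString();
        return full.length() <= MAX_STACK_CHARS
                ? full
                : full.substring(0, MAX_STACK_CHARS) + "... [truncated]";
    }

    // "some.whatever.sub.main.minimal.agent.Foo" -> "s.w.s.m.m.a.Foo"
    public static String abbreviate(String fqcn) {
        String[] parts = fqcn.split("\\.");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length - 1; i++) {
            sb.append(parts[i].charAt(0)).append('.');
        }
        return sb.append(parts[parts.length - 1]).toString();
    }
}
```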