I am working on a project that logs a lot of information about viewers of an online streaming platform. The problem with the current MySQL solution is that it is too slow to query. Even with scaling and better performance tuning, it will not work, because there is simply too much data being written and read in real time.
What would be a good (ideally the best) NoSQL solution for me?
Extra:
Not exactly a NoSQL solution, but have you looked at Scribe (from Facebook)? You can use http://code.google.com/p/scribe-log4j/ to write to it from Java.
I would spend some time looking at these options:
All of these solutions have their pros and cons, but their wikis should provide enough information to get you started.
The first challenge you may face is how to collect a huge amount of data reliably while keeping it easy to manage. There are several open-source log collectors, such as syslog, Fluentd, Scribe, and Flume :)
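To make "reliably" a bit more concrete: collectors of this kind generally buffer events and retry delivery when the downstream store is unavailable. The following is only a minimal Python sketch of that idea, not the implementation of any of the tools above. The names `forward_with_retry`, `send`, `max_attempts` and `base_delay` are made up for illustration.

```python
import time
from typing import Callable, Dict, List


def forward_with_retry(
    send: Callable[[List[Dict]], None],
    batch: List[Dict],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try to deliver a batch, backing off exponentially between attempts.

    `send` is a hypothetical callable that raises an exception on failure.
    Returns True if the batch was delivered, False if every attempt failed.
    """
    for attempt in range(max_attempts):
        try:
            send(batch)
            return True
        except Exception:
            # A real collector would catch narrower errors and log them.
            if attempt == max_attempts - 1:
                break
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))
    return False
```

If this returns `False`, the caller still holds the batch and can spill it to local disk instead of dropping it, which is roughly what a file-backed buffer in a log collector is for.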
The bigger problem is how to store and process the data. As you pointed out, a NoSQL solution works really well, but you need to choose among them depending on your data volume.
At first, you can use MongoDB to store all of your data, but at some point you will likely end up using Apache Hadoop to build a massively scalable architecture.
The point here is that you should have a distributed logging layer that abstracts away the storage backend, and then choose the right NoSQL solution for your data volume.
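As a rough illustration of what "abstracting away the storage backend" can look like in application code, here is a small Python sketch. Everything in it (`LogBackend`, `InMemoryBackend`, `JsonLinesBackend`, `ViewerLogger`, the record fields) is a hypothetical name chosen for the example; in practice a collector such as Fluentd plays this role, and a MongoDB or HDFS backend would sit behind the same interface.

```python
import json
from abc import ABC, abstractmethod
from typing import Dict, List


class LogBackend(ABC):
    """Storage-agnostic interface: the application only sees write_batch."""

    @abstractmethod
    def write_batch(self, records: List[Dict]) -> None:
        ...


class InMemoryBackend(LogBackend):
    """Stand-in backend for tests; keeps records in a list."""

    def __init__(self) -> None:
        self.records: List[Dict] = []

    def write_batch(self, records: List[Dict]) -> None:
        self.records.extend(records)


class JsonLinesBackend(LogBackend):
    """Appends one JSON document per line to a local file."""

    def __init__(self, path: str) -> None:
        self.path = path

    def write_batch(self, records: List[Dict]) -> None:
        with open(self.path, "a", encoding="utf-8") as f:
            for record in records:
                f.write(json.dumps(record) + "\n")


class ViewerLogger:
    """Buffers events and hands them to whichever backend is configured."""

    def __init__(self, backend: LogBackend, batch_size: int = 100) -> None:
        self._backend = backend
        self._batch_size = batch_size
        self._buffer: List[Dict] = []

    def log(self, record: Dict) -> None:
        self._buffer.append(record)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            # Pass a copy so the backend never sees later buffer changes.
            self._backend.write_batch(list(self._buffer))
            self._buffer.clear()
```

The application only ever calls `ViewerLogger.log`, so moving from one store to another as the data volume grows means writing one new `LogBackend` subclass rather than touching every call site.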
Here are some links on putting Apache logs into MongoDB or Hadoop HDFS with Fluentd.