 

Which is better for log analysis?

I have to analyze gzip-compressed log files that are stored on a production server, using Hadoop-related tools.

I can't decide how to do that or what to use. Here are some of the methods I thought about using (feel free to recommend something else):

  • Flume
  • Kafka
  • MapReduce

Before I can do anything, I need to get the compressed files from the production server, process them, and then push them into Apache HBase.

Asked by Yaswanth, Dec 21 '25


1 Answer

Depending on the size of your logs (assuming the computation won't fit on a single machine, i.e. it requires a "big data" product), I think it would be most appropriate to go with Apache Spark. Given that you don't know much about the ecosystem, it might be best to go with Databricks Cloud, which gives you a straightforward way of reading your logs from HDFS and analyzing them with Spark transformations in a visual way (in a notebook).
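To give a sense of what that looks like, here is a minimal PySpark sketch. Spark reads `.gz` files transparently through `textFile`, so the compressed logs don't need to be decompressed by hand. The HDFS path and the "ERROR" filter are placeholder assumptions, not something from your setup:

```python
# Minimal sketch, assuming the logs have already been copied to HDFS.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("log-analysis").getOrCreate()

# Placeholder path; Spark decompresses .gz files transparently when reading.
lines = spark.sparkContext.textFile("hdfs:///logs/*.gz")

# Example transformation: count the lines that contain "ERROR".
error_count = lines.filter(lambda line: "ERROR" in line).count()
print(f"ERROR lines: {error_count}")

spark.stop()
```

In a Databricks notebook the same transformations run cell by cell, so you can iterate on the parsing logic interactively before wiring up the rest of the pipeline.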

There's an introductory video on the Databricks site, and a free trial, so you can see how that would go and then decide.

P.S. I'm in no way affiliated with Databricks. I just think they have a great product, that's all :)

Answered by Marko Bonaci, Dec 24 '25


