
Why is Hadoop not a real-time platform?

I have just started learning Hadoop. While going through various sites, I keep coming across the statement

"Hadoop is not a real-time platform", even here on SO.

I have been struggling with this and really can't understand it. Can anyone explain what it means?

Thanks, all.

backtrack asked Oct 28 '13 05:10



1 Answer

Hadoop was initially designed for batch processing. That means taking a large dataset as input all at once, processing it, and writing a large output. The very concept of MapReduce is geared towards batch processing, not real-time. But to be honest, this was only the case at Hadoop's beginning, and you now have plenty of opportunities to use Hadoop in a more real-time way.
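To make the batch model concrete, here is a minimal sketch in plain Python of a MapReduce-style word count. It is not Hadoop's API: the names `map_phase`, `shuffle`, `reduce_phase` and `run_batch_job` are made up for illustration, and everything runs in one process. The point is only that no result exists until the whole input has gone through every phase.

```python
from collections import defaultdict

def map_phase(records):
    """Emit a (word, 1) pair for every word in every input record."""
    for line in records:
        for word in line.split():
            yield word.lower(), 1

def shuffle(pairs):
    """Group values by key, like the shuffle step between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

def run_batch_job(records):
    """Run the whole job; the output only exists once all input is processed."""
    return reduce_phase(shuffle(map_phase(records)))

# Hypothetical input: the complete dataset must be available up front.
dataset = ["hadoop is batch", "batch is not real-time"]
result = run_batch_job(dataset)
print(result)
```

If a new line arrives after the job has finished, the only way to include it in this model is to run the job again, which is why latency is measured in whole job runs rather than per record.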

First, I think it's important to define what you mean by real-time. It could be that you're interested in stream processing, or it could be that you want to run queries on your data that return results in real-time.

For stream processing, Hadoop won't natively provide you with this kind of capability, but you can easily integrate some other projects with Hadoop:

  • Storm-YARN allows you to use Storm on your Hadoop cluster via YARN.
  • Spark integrates with HDFS, and its Spark Streaming module lets you process streaming data in near real-time.
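
As a rough illustration of what stream processing means here, the following plain-Python sketch keeps running word counts and publishes an updated snapshot after every small group of events. It does not use the Storm or Spark APIs; `micro_batches`, `running_word_counts` and the sample events are hypothetical, and a real system would read from an unbounded source rather than a list.

```python
from collections import Counter
from itertools import islice

def micro_batches(stream, batch_size):
    """Cut an iterator of events into small lists of at most batch_size events."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch

def running_word_counts(stream, batch_size=2):
    """Yield an updated snapshot of the counts after every micro-batch."""
    totals = Counter()
    for batch in micro_batches(stream, batch_size):
        for line in batch:
            totals.update(line.lower().split())
        # A result is available here, long before the stream ends.
        yield dict(totals)

# Hypothetical events standing in for a live feed.
events = ["storm on yarn", "spark on hdfs", "spark streaming"]
snapshots = list(running_word_counts(events, batch_size=2))
print(snapshots[0])
print(snapshots[-1])
```

Results become visible after each micro-batch instead of after the full dataset, and state is updated incrementally rather than recomputed from scratch.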

For real-time queries, there are also several projects that build on Hadoop:

  • Impala from Cloudera uses HDFS but bypasses MapReduce altogether, because MapReduce adds too much overhead for low-latency queries.
  • Apache Drill is another project that integrates with Hadoop to provide real-time query capabilities.
  • The Stinger project aims to make Hive itself more real-time.

There are probably other projects that would fit into the list of "Making Hadoop real-time", but these are the best-known ones.

So as you can see, Hadoop is moving more and more toward real-time and, even though it wasn't designed for that, you have plenty of opportunities to extend it for real-time purposes.

Charles Menguy answered Oct 11 '22 13:10
