
Hadoop beginner's question

I've read some documentation about Hadoop and seen the impressive results. I get the bigger picture but am finding it hard to judge whether it would fit our setup. The question isn't programming related, but I'm eager to get the opinion of people who currently work with Hadoop on how it would fit our setup:

  • We use Oracle for the backend
  • Java (Struts2/Servlets/iBatis) for the frontend
  • Every night we receive data that needs to be summarized. This runs as a batch process (takes 5 hours)

We are looking for a way to cut those 5 hours to a shorter time.

Where would Hadoop fit into this picture? Can we continue to use Oracle after adopting Hadoop?

Omnipresent asked Mar 19 '10 23:03



1 Answer

The chances are you can dramatically reduce the elapsed time of that batch process with some straightforward tuning. I offer this analysis on the simple basis of past experience. Batch processes tend to be written very poorly, precisely because they are autonomous and so don't have irate users demanding better response times.

Certainly I don't think it makes any sense at all to invest a lot of time and energy re-implementing your application in a new technology - no matter how fresh and cool it may be - until you have exhausted the capabilities of your current architecture.

If you want some specific advice on how to tune your batch query, well, that would be a new question.

APC answered Sep 18 '22 21:09
