I'm trying to practice some data mining algorithms using Hadoop. Can I do this with HDFS alone, or do I need to use sub-projects like Hive/HBase/Pig?
I've found a university site with some exercises and solutions for MapReduce that build only on Hadoop:
http://www.umiacs.umd.edu/~jimmylin/Cloud9/docs/index.html
Additionally there are courses from Yahoo and Google:
http://developer.yahoo.com/hadoop/tutorial/
http://code.google.com/edu/parallel/index.html
All these courses work on plain Hadoop, to answer your question.
Start with plain MapReduce at the beginner level. You can try Pig/Hive/HBase at the next level.
You will not be able to appreciate Pig/Hive/HBase unless you have struggled enough with plain MapReduce.
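To get a feel for what "plain MapReduce" means before touching a cluster, here is a rough sketch of the map / shuffle / reduce pattern in pure Python, using word count as the example. It does not use any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and on a real cluster the framework would do the shuffle step for you.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key (the framework does this between map and reduce)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["Hadoop runs MapReduce", "MapReduce runs on Hadoop"]))
```

Once this pattern feels natural, most data mining exercises come down to deciding what the mapper emits as the key and what the reducer does with the grouped values.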