
Hadoop MapReduce InputFormat Deprecated?

I need to implement a custom (service) input source for a Hadoop MapReduce application. After searching Google and Stack Overflow, I found that one way to proceed is to implement a custom InputFormat. Is that correct?

According to http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/InputFormat.html, InputFormat's methods getRecordReader() and getSplits() are deprecated. What is the replacement?

Hadoop's WordCount example still uses the same...

Sri asked Apr 07 '26 02:04


2 Answers

Although Hadoop still uses classes from the mapred package internally, from the user's perspective they should pretty much all be considered deprecated. Hadoop's documentation is sparse, and its examples tend to be outdated. Luckily, when you're really stuck, there's always Stack Overflow.

dspyz answered Apr 10 '26 03:04


What happened is that in 0.20 the mapred classes were deprecated and a new API was introduced. However, the new API lacked a few core features, so the old API was 'undeprecated' in the latest release. It is advisable to use the old API, as it is most likely the one that is here to stay.
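Whichever API you pick, a custom InputFormat has the same two jobs: split the input into chunks, and hand out a reader that iterates the records of one chunk. In the old org.apache.hadoop.mapred API these are getSplits(JobConf, int) and getRecordReader(InputSplit, JobConf, Reporter); in the newer org.apache.hadoop.mapreduce API they are getSplits(JobContext) and createRecordReader(InputSplit, TaskAttemptContext). Here is a rough sketch of that contract for a service-backed source. Note that it deliberately does not use the Hadoop classes, so it runs without Hadoop on the classpath: RangeSplit, RangeRecordReader, fetchFromService and readAll are all hypothetical stand-ins, and the "service" is faked. Treat it as an illustration of the shape, not as a drop-in implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class ServiceInputSketch {

    // Hypothetical stand-in for an InputSplit: a half-open range [start, end) of record ids.
    static final class RangeSplit {
        final long start;
        final long end;

        RangeSplit(long start, long end) {
            this.start = start;
            this.end = end;
        }

        long length() {
            return end - start;
        }
    }

    // Stand-in for getSplits(): divide totalRecords into at most numSplits contiguous ranges.
    static List<RangeSplit> getSplits(long totalRecords, int numSplits) {
        if (totalRecords < 0 || numSplits <= 0) {
            throw new IllegalArgumentException("totalRecords must be >= 0 and numSplits > 0");
        }
        List<RangeSplit> splits = new ArrayList<>();
        long base = totalRecords / numSplits;
        long remainder = totalRecords % numSplits;
        long start = 0;
        for (int i = 0; i < numSplits; i++) {
            // The first `remainder` splits take one extra record each.
            long size = base + (i < remainder ? 1 : 0);
            if (size == 0) {
                break; // fewer records than requested splits
            }
            splits.add(new RangeSplit(start, start + size));
            start += size;
        }
        return splits;
    }

    // Stand-in for a RecordReader: walks the ids of one split and fetches each record.
    static final class RangeRecordReader {
        private final RangeSplit split;
        private long next;
        private long currentKey = -1;
        private String currentValue;

        RangeRecordReader(RangeSplit split) {
            this.split = split;
            this.next = split.start;
        }

        // Advances to the next record; returns false once the split is exhausted.
        boolean nextKeyValue() {
            if (next >= split.end) {
                return false;
            }
            currentKey = next;
            currentValue = fetchFromService(next);
            next++;
            return true;
        }

        long getCurrentKey() {
            return currentKey;
        }

        String getCurrentValue() {
            return currentValue;
        }

        // Fraction of this split consumed so far, between 0 and 1.
        float getProgress() {
            return split.length() == 0 ? 1f : (float) (next - split.start) / split.length();
        }
    }

    // Assumption: a real implementation would call the external service here.
    static String fetchFromService(long id) {
        return "record-" + id;
    }

    // Mimics what the framework does: one reader per split, drained in turn.
    static List<String> readAll(long totalRecords, int numSplits) {
        List<String> out = new ArrayList<>();
        for (RangeSplit split : getSplits(totalRecords, numSplits)) {
            RangeRecordReader reader = new RangeRecordReader(split);
            while (reader.nextKeyValue()) {
                out.add(reader.getCurrentKey() + "=" + reader.getCurrentValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String record : readAll(5, 2)) {
            System.out.println(record);
        }
    }
}
```

In a real job the framework runs each reader in a separate map task, and a real split class also has to be serializable by Hadoop (it implements Writable) so it can be shipped to the task. That part is left out here.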

Alex N. answered Apr 10 '26 02:04


