                 Mapper/Reducer 1 --> (key,value)
                /        |        \
               /         |         \
Mapper/Reducer 2         |         Mapper/Reducer 4
-> (oKey,oValue)         |         -> (xKey, xValue)
                         |
                         |
                 Mapper/Reducer 3
                 -> (aKey, aValue)
I have a log file, which I aggregate with MR1. Mapper2, Mapper3 and Mapper4 each take the output of MR1 as their input. The jobs are chained.
MR1 Output:
User {infos of user:[{data here},{more data},{etc}]}
..
MR2 Output:
timestamp idCount
..
MR3 Output:
timestamp loginCount
..
MR4 Output:
timestamp someCount
..
I want to combine the outputs of MR2-4 into one final output:
timestamp idCount loginCount someCount
..
..
..
Is there a way to do this without Pig or Hive? I'm using Java.
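You can do that with MultipleInputs: run one more job that takes the output directories of MR2, MR3 and MR4 as its inputs, give each input its own mapper that tags the count with its source, and let the reducer join the tagged counts by timestamp.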
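To make the join step concrete, here is a minimal sketch of the tag-and-merge logic in plain Java, with no Hadoop dependencies so it can be run on its own. It is not a complete job: the class name `CountJoin`, the tags `id`, `login` and `some`, the tab-separated output and the default of `0` for a missing count are all assumptions made for illustration. In a real job, `tag` would be the body of the per-input mappers and `join` the body of the reducer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: simulates the mappers, shuffle and reducer of the join job.
final class CountJoin {

    // Map side: turn one "timestamp count" line into {timestamp, "tag=count"}.
    // Each input (MR2, MR3, MR4 output) would use its own tag.
    static String[] tag(String tag, String line) {
        String[] parts = line.trim().split("\\s+");
        return new String[] {parts[0], tag + "=" + parts[1]};
    }

    // Reduce side: merge all tagged values of one timestamp into a fixed
    // column order. A count that is missing for a timestamp stays "0" (assumption).
    static String join(String timestamp, List<String> taggedValues) {
        String id = "0";
        String login = "0";
        String some = "0";
        for (String value : taggedValues) {
            String[] kv = value.split("=", 2);
            switch (kv[0]) {
                case "id":    id = kv[1];    break;
                case "login": login = kv[1]; break;
                case "some":  some = kv[1];  break;
                default:      break; // unknown tag: ignore
            }
        }
        return timestamp + "\t" + id + "\t" + login + "\t" + some;
    }

    // Stand-in for the shuffle: group tagged values by timestamp, then join.
    static List<String> run(List<String> mr2, List<String> mr3, List<String> mr4) {
        Map<String, List<String>> grouped = new TreeMap<>();
        collect(grouped, "id", mr2);
        collect(grouped, "login", mr3);
        collect(grouped, "some", mr4);
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            out.add(join(e.getKey(), e.getValue()));
        }
        return out;
    }

    private static void collect(Map<String, List<String>> grouped,
                                String tag, List<String> lines) {
        for (String line : lines) {
            String[] kv = tag(tag, line);
            grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
    }
}
```

In the driver of the real job, each output directory would be registered with its own mapper through `MultipleInputs.addInputPath(job, path, TextInputFormat.class, SomeMapper.class)` from `org.apache.hadoop.mapreduce.lib.input`, where `SomeMapper` stands for one of three hypothetical mapper classes that emit the tagged value shown above.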