I'm looking for general information about how other people are using Hadoop or other MapReduce-like technologies. Specifically, are you writing MR applications to process existing data sets (like web server log files), or are you writing applications that generate and process new data sets?
Edit: Follow-up Questions
(1) Do you ever execute a MR program against data generated by other MR programs?
(2) Do you ever need to modify existing data sets using MR?
(3) Do you ever share your data sets with other developers?
Check out the PoweredBy page on the Hadoop wiki for examples of how everyone from Facebook to FOX News is using it.
I am analyzing existing data sets, in my case traces of programmer activity.