What are the options for Hadoop on Scala?

We are starting a big-data analytics project and are considering adopting Scala (Typesafe stack). I would like to know which Scala APIs and projects are available for writing Hadoop MapReduce programs.

asked Jan 30 '13 04:01

prassee

1 Answer

Definitely check out Scalding. Speaking as a user and occasional contributor, I've found it to be a very useful tool. The Scalding API is designed to stay close to the standard Scala collections API. Just as you can call flatMap, map, or groupBy on normal collections, you can do the same on Scalding Pipes, which you can think of as a distributed List of tuples. There's also a typed version of the API which provides stronger type-safety guarantees. I haven't used Scoobi, but its API looks similar to Scalding's.
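To make the collections analogy concrete, here is a minimal word-count sketch that uses only the standard Scala collections, with no Scalding dependency. The object name `WordCountSketch`, the helper `wordCount`, and the sample input are made up for illustration. A Scalding job would presumably have the same flatMap / groupBy / size shape, applied to a Pipe instead of a List, but check the Scalding documentation for the exact calls.

```scala
// Plain Scala collections only; names here are illustrative, not part of any library.
object WordCountSketch {

  // Split each line into words, group identical words, and count each group.
  def wordCount(lines: List[String]): Map[String, Int] =
    lines
      .flatMap(line => line.toLowerCase.split("\\s+").toList) // one line -> many words
      .filter(_.nonEmpty)                                     // drop empty tokens
      .groupBy(identity)                                      // word -> all its occurrences
      .map { case (word, occurrences) => word -> occurrences.size }

  def main(args: Array[String]): Unit = {
    val lines = List("hadoop on scala", "scala on hadoop")
    // Sort by word so the printed order is stable.
    wordCount(lines).toSeq.sortBy(_._1).foreach { case (word, n) =>
      println(s"$word\t$n")
    }
  }
}
```

The point is that the transformation chain reads the same whether the data is an in-memory List or a distributed dataset; only the source and sink change.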

Additionally, there are a few other benefits:

Scalding is heavily used in production at Twitter and has been battle-tested on Twitter-scale datasets.
It has several active contributors, both inside and outside Twitter, who are committed to making it great.
It is interoperable with your existing Cascading jobs.
In addition to the Typed API, it has a Fields API, which may be more familiar to users of R and data-frame frameworks.
It provides a robust Matrix Library.

answered Nov 13 '22 08:11

arkajit
