Specifically, any open source implementations, at any degree of usefulness, in the following languages:
1) C++
2) Python
3) Ruby
4) C#
The Hadoop framework is written in Java, but Hadoop programs can also be written in other languages such as Python or C++, for example through Hadoop Streaming (any executable that reads standard input and writes standard output) or Hadoop Pipes (C++). This means data architects who already know Python don't have to learn Java.
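As a rough illustration of what a Python job can look like, here is a minimal word-count sketch in the style Hadoop Streaming expects: the mapper and reducer exchange tab-separated `key<TAB>value` text records, and the reducer assumes its input arrives sorted by key. The function names `map_lines`, `reduce_lines`, and `simulate_locally` are made up for this example and are not part of any Hadoop API.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record per word in the input."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(sorted_records):
    """Reducer: sum the counts per word. Input must be sorted by key."""
    parsed = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def simulate_locally(lines):
    """Hypothetical helper: run map, sort, reduce in one process for testing."""
    return list(reduce_lines(sorted(map_lines(lines))))
```

In a real streaming job the two functions would live in separate scripts that read from `sys.stdin` and print to standard output, with the framework doing the sort between them; `simulate_locally` only mimics that sort step so the logic can be checked without a cluster.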
The Hadoop framework itself is mostly written in the Java programming language, with some native code in C and command line utilities written as shell scripts.
Apache Hadoop is an open source, Java-based software platform that manages data processing and storage for big data applications. The platform works by distributing Hadoop big data and analytics jobs across nodes in a computing cluster, breaking them down into smaller workloads that can be run in parallel.
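The idea of breaking a job into smaller workloads that run in parallel can be sketched on a single machine. This is only an analogy, not Hadoop code: a thread pool stands in for cluster nodes, and the names `split_into_chunks`, `count_words`, and `run_job` are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, chunk_size):
    """Break the input into smaller workloads of at most chunk_size records."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def count_words(chunk):
    """Process one workload independently of the others."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def run_job(records, chunk_size=2, workers=4):
    """Distribute chunks to workers, then merge the partial results."""
    chunks = split_into_chunks(records, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

Because each chunk is counted independently, the merge step gives the same totals no matter how the input was split, which is the property that lets a framework spread the work across many machines.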
Hadoop MapReduce stores intermediate results on disk, so its performance depends on disk read and write operations, and overly large data fragments can create bottlenecks. Spark is often much faster because it can process data in memory; it also ships with MLlib, its machine learning library.
The German Wikipedia lists some software examples for each language. I'm translating:
Source
For Python there is Disco from Nokia: http://discoproject.org/