So, I've been looking at Hadoop with keen interest, and to be honest I'm fascinated. Things don't get much cooler.
My only minor issue is I'm a C# developer and it's in Java.
It's not that I don't understand Java; it's that I'm looking for a Hadoop.net, an NHadoop, or whatever .NET project embraces the Google MapReduce approach. Does anyone know of one?
Apache Spark. Often described as the de facto successor to Hadoop MapReduce, Apache Spark can be used as a computational engine for data stored in Hadoop. Compared with Hadoop MapReduce, Spark is generally faster, largely because it keeps intermediate data in memory instead of writing it to disk between stages.
Is Hadoop dead altogether? No. Many organizations still use Apache Hadoop as a data analytics solution, and one key indicator is that all the major cloud providers actively support Apache Hadoop clusters on their platforms.
Hadoop is an open-source framework that stores large datasets across clusters of similar machines. It is an Apache project, used by many different organizations and supported by a large community that contributes code.
Microsoft is contributing to Hadoop. Services like Azure Data Lake Analytics and the largest internal data lake now run on Apache Hadoop and YARN.
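Have you looked at Hadoop Streaming? It lets you write the mapper and the reducer as any executable that reads lines from stdin and writes lines to stdout, so you are not tied to Java.
I use it from Python all the time :-).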
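To make that concrete, here is a minimal word-count sketch of the kind of script Hadoop Streaming can run. It assumes the usual streaming contract: the mapper writes tab-separated `key<TAB>value` lines, and the reducer receives those lines sorted by key. The function names and the `map`/`reduce` command-line argument are my own illustration, not part of Hadoop.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit 'word<TAB>1' for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts per word.

    Relies on the input being sorted by key, which is what the
    shuffle/sort phase provides between the map and reduce steps.
    """
    pairs = (
        line.rstrip("\n").split("\t", 1)
        for line in lines
        if "\t" in line  # skip blank or malformed lines
    )
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical convention: run as "wordcount.py map" or "wordcount.py reduce".
    stage = map_lines if sys.argv[1] == "map" else reduce_lines
    for out in stage(sys.stdin):
        print(out)
```

You can try it locally without a cluster by piping a file through the mapper, `sort`, and then the reducer. On a cluster the same script would be passed to the streaming jar through its `-mapper` and `-reducer` options, alongside `-input` and `-output`; the jar's location depends on your installation. Since the contract is only stdin and stdout, a .NET console application could in principle fill the same role as this script.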
I'm starting to see that the heterogeneous approach is often the best, and it looks like other folks are doing the same.
If you look at projects like Protocol Buffers or Facebook's Thrift, you see that sometimes it's just best to use an app written in another language and build the glue in the language of your preference.