I want to teach myself Hadoop and Amazon Web Services online. Are there any good university courses or tutorials on the web? I found books on Hadoop and AWS on Amazon, but I want something hands-on to try out and learn from.
P.S. I went through the Yahoo Hadoop tutorial which was very useful.
Running Hadoop on AWS: Amazon EMR is a managed service that lets you process and analyze large datasets using the latest versions of big data processing frameworks such as Apache Hadoop, Spark, HBase, and Presto on fully customizable clusters. Easy to use: you can launch an Amazon EMR cluster in minutes.
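For something hands-on, launching a cluster can be sketched with boto3, the AWS SDK for Python. This is only a minimal sketch under several assumptions: the cluster name, release label, instance type, instance count, and region are placeholder values, and the default EMR roles (EMR_DefaultRole and EMR_EC2_DefaultRole) are assumed to already exist in your account. Building the request is kept separate from sending it, so you can inspect the request without AWS credentials.

```python
from typing import Any, Dict


def build_emr_request(name: str = "hadoop-sandbox",
                      release_label: str = "emr-6.15.0",
                      instance_type: str = "m5.xlarge",
                      instance_count: int = 3) -> Dict[str, Any]:
    """Build keyword arguments for the EMR RunJobFlow API call.

    All default values here are illustrative assumptions; pick a release
    label and instance type that are available in your region.
    """
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        # Frameworks to install on the cluster.
        "Applications": [{"Name": "Hadoop"}, {"Name": "Spark"}],
        "Instances": {
            "MasterInstanceType": instance_type,
            "SlaveInstanceType": instance_type,
            "InstanceCount": instance_count,
            # True keeps the cluster running (and billing) after its steps
            # finish, so remember to terminate it when you are done.
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Assumed to exist already, e.g. via `aws emr create-default-roles`.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def launch_cluster(region: str = "us-east-1") -> str:
    """Send the request to AWS and return the new cluster (job flow) ID.

    Needs boto3 installed and AWS credentials configured; this creates
    real, billable resources.
    """
    import boto3  # imported lazily so the builder above works without it

    emr = boto3.client("emr", region_name=region)
    response = emr.run_job_flow(**build_emr_request())
    return response["JobFlowId"]


if __name__ == "__main__":
    # Only prints the request; call launch_cluster() yourself to start one.
    print(build_emr_request())
```

Running the file as-is only prints the request dictionary. Calling launch_cluster() is what actually starts a cluster, so check the EMR pricing page first.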
Hadoop 101: Unlike Amazon EMR, which is a managed cloud service, Apache Hadoop is an open-source framework for distributed data storage and processing, developed under the Apache Software Foundation. You can think of it this way: if Amazon EMR is an entire car, then Hadoop is the engine.
After all, a large number of Internet companies still use Apache Hadoop (at their scale, only the open-source version can be used).
We're pleased to announce that Amazon Simple Storage Service (Amazon S3) Access Points can now be used in Apache Hadoop 3.3.2 and any framework consuming the S3A connector or relying on the Hadoop Distributed File System (such as Apache Spark, Apache Hive, Apache Pig, and Apache Flink).
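As a rough illustration of what using an access point through S3A involves, the sketch below builds the per-bucket configuration entry in Python. It rests on two assumptions you should verify against the S3A documentation for your Hadoop version: that the property is named fs.s3a.bucket.<bucket>.accesspoint.arn, and that access point ARNs follow the arn:aws:s3:<region>:<account-id>:accesspoint/<name> pattern. The region, account ID, and names used are made-up placeholders.

```python
from typing import Dict


def access_point_arn(region: str, account_id: str, name: str) -> str:
    """Format an S3 access point ARN (pattern assumed, see the note above)."""
    return f"arn:aws:s3:{region}:{account_id}:accesspoint/{name}"


def s3a_access_point_conf(bucket: str, arn: str) -> Dict[str, str]:
    """Return the Hadoop property that routes one bucket through an access point.

    The property name is an assumption based on the S3A per-bucket
    configuration scheme; confirm it for your Hadoop release.
    """
    return {f"fs.s3a.bucket.{bucket}.accesspoint.arn": arn}


def as_spark_conf(hadoop_conf: Dict[str, str]) -> Dict[str, str]:
    """Prefix Hadoop properties with 'spark.hadoop.' so Spark forwards them."""
    return {f"spark.hadoop.{key}": value for key, value in hadoop_conf.items()}


if __name__ == "__main__":
    # Placeholder values, not real resources.
    arn = access_point_arn("us-east-1", "123456789012", "my-access-point")
    for key, value in as_spark_conf(s3a_access_point_conf("my-bucket", arn)).items():
        print(f"{key}={value}")
```

With an entry like this in place, jobs would keep using ordinary s3a://my-bucket/... paths, and the connector would send the requests through the access point.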
For Hadoop, there is an awesome talk by the AWS team on the Hadoop ecosystem on AWS and EMR: http://www.youtube.com/watch?v=hrRUAvKVfxw
They also have a series of video tutorials for EMR training.
The Cloudera team supports deployment to AWS and provides a whole range of documentation. Try searching for Cloudera and AWS on Google, and read the documentation here.