I have a JAR file compiled with Scala 2.12 and I want to run it on EMR 5.29.0. How can I run it, given that the default Scala version on EMR 5.29.0 is 2.11?
We've found great success using popular open source frameworks like Spark and MLlib to learn models at massive scale. The advantages of using these tools are further amplified by relying on AWS and EMR, specifically, to create and manage our clusters.
To submit a Spark step using the console: Open the Amazon EMR console at https://console.aws.amazon.com/elasticmapreduce/. In the Cluster List, choose the name of your cluster. Scroll to the Steps section and expand it, then choose Add step.
You can use the Scala Shell by following the procedure below. Log in to the master node using SSH as described in Connect to the master node using SSH. In Amazon EMR version 5.5.0 and later, you can use the following command to start a Yarn cluster for the Scala Shell with one TaskManager.
The Scala version you should use depends on the version of Spark installed on your cluster. For example, EMR Release 5.30.1 uses Spark 2.4.5, which is built with Scala 2.11. If your cluster uses EMR version 5.30.1, use Spark dependencies for Scala 2.11.
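As a rough illustration of matching the build to the cluster, an sbt build for EMR 5.30.1 might look like the sketch below. The project name and the exact 2.11 patch release are assumptions; only the Spark 2.4.5 / Scala 2.11 pairing comes from the text above.

```scala
// build.sbt: a minimal sketch targeting EMR 5.30.1 (Spark 2.4.5, Scala 2.11)
name := "emr-spark-job"      // hypothetical project name
version := "0.1.0"

// Assumption: any 2.11.x patch release should be binary compatible.
scalaVersion := "2.11.12"

val sparkVersion = "2.4.5"

// %% appends the Scala binary version (_2.11) to the artifact name.
// "provided" keeps Spark out of the packaged JAR, since the cluster supplies it.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql"  % sparkVersion % "provided"
)
```

With this setup, a JAR that was previously compiled for 2.12 would need to be rebuilt against 2.11, because Scala minor versions are not binary compatible with each other.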
This tutorial is for Spark developers who have no knowledge of Amazon Web Services and want to learn a quick and easy way to run a Spark job on Amazon EMR.
According to this thread on the AWS forum, Spark on EMR was built with Scala 2.11 at the time of writing, because it was considered the stable version:
On EMR, Spark is built with Scala-2.11.x, which is currently the stable version. As per- https://spark.apache.org/releases/spark-release-2-4-0.html , Scala-2.12 is still under experimental support. Our service team is already aware of this feature request, and they shall be adding Scala-2.12.0 support in coming releases, once it becomes stable.
So you'll have to wait until support is added in a future EMR release, or you can build Spark with Scala 2.12 yourself and install it on EMR. See "Building and Deploying Custom Applications with Apache Bigtop and Amazon EMR" and "Building a Spark Distribution for EMR".
Since Release 6.0.0, Scala 2.12 can be used with Spark on EMR:
Changes, Enhancements, and Resolved Issues
Scala
Scala 2.12 is used with Apache Spark and Apache Livy.
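One way to confirm which Scala version a given cluster actually runs is to print it from a small job. The following is a minimal sketch, assuming it is packaged into a JAR and launched with spark-submit so that the Scala library on the classpath is the one shipped with the cluster's Spark. The object name `ScalaVersionCheck` and the helper `binaryVersion` are hypothetical.

```scala
import scala.util.Properties

object ScalaVersionCheck {
  // Reduce a full version such as "2.11.12" to its binary version "2.11".
  def binaryVersion(full: String): String =
    full.split('.').take(2).mkString(".")

  def main(args: Array[String]): Unit = {
    // Version of the Scala library found on the runtime classpath.
    val full = Properties.versionNumberString
    println(s"Scala runtime version: $full (binary ${binaryVersion(full)})")
  }
}
```

If the reported binary version differs from the one the application JAR was compiled with, the JAR should be rebuilt for the cluster's version or run on a release that matches it.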