
Adding external jars in EMR Notebooks

I use an EMR Notebook connected to an EMR cluster. The kernel is Spark and the language is Scala. I need some jars that are located in an S3 bucket. How can I add them to the notebook session?

With 'spark-shell' it's easy:

spark-shell --jars "s3://some/path/file.jar,s3://some/path/file2.jar"

In the Scala console I can also do:

:require s3://some/path/file.jar

Droll80 asked Aug 13 '19 08:08


2 Answers

Just put this in the first cell of your notebook:

%%configure -f
{
    "conf": {
        "spark.jars": "s3://YOUR_BUCKET/YOUR_DRIVER.jar"
    }
}
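
To confirm the jar was picked up, a later cell can compare the session's spark.jars value against the paths you expected. This is only a minimal sketch: missingJars is a hypothetical helper (not part of Spark or EMR), it assumes the kernel exposes its usual predefined `spark` session, and the cluster may rewrite or extend the configured value, so treat a mismatch as a hint rather than proof.

```scala
// Hypothetical helper: returns the expected jar paths that are absent from a
// comma-separated spark.jars value. Entries are trimmed before comparison.
def missingJars(configured: String, expected: Seq[String]): Seq[String] = {
  val loaded = configured.split(",").map(_.trim).filter(_.nonEmpty).toSet
  expected.filterNot(loaded.contains)
}

// In a notebook cell (assumes the kernel's predefined `spark` session):
// val configured = spark.conf.get("spark.jars", "")
// println(missingJars(configured, Seq("s3://YOUR_BUCKET/YOUR_DRIVER.jar")))
```

An empty result means every expected path appears in the configured list.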
Igor Tavares answered Oct 19 '22 03:10


After you start the notebook, you can do this in a cell (the -f flag forces the current session to be dropped and recreated with the new settings):

%%configure -f
{
    "conf": {
        "spark.jars.packages": "com.jsuereth:scala-arm_2.11:2.0,ml.combust.bundle:bundle-ml_2.11:0.13.0,com.databricks:dbutils-api_2.11:0.0.3"
    },
    "jars": [
        "//path to external downloaded jars"
    ]
}
partha_devArch answered Oct 19 '22 03:10