
Comparing Apache Livy with spark-jobserver

I know Apache Livy is a REST interface for interacting with Spark from anywhere. What are the benefits of using Apache Livy instead of spark-jobserver? What drawbacks of spark-jobserver make Livy a useful alternative? I couldn't find much about this on the internet. Can you please help me get clarity on this?

Thanks,

user118 asked Feb 18 '18 18:02



1 Answer

There are a couple of major differences that were relevant to my use case.

Livy's advantages:

  • Livy does not require any changes to your code, while spark-jobserver (SJS) jobs must extend a specific class.
  • Livy allows submitting code snippets as well as precompiled jars, while SJS only accepts jars.
  • In addition to REST, Livy has Java and Scala APIs. A Python API is in development; SJS has a "python binding".
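
To make the snippet-versus-jar point concrete, here is a minimal sketch of the request bodies Livy's REST API takes for each style. It only builds the requests and does not send them. The host and port, the session id 0, the jar path, and the class name are assumptions chosen for illustration.

```python
import json
import urllib.request

LIVY_URL = "http://localhost:8998"  # assumption: a local Livy server on its default port


def livy_request(path, payload):
    """Build (but do not send) a JSON POST request for the Livy REST API."""
    return urllib.request.Request(
        LIVY_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# 1. Open an interactive Scala session.
create_session = livy_request("/sessions", {"kind": "spark"})

# 2. Run a code snippet in that session. The id 0 is a placeholder; in practice
#    it comes from the response to step 1.
run_snippet = livy_request(
    "/sessions/0/statements",
    {"code": "sc.parallelize(1 to 100).sum()"},
)

# 3. Submit a precompiled jar as a batch job. Path and class name are hypothetical.
submit_jar = livy_request(
    "/batches",
    {"file": "hdfs:///jobs/wordcount.jar", "className": "com.example.WordCount"},
)
```

Each request can be sent with `urllib.request.urlopen(request)` against a running Livy server. Note that the batch request names the jar's location every time, which matches the jar-handling difference described under the SJS advantages.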

SJS's advantages:

  • SJS can manage the jars as well. It allows you to upload and store jars, then deploy jobs from these jars with a separate REST call. Livy requires the jar whenever you need to deploy a job.
  • SJS jobs can be configured in HOCON format, and the configuration can be submitted as part of the REST call.
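
The two-step SJS workflow (store a jar once, then start jobs from it with HOCON configuration in the request body) might look roughly like the sketch below. It builds the requests without sending them. The endpoint names follow the spark-jobserver README as I understand it (`/binaries`; older releases used `/jars`), and the host, port, app name, class path, and config key are assumptions for illustration.

```python
import urllib.parse
import urllib.request

SJS_URL = "http://localhost:8090"  # assumption: a local spark-jobserver on its default port


def upload_jar_request(app_name, jar_bytes):
    """Build the request that stores a jar on the server under app_name."""
    return urllib.request.Request(
        f"{SJS_URL}/binaries/{app_name}",
        data=jar_bytes,
        headers={"Content-Type": "application/java-archive"},
        method="POST",
    )


def run_job_request(app_name, class_path, hocon_config):
    """Build the request that starts a job from a previously uploaded jar."""
    query = urllib.parse.urlencode({"appName": app_name, "classPath": class_path})
    return urllib.request.Request(
        f"{SJS_URL}/jobs?{query}",
        data=hocon_config.encode("utf-8"),  # the HOCON text is the request body
        method="POST",
    )


# Hypothetical usage: upload once, then run as often as needed with new config.
upload = upload_jar_request("wordcount", b"<jar bytes would go here>")
run = run_job_request(
    "wordcount",
    "spark.jobserver.WordCountExample",
    'input.string = "a b c a b"',
)
```

The second request carries no jar at all, only the app name it was stored under, which is the jar-management convenience described above.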

Additionally, SJS has better documentation, although in neither case is it comprehensive. And of course, keep in mind that both projects are pre-v1, so things could change quickly.

In my case, we ended up going with SJS, since I had no use for submitting snippets, and jar management and HOCON configuration came in handy. I am, however, considering revisiting Livy in the near future for a more thorough evaluation.

Sources:

  • https://livy.incubator.apache.org/
  • https://github.com/spark-jobserver/spark-jobserver
  • Livy presentation: https://www.youtube.com/watch?v=C_3iEf_KNv8
W Almir answered Sep 19 '22 00:09