 

What is Apache Zeppelin? [closed]

We hear about Apache Zeppelin often, so a few questions come to mind:

  1. What is Apache Zeppelin?
  2. What new and/or extra capabilities does it add to the Big Data ecosystem?
  3. Is it a replacement for any of the frameworks or tools that already exist in the Big Data ecosystem?
Farooque asked Jun 08 '16 04:06



People also ask

What is Apache Zeppelin used for?

Apache Zeppelin is a new and incubating multi-purpose web-based notebook which brings data ingestion, data exploration, visualization, sharing and collaboration features to Hadoop and Spark.

Is Zeppelin open source?

Yes. Apache Zeppelin is 100% open source and has a very active development community.

What is Zeppelin code?

Zeppelin is an interactive notebook. It lets you write code into a web page, execute it, and display the results in a table or graph. It also does much more as it supports markdown and JavaScript (Angular).

What is Zeppelin in AWS?

Zeppelin enables data-driven, interactive data analytics and document collaboration using a number of interpreters such as Scala (with Apache Spark), Python (with Apache Spark), Spark SQL, JDBC, Markdown, Shell and so on. Zeppelin is one of the core applications supported natively by Amazon EMR.


1 Answer

What is a notebook interface?

A notebook is an interface for interactively running code and for exploring and visualizing data. It allows you to mix narrative, rich media and data.


Short answer: A web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.

Long answer:

  1. A Zeppelin notebook gives you an easy, straightforward way to execute arbitrary code in a web notebook. You can execute Scala and SQL, and even schedule a job (via cron) to run at a regular interval.

  2. It makes it easier to mix languages in the same notebook. You can write some SQL and Scala, then use Markdown to document it all together. You can also easily convert your notebook into a presentation style, for example to present to management or to use in dashboards.

  3. Zeppelin is comparable to the Jupyter (formerly known as IPython) Notebook, which has been extremely popular in the Python community. I can't use the word "replace"; rather, I would call it a similar kind of...
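To make the "mix languages in the same notebook" point concrete: each paragraph of a Zeppelin note can start with a directive such as `%md` or `%sql` that selects the interpreter for that paragraph. The snippet below is only a toy model of that idea in plain Python, not Zeppelin's actual implementation. The function name `split_paragraph`, the sample note, and the choice of `spark` as the fallback interpreter are all assumptions made for illustration.

```python
import re

# Hypothetical, simplified model of a note paragraph: an optional
# "%interpreter" directive first, followed by the code or text.
PARAGRAPH_RE = re.compile(r"^\s*%([\w.]+)\s*(.*)$", re.DOTALL)


def split_paragraph(text, default="spark"):
    """Return (interpreter_name, body) for one paragraph of a note.

    `default` is an assumption: the real default depends on how the
    notebook's interpreters are configured.
    """
    match = PARAGRAPH_RE.match(text)
    if match is None:
        return default, text.strip()
    return match.group(1), match.group(2).strip()


# A made-up note that mixes Markdown, SQL, PySpark and Scala paragraphs.
note = [
    "%md\n## Daily sales\nA short description of the analysis.",
    "%sql\nSELECT region, SUM(amount) FROM sales GROUP BY region",
    "%pyspark\nprint(spark.version)",
    "val total = 1 + 1",
]

for paragraph in note:
    name, body = split_paragraph(paragraph)
    # Show which interpreter would receive the paragraph.
    print(f"[{name}] {body.splitlines()[0]}")
```

In a real note you would simply type the directive at the top of each paragraph; the sketch only shows why one notebook can hold documentation, queries and code side by side.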

Furthermore:

  • Zeppelin supports Spark, PySpark, SparkR and Spark SQL, with a dependency loader.

  • Zeppelin lets you connect to any JDBC data source seamlessly: PostgreSQL, MySQL, MariaDB, Redshift, Apache Hive and so on.

  • Python is supported with Matplotlib, Conda, Pandas SQL and PySpark integrations.
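As a rough illustration of how code output ends up as a table in the notebook: Zeppelin's display system treats interpreter output that begins with `%table` as tabular data, with columns separated by tabs and rows separated by newlines. The helper below is a minimal sketch that builds such text in plain Python. The name `to_zeppelin_table` and the sample data are hypothetical, and the snippet assumes the `%table` convention behaves as just described.

```python
def to_zeppelin_table(header, rows):
    """Format a header and rows as text for the %table display convention.

    Columns are joined with tabs and rows with newlines. This helper is
    illustrative only and is not part of Zeppelin.
    """
    lines = ["\t".join(str(col) for col in header)]
    lines += ["\t".join(str(cell) for cell in row) for row in rows]
    return "%table " + "\n".join(lines)


# Made-up sample data.
table_text = to_zeppelin_table(
    ["name", "size"],
    [("sun", 100), ("moon", 10)],
)

# Printed from a Python paragraph, this text is meant to be rendered as
# a table instead of being shown as raw output.
print(table_text)
```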

Ram Ghadiyaram answered Sep 29 '22 10:09
