
How to access Spark Web UI?

Tags:

apache-spark

I'm running a Spark application locally with 4 nodes. When I run my application, the logs show my driver at the address 10.0.2.15:

INFO Utils: Successfully started service 'SparkUI' on port 4040.
INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.0.2.15:4040

At the end of the run it displays:

INFO SparkUI: Stopped Spark web UI at http://10.0.2.15:4040
INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
INFO MemoryStore: MemoryStore cleared
INFO BlockManager: BlockManager stopped
INFO BlockManagerMaster: BlockManagerMaster stopped
INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
INFO SparkContext: Successfully stopped SparkContext

I tried to access the Spark web UI at 10.0.2.15:4040, but the page is inaccessible. Trying the address below also didn't help:

 http://localhost:18080

Using ping 10.0.2.15, the result is:

Pinging 10.0.2.15 with 32 bytes of data:

Request timed out.

Request timed out.

Request timed out.

Request timed out.

Ping statistics for 10.0.2.15: Packets: Sent = 4, Received = 0, Lost = 4 (100% loss)

I checked the availability of port 4040 using netstat -a to verify which ports are in use. The result is as follows:

   Active Connections:

    Proto        Local Address       Foreign Address                     State

    TCP          127.0.0.1:4040      DESKTOP-FF4U.....:0                 LISTENING

P.S.: Note that my code runs successfully. What could be the reason?

asked Dec 25 '16 by sirine



1 Answer

INFO Utils: Successfully started service 'SparkUI' on port 4040.
INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.0.2.15:4040

That's how Spark reports that the web UI (which is known as SparkUI internally) is bound to the port 4040.

As long as the Spark application is up and running, you can access the web UI at http://10.0.2.15:4040.

INFO SparkUI: Stopped Spark web UI at http://10.0.2.15:4040
...
INFO SparkContext: Successfully stopped SparkContext

This is when a Spark application has finished (it does not really matter whether it finished properly or not). From now on, the web UI (at http://10.0.2.15:4040) is no longer available.

I tried to access the Spark Web by: 10.0.2.15:4040 but the page is inaccessible.

That's the expected behaviour of a Spark application. Once it has completed, port 4040 (the default port of the web UI) is no longer available.
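You can verify this for yourself from the driver machine: while the application is still running, a TCP connection to the UI port succeeds, and after SparkContext stops, it fails. A minimal sketch (the host and port are just the values from the logs above; adjust as needed):

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# While the Spark application is running, this should report True on the
# driver host; once SparkContext is stopped, the same call reports False.
print(is_port_open("127.0.0.1", 4040))
```

Note that the netstat output in the question shows the port bound on 127.0.0.1, so while the app is running, http://localhost:4040 should work from the same machine even if 10.0.2.15 is unreachable from elsewhere.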

Trying the address below also didn't help: http://localhost:18080

18080 is the default port of the Spark History Server. It is a separate process and may or may not be available regardless of whether any Spark applications are running.

Spark History Server is completely different from a Spark application. Quoting the official Spark docs:

It is still possible to construct the UI of an application through Spark’s history server, provided that the application’s event logs exist. You can start the history server by executing:

./sbin/start-history-server.sh

This creates a web interface at http://<server-url>:18080 by default, listing incomplete and completed applications and attempts.

As you can read, you have to start the Spark History Server yourself to have 18080 available.

Moreover, you have to set the spark.eventLog.enabled and spark.eventLog.dir configuration properties to be able to view the logs of Spark applications once they've completed their execution. Quoting the official Spark docs:

The spark jobs themselves must be configured to log events, and to log them to the same shared, writable directory. For example, if the server was configured with a log directory of hdfs://namenode/shared/spark-logs, then the client-side options would be:

spark.eventLog.enabled true
spark.eventLog.dir hdfs://namenode/shared/spark-logs
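Instead of editing the defaults file, the same two properties can be passed per job on the command line. A sketch, using a hypothetical local file:// log directory in place of the HDFS path from the quote, and a placeholder application script:

```
# hypothetical example; the log directory and app script are placeholders
mkdir -p /tmp/spark-events
spark-submit \
  --conf spark.eventLog.enabled=true \
  --conf spark.eventLog.dir=file:///tmp/spark-events \
  your_app.py
```

The directory given to spark.eventLog.dir must exist and be writable before the job starts, and the history server must be configured (via spark.history.fs.logDirectory) to read from the same location.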
answered Oct 29 '22 by Jacek Laskowski