
How to report JMX from Spark Streaming on EC2 to VisualVM?

I have been trying to get a Spark Streaming job, running on an EC2 instance, to report to VisualVM using JMX.

As of now I have the following config file:






And I start the Spark Streaming job like this (I added the -D bits afterwards in the hope of getting remote access to the EC2 instance's JMX):


spark/bin/spark-submit --class my.class.StarterApp --master local --deploy-mode client \
  project-1.0-SNAPSHOT.jar \
    -Dcom.sun.management.jmxremote \
    -Dcom.sun.management.jmxremote.port=54321 \
    -Dcom.sun.management.jmxremote.authenticate=false \
Havnar asked Sep 30 '22 04:09


1 Answer

There are two issues with the spark-submit command line:

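  1. local - you must not run a Spark Streaming application that uses a receiver with the plain local master URL. local gives Spark a single thread, and the receiver occupies it, so no thread is left to run your computations (jobs). You need at least two, i.e. one for the receiver and another for processing the received data, so use local[2] or higher. You should see the following WARN in the logs: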

WARN StreamingContext: spark.master should be set as local[n], n > 1 in local mode if you have receivers to get data, otherwise Spark jobs will not get resources to process the received data.

  2. -D options are not picked up by the JVM because they are given after the application jar and so become command-line arguments of the Spark Streaming application itself. Pass them to the driver JVM before project-1.0-SNAPSHOT.jar, e.g. with --driver-java-options, and start over (you have to fix the above issue first!)
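
Putting both fixes together, the command could look roughly like the sketch below. It is a dry run that only prints the assembled command. The class name, jar, port and authenticate=false are taken from the question; local[2] is the smallest value that satisfies the warning above; the ssl=false flag is an assumption (the question's command is cut off after the authenticate line) and is only suitable for an unsecured test setup.

```shell
#!/bin/sh
# local[2]: one thread for the receiver, one for processing the batches.
MASTER="local[2]"

# JMX flags for the driver JVM. Port and authenticate=false come from the
# question; ssl=false is an assumption for an unsecured test setup only.
JMX_OPTS="-Dcom.sun.management.jmxremote \
-Dcom.sun.management.jmxremote.port=54321 \
-Dcom.sun.management.jmxremote.authenticate=false \
-Dcom.sun.management.jmxremote.ssl=false"

# Dry run: echo prints the assembled command. Remove "echo" to launch the job.
# All spark-submit options come before the jar; anything placed after it is
# handed to the application as its own arguments.
echo spark/bin/spark-submit \
  --class my.class.StarterApp \
  --master "$MASTER" \
  --deploy-mode client \
  --driver-java-options "$JMX_OPTS" \
  project-1.0-SNAPSHOT.jar
```

With --deploy-mode client the driver runs in the JVM started by spark-submit, which is why the flags go through --driver-java-options here. Whether VisualVM can then reach port 54321 from outside EC2 also depends on the instance's network settings, which this sketch does not cover.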
Jacek Laskowski answered Oct 08 '22 12:10
