I'm running Spark jobs on AWS Glue and I see the URLs to the YARN Web UI (the tracking URLs).
I'm not clear on how I can create a proxy to view that tracking site, which I'm hoping to use to find my way to the Spark UI to monitor the progress of my jobs.
Is there a way to accomplish this, like I would do for EMR?
Click Analytics > Spark Analytics > Open the Spark Application Monitoring Page. Click Monitor > Workloads, and then click the Spark tab. This page displays the user names of the clusters that you are authorized to monitor and the number of applications that are currently running in each cluster.
To access these parameters reliably in your ETL script, specify them by name using AWS Glue's getResolvedOptions function and then access them from the resulting dictionary.
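As a rough sketch of that name-based lookup: inside a Glue job, getResolvedOptions is imported from awsglue.utils. That module only exists in the Glue runtime, so the snippet below falls back to a simplified stand-in (my own assumption, not AWS code) when the import fails. The input_path parameter and the example values are hypothetical.

```python
try:
    # Available inside the AWS Glue job runtime.
    from awsglue.utils import getResolvedOptions
except ImportError:
    # ASSUMPTION: simplified local stand-in so the sketch runs outside Glue.
    # It only understands "--NAME value" pairs and is not the AWS implementation.
    def getResolvedOptions(argv, options):
        resolved = {}
        for name in options:
            flag = "--" + name
            if flag not in argv or argv.index(flag) + 1 >= len(argv):
                raise ValueError("missing required job parameter: " + flag)
            resolved[name] = argv[argv.index(flag) + 1]
        return resolved


def read_job_parameters(argv):
    # JOB_NAME is normally supplied by Glue; input_path is a hypothetical
    # custom parameter used only for illustration.
    return getResolvedOptions(argv, ["JOB_NAME", "input_path"])


# Hypothetical argument list, shaped like what a job run would receive.
example_argv = [
    "script.py",
    "--JOB_NAME", "demo-job",
    "--input_path", "s3://example-bucket/in/",
]
args = read_job_parameters(example_argv)
print(args["JOB_NAME"], args["input_path"])
```

In a real job you would pass sys.argv instead of the hard-coded list.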
Open the AWS Glue console at https://console.aws.amazon.com/glue/ . In the upper-right corner, choose User preferences. Open the Monitoring options. In the Spark UI tab, choose Enable.
You can enable the Spark UI using the AWS Glue console or the AWS Command Line Interface (AWS CLI). When you enable the Spark UI, AWS Glue ETL jobs and Spark applications on AWS Glue development endpoints can persist Spark event logs to a location that you specify in Amazon Simple Storage Service (Amazon S3).
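Outside the console, enabling the Spark UI comes down to two job parameters, --enable-spark-ui and --spark-event-logs-path. Here is a minimal sketch that builds them as a dictionary. The helper name and the bucket path are made up for illustration, and the check on the S3 prefix is my own addition rather than anything Glue requires you to write.

```python
def spark_ui_arguments(event_logs_path):
    """Build the job arguments that turn on Spark event logging.

    event_logs_path is the S3 location where event logs should be persisted.
    """
    if not event_logs_path.startswith("s3://"):
        raise ValueError("event logs path must be an s3:// location")
    return {
        "--enable-spark-ui": "true",
        "--spark-event-logs-path": event_logs_path,
    }


# Hypothetical bucket and prefix, used only for illustration.
arguments = spark_ui_arguments("s3://example-bucket/spark-event-logs/")
print(arguments)
```

A dictionary like this could then be supplied as the job's default arguments, or as the Arguments of a single run (for example through boto3's Glue client start_job_run call). That part is not shown here because it needs AWS credentials and an existing job.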
You can use the Apache Spark web UI to monitor and debug AWS Glue ETL jobs running on the AWS Glue job system, and also Spark applications running on AWS Glue development endpoints. The Spark UI enables you to check the following for each job:
You can use the Apache Spark web UI to monitor and debug AWS Glue ETL jobs running on the AWS Glue job system. You can configure the Spark UI using the AWS Glue console or the AWS Command Line Interface (AWS CLI). Follow these steps to configure the Spark UI using the AWS Management Console.
AWS Glue is a fully managed ETL service for loading large datasets from various sources for analytics and data processing with Apache Spark ETL jobs. In this post I will discuss the use of the AWS Glue Job Bookmarks feature in the following architecture.
Support for the Spark UI in Glue has finally been added :-) (release date: 19 September 2019)
Check https://docs.aws.amazon.com/glue/latest/dg/monitor-spark-ui.html