
How can I set the default Spark logging level?

I launch PySpark applications from PyCharm on my own workstation against an 8-node cluster. The cluster also has settings encoded in spark-defaults.conf and spark-env.sh.

This is how I obtain my Spark session and context:

from pyspark.sql import SparkSession

spark = SparkSession \
        .builder \
        .master("spark://stcpgrnlp06p.options-it.com:7087") \
        .appName(__SPARK_APP_NAME__) \
        .config("spark.executor.memory", "50g") \
        .config("spark.eventLog.enabled", "true") \
        .config("spark.eventLog.dir", r"/net/share/grid/bin/spark/UAT/SparkLogs/") \
        .config("spark.cores.max", 128) \
        .config("spark.sql.crossJoin.enabled", "true") \
        .config("spark.executor.extraLibraryPath", "/net/share/grid/bin/spark/UAT/bin/vertica-jdbc-8.0.0-0.jar") \
        .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer") \
        .config("spark.logConf", "true") \
        .getOrCreate()

sc = spark.sparkContext
sc.setLogLevel("INFO")

I want to see the effective config that is being used in my log. This line

        .config("spark.logConf", "true") \

should cause the Spark API to log its effective config at INFO level, but the default log level is set to WARN, so I don't see any of those messages.
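For context, that WARN default typically comes from the log4j configuration the driver picks up at startup: Spark reads a conf/log4j.properties file if one exists (Spark 2.x bundles log4j 1.x). A minimal sketch of such a file, assuming the layout of the stock template, would be:

    # conf/log4j.properties (sketch): controls the level Spark starts with
    log4j.rootCategory=INFO, console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.target=System.err
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

With log4j.rootCategory at INFO rather than WARN, the spark.logConf output is visible from the first lines of the log.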

Calling

sc.setLogLevel("INFO")

shows INFO messages from that point onward, but by then it is too late.
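As an aside, if the goal is only to see the effective configuration, the resolved SparkConf entries can be read back from the context regardless of the log level; a minimal sketch, assuming the spark session built above:

    for key, value in sorted(spark.sparkContext.getConf().getAll()):
        print(key, "=", value)

That covers the SparkConf values, which should include anything picked up from spark-defaults.conf, but it still does not answer the startup-logging question itself.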

How can I set the default logging level that Spark starts with?

Asked Nov 15 '16 by ThatDataGuy


1 Answer

You can also update the log level programmatically, as below: get hold of the log4j object from the driver JVM through the Spark session and create a logger from it.

    def update_spark_log_level(spark, log_level='info'):
        # Change the level used by the running SparkContext.
        spark.sparkContext.setLogLevel(log_level)
        # Grab a log4j logger from the driver JVM so your own messages land in the same log.
        log4j = spark._jvm.org.apache.log4j
        logger = log4j.LogManager.getLogger("my custom Log Level")
        return logger


Use:

logger = update_spark_log_level(spark, 'debug')
logger.info('your log message')
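If you want to change the level for everything Spark logs, not just a single named logger, the log4j root logger can be adjusted through the same JVM gateway. A minimal sketch, assuming a running spark session and the log4j 1.x API that Spark 2.x ships with:

    def set_root_log_level(spark, level_name='INFO'):
        # Reach into the driver JVM via py4j and change the log4j root logger level.
        log4j = spark._jvm.org.apache.log4j
        log4j.LogManager.getRootLogger().setLevel(log4j.Level.toLevel(level_name))

    set_root_log_level(spark, 'DEBUG')

Like setLogLevel, this only affects messages emitted after the call, so it cannot recover the startup output either.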

Feel free to comment if you need more details.

Answered Sep 21 '22 by Suresh