I'm running Python scripts (and tests) with PySpark and want to remove irrelevant information from the logs.
Every time I launch them, the following message shows up in the console:
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
How can I remove it completely (ideally via log4j.properties)?
I have log4j.rootCategory=ERROR, console set in log4j.properties.
Calling sc.setLogLevel(newLevel), as the message suggests, only affects the logs that come after the call, not the message printed at the beginning of the script.
Setting log4j.logger.org.apache.spark=ERROR in log4j.properties doesn't remove the message.
I have searched a lot for this but can't find the relevant configuration.
In the Spark GitHub repository (in Logging.scala), I can see that a silent variable controls whether the message is displayed, but I can't find where it is changed:
if (!silent) {
  System.err.printf("Setting default log level to \"%s\".\n", replLevel)
  System.err.println("To adjust logging level use sc.setLogLevel(newLevel). " +
    "For SparkR, use setLogLevel(newLevel).")
}
Thanks in advance for any help.
I found a solution!
Just before the code I cited from Logging.scala, there is:
if (replLevel != rootLogger.getEffectiveLevel()) {
  if (!silent) {
     ...
  }
}
This means that, instead of trying to change the silent variable, you can suppress the message by making the outer condition false: the message is only printed when the repl logger and the root logger have different levels, so set them to the same level in log4j.properties:
log4j.rootCategory=WARN, console
log4j.logger.org.apache.spark.repl.Main=WARN
You can also add log4j.logger.org.apache.spark=ERROR to remove other warnings from Spark that might show up.
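To check a log4j.properties file against the condition quoted from Logging.scala, here is a minimal Python sketch that models that guard. It is not part of Spark: the names parse_levels and banner_is_printed are hypothetical, and the WARN fallback for an unset repl logger is an assumption inferred from the "Setting default log level to "WARN"" message.

```python
# Hypothetical helper that models the guard quoted from Logging.scala.
# It does not call Spark or log4j; it only inspects the properties text.

# Assumption: the level used when org.apache.spark.repl.Main is not configured.
DEFAULT_REPL_LEVEL = "WARN"

ROOT_KEYS = ("log4j.rootCategory", "log4j.rootLogger")
REPL_KEY = "log4j.logger.org.apache.spark.repl.Main"


def parse_levels(properties_text):
    """Return (root_level, repl_level) read from log4j 1.x style properties text."""
    root_level = None  # stays None if the file does not set a root level
    repl_level = DEFAULT_REPL_LEVEL
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        # The level is the first comma-separated item, e.g. "WARN, console".
        level = value.split(",")[0].strip().upper()
        if key in ROOT_KEYS:
            root_level = level
        elif key == REPL_KEY:
            repl_level = level
    return root_level, repl_level


def banner_is_printed(properties_text, silent=False):
    """Mirror the quoted condition: levels differ and silent is false."""
    root_level, repl_level = parse_levels(properties_text)
    return repl_level != root_level and not silent


if __name__ == "__main__":
    only_root = "log4j.rootCategory=ERROR, console"
    matching = (
        "log4j.rootCategory=WARN, console\n"
        "log4j.logger.org.apache.spark.repl.Main=WARN"
    )
    print(banner_is_printed(only_root))
    print(banner_is_printed(matching))
```

Under these assumptions, a file that only sets log4j.rootCategory=ERROR still triggers the message, because the repl level falls back to WARN and differs from the root level. Once both loggers are set to the same level, the condition is false and nothing is printed.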