
spark-submit, how to specify log4j.properties

How do I specify a log4j.properties file when using spark-submit?

Here is my script. I have tried every combination of options, and even ran on a single local node, but it looks like log4j.properties is not being loaded: all the debug-level output is still dumped.

current_dir=/tmp
DRIVER_JAVA_OPTIONS="-Dlog4j.configuration=file://${current_dir}/log4j.properties "

spark-submit \
--conf "spark.driver.extraClassPath=$current_dir/lib/*"  \
--conf "spark.driver.extraJavaOptions=-Djava.security.krb5.conf=${current_dir}/config/krb5.conf -Djava.security.auth.login.config=${current_dir}/config/mssqldriver.conf" \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file://${current_dir}/log4j.properties" \
--class "my.AppMain" \
--files ${current_dir}/log4j.properties \
--master local[1] \
--driver-java-options "$DRIVER_JAVA_OPTIONS" \
--num-executors 4 \
--driver-memory 16g \
--executor-cores 10 \
--executor-memory 6g \
$current_dir/my-app-SNAPSHOT-assembly.jar

log4j properties:

log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n

log4j.additivity.org=false

log4j.logger.org=WARN
log4j.logger.parquet.hadoop=WARN
log4j.logger.com.barcap.eddi=WARN
log4j.logger.com.barcap.mercury=WARN
log4j.logger.yarn=WARN
log4j.logger.io.netty=WARN
log4j.logger.Remoting=WARN
log4j.logger.org.apache.hadoop=ERROR

# this disables the table creation logging which is so verbose
log4j.logger.hive.ql.parse.ParseDriver=WARN

# this disables pagination nonsense when running in combined mode
log4j.logger.com.barcap.risk.webservice.servlet.PaginationFactory=WARN
user1615666 asked Feb 14 '17 15:02


1 Answer

Keep in mind that the Spark worker is not your Java application, so you can't rely on a log4j.properties file from your application's classpath.

To see how Spark on YARN locates and reads a log4j.properties file, enable log4j's own debug output with the log4j.debug=true flag:

spark.executor.extraJavaOptions=-Dlog4j.debug=true
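
As a rough sketch, the flag could be passed on the command line like this. The driver option is an addition to the setting above and assumes the driver runs in client or local mode; the helper name log4j_debug_args is hypothetical. The function only prints the arguments so they can be inspected before being added to a real spark-submit call.

```shell
#!/bin/sh
# Sketch: arguments that turn on log4j's internal diagnostics.
# log4j_debug_args is a hypothetical helper, not part of Spark.
log4j_debug_args() {
  # Driver JVM (assumption: client/local mode), then executor JVMs.
  printf '%s\n' \
    --driver-java-options "-Dlog4j.debug=true" \
    --conf "spark.executor.extraJavaOptions=-Dlog4j.debug=true"
}

# Print one argument per line for inspection.
log4j_debug_args
```

log4j prefixes its diagnostic lines with "log4j:", so on YARN they should show up in the container logs rather than in your application's own log output.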

Most of the time, the problem is that the file cannot be found from inside the worker's YARN container. Spark has a very useful option for shipping files along with the job: --files.

--files "./log4j.properties"

This makes the file available to the driver and to all workers. Then add the extra Java option, using just the file name:

-Dlog4j.configuration=log4j.properties

Et voilà! With log4j.debug enabled, the container log confirms that the file is picked up:

log4j: Using URL [file:/var/log/ambari-server/hadoop/yarn/local/usercache/hdfs/appcache/application_1524817715596_3370/container_e52_1524817715596_3370_01_000002/log4j.properties] for automatic log4j configuration.
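
Putting the pieces together, a minimal sketch might look like the following. It assumes log4j 1.x (as in the question), reuses the paths, class name and jar name from the question, and assumes the driver runs in client or local mode so that it reads the file straight from the local filesystem. The helper name build_submit_args is hypothetical. Driver JVM options are set in one place only, so the result does not depend on how --driver-java-options and spark.driver.extraJavaOptions interact.

```shell
#!/bin/sh
# Sketch: spark-submit arguments that ship log4j.properties and point both
# the driver and the executors at it. Paths and names come from the question.
current_dir=/tmp
log4j_file="${current_dir}/log4j.properties"

# build_submit_args is a hypothetical helper that prints one argument per line.
build_submit_args() {
  # --files ships the file into each container's working directory.
  # The driver (assumption: client/local mode) uses the local absolute path.
  # The executors refer to the shipped copy by its bare file name.
  printf '%s\n' \
    --files "${log4j_file}" \
    --driver-java-options "-Dlog4j.configuration=file:${log4j_file}" \
    --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
    --class "my.AppMain" \
    "${current_dir}/my-app-SNAPSHOT-assembly.jar"
}

# Print the arguments for inspection before handing them to spark-submit.
build_submit_args
```

Once the printed arguments look right, pass the same values to spark-submit together with your usual --master and memory settings.
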
Thomas Decaux answered Sep 22 '22 09:09